Is Big Data Getting Too Big for the Environment?

I’ve been trying to find some solid data to answer the following question:

What is the environmental impact of “Big Data”?

Let me explain…

Collectively, internet search companies have been relying on advertising dollars to make themselves profitable while keeping internet search free for consumers.  This, however, meant that compared to “traditional” advertising models, search providers needed to make internet advertising much more economically viable for the companies paying for ads. In other words, they needed to improve ad targeting, which they did and are still doing today very successfully.  But those models and algorithms, as scientifically amazing as they are, rely on massive amounts of data being collected about our site visits, our clicks, our searches, our behavior online.  All this data needs to be stored and processed somewhere. Indeed, the world is now running tens of millions of servers (estimated at 44 million). A significant portion of these servers (how big of a portion?) is dedicated to storing and processing search logs and click data directly related to online advertising. And all of those servers consume energy, a lot of energy (estimated at 0.5% of all electricity), and allegedly have a significant negative impact on the environment.

Yes, I know that data center engineering and management science, together with software and hardware manufacturers, have made amazing progress when it comes to data center and software efficiency, but this is not the point of this post.

Regardless of how you feel about anthropogenic causes of global warming or about internet privacy, the fact is that we are still burning a lot (how much?) of energy… to do what exactly?

How much data do we need to index the web and make it helpful for all kinds of web queries? How fast does this portion of the data grow? How does it compare with the volumes of data needed to continuously infer better and better ad targeting based on our behavior online and our social connections? In absolute terms per product sold, is the environmental impact of online advertising better or worse than the environmental impact of traditional advertising models? Can we get our hands on this statistical data? Is it a worthwhile undertaking to understand it better?

Beyond the opinions out there about the questionable usefulness of online advertising, is online advertising basically helping us to “kill the planet”? Or is it as efficient environmentally as it seems to be economically? Are trends in data storage and processing sustainable with regard to the energy production required to support them? Should “Big Data” players bridge their considerable differences and collaborate on efficient and clean power production and distribution technologies with the same rigor they apply to their own massively parallel data processing techniques?

I don’t know the answer to these questions – I am still searching for reliable data and useful models to estimate this.  If you have any suggestions – please feel free to reply to this post. I would be happy to collaborate!

Developer-Driven Is Not A Panacea

The idea of developer-driven or developer-centric organizations is being actively discussed lately thanks to the likes of Facebook. Some people, mostly developers, love it for its freedom and lack of heavy processes. I love it too – to a point. I believe in its usefulness while a product is being conceived, while it is in the prototype or even in a rapid growth stage, while the creators of the product have neither rich user data nor significant market penetration. In these situations, the product team does not have any data to guide its development direction, and therefore tries to probe with multiple features.  However, there are obvious pitfalls even at this stage:

  • Developing cool new features before core functionality is in place is meaningless at best and a distraction from your mission at worst
  • Developers left to their own devices tend to over-engineer most of the features and create flexible frameworks for everything, even trivial things
  • Most developers suffer from NIH (“Not Invented Here”) syndrome, which leads to everything being written at least twice

Let’s assume now that the product is becoming successful. It is growing in size, gaining more and more fans and/or more and more revenue. At this stage, blindly following a Developer-Driven process can get you into even more trouble.

Business:

  • New features developers chose to implement may not be helping your product
  • New features developers chose to implement may not be what your users want most or want at all
  • New features developers chose to implement may be cannibalizing or hurting other important revenue-generating features
  • New features your developers chose to implement may not be the most profitable

Technical:

  • Duplicated code grows all over the place and when new features are being introduced, raging debates start about what to use and what to retire and why. There is no consensus, so different factions continue to use and improve their code of choice ignoring others.
  • A set of home-grown tools emerges that mostly works, but nobody is completely happy with them, so from time to time somebody creates new tools that do almost the same thing but “better”. Maintenance costs continue to grow…
  • Documentation becomes so outdated that knowledge about the overall system architecture is priceless and can only be found with people who have been on the team since its inception.
  • The build takes forever because of the amount of code. Configuration is mind-boggling because it is hard to configure dozens of “data-driven” frameworks to actually do what users want.

I could continue this list for a while, you know…

So what’s the solution? In my mind, it is very simple – provide a system of lightweight checks and balances that will keep the good and eliminate the bad:

  1. Get a good PM on the team. Give them broad authority to influence and little authority to dictate. Make them work together with Dev and Test teams. Good PMs hate process just as much as Developers, and they will provide an excellent balance against cowboy development and over-engineering problems. More importantly, this move will complement the technology focus of a Dev-driven product with razor-sharp customer focus.
  2. Get architects to actually do their job. Architects who design how features should work should also be capable enough to quickly code up reusable and complicated components and provide actionable feedback and oversight to other team members. However, architects who sit in an ivory tower are just as bad for the product as no architecture at all. Therefore, getting seasoned developers to become architects (i.e. experts) on the product is a better way to go, I think.
  3. Use experimentation in addition to (or instead of) traditional marketing research. Once a product gets beyond the initial rollout stages and starts gaining steam, the only way to grow effectively is to use data, quickly try out new features, measure, and if ideas fail (most of them do, by the way) – fail early and get out.
  4. Watch over the build: avoid having a dedicated build team, keep the number of branches to a minimum, and make developers responsible for integrating code. This will serve as a forcing function to have a reasonable component architecture and inject enough engineering discipline.
  5. Do not subscribe to the “good developers write bug-free code” myth. Invest in test infrastructure and measure test coverage. Once again, placing test responsibility with Developers can be used as a forcing function to inject stability and reason into your code base.

In short, maintain user-centric focus, keep it simple as much as possible, use real data, trust but verify.

How To: Connect TFS-bound Excel to Another Team Project

If you made any changes to a TFS-bound Excel workbook that comes pre-installed with your Team Project, you might naturally want to share them with your colleagues who run other projects in your organization. For example, we created a souped-up version of Iteration Backlog that computes our Team Velocity, reports on User Story progress and breaks up work between Developers and Testers. Making these changes available in the Team Project Template is useful for any Team Projects created in the future, but it does not help with existing Team Projects that might benefit from the same improvements. If the changes are significant, it might be hard and time-consuming to edit the Iteration Backlog in another Team Project, and we would naturally want to just take the improved Iteration Backlog and “rebind” it to a new Team Project, in the same or in a different Team Project Collection or even Team Foundation Server instance.

Unfortunately, attempting to reconnect to another Team Project using the “Configure Server Connection” option in the Team Foundation Ribbon will not work, because it requires the same project collection and is primarily designed for backup/restore scenarios:

Configure Server Connection

Configure Server Connection - Error

We will have to apply a workaround solution, which is the following:

1. Save a copy of original Excel Workbook and close it. Disable Team Foundation Add-In. Close Excel.

To disable the TFS Add-in, go to File->Options->Add-Ins. Select COM Add-Ins in the Manage drop-down and click Go:

Excel - Com Add-Ins

In the dialog that shows up, uncheck Team Foundation Add-In and click OK:

Disable TFS Add-In

2. Open the workbook and remove all custom document properties from the Workbook. These properties contain TFS binding information; the TFS Add-in will automatically recreate them when reconnecting to another Team Project. If the Team Foundation Add-in is not disabled, it seems to re-create these properties whenever the document is saved.

This is a bit tricky and tedious, and you might want to use VBA for this. Press Alt+F11 and, in the VBA Editor, enter and execute the following macro:

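Sub DeleteCustomProperties()
    ' Walk the collection backwards so that deleting an item
    ' does not shift the ones still to be visited
    Dim props As Object, i As Long
    Set props = ActiveWorkbook.CustomDocumentProperties
    For i = props.Count To 1 Step -1
        props.Item(i).Delete
    Next i
End Sub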

3. Remove hidden verification worksheet. Save and close the Workbook.

This is also done in the Visual Basic Editor because that sheet is hidden from view. In the Project view of the Visual Basic Editor, select the worksheet whose name starts with “VSTS_ValidationWS” and, in the Properties window, change the Visible property to xlSheetVisible:

Hidden Validation Sheet

After that, delete VSTS Validation Worksheet in Excel:

Delete Hidden Validation Sheet
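
If you prefer to script step 3 rather than click through it, the macro below is a minimal sketch of the same two actions: unhide the validation sheet, then delete it. The macro name DeleteValidationSheet is my own, and it assumes the hidden sheet's name really does start with “VSTS_ValidationWS” in your workbook, so check the name in the Project view first.

```vba
Sub DeleteValidationSheet()
    ' Hypothetical helper: unhide and delete every worksheet whose
    ' name starts with the assumed prefix "VSTS_ValidationWS".
    Const PREFIX As String = "VSTS_ValidationWS"
    Dim ws As Worksheet, i As Long

    ' Suppress the "are you sure?" prompt that Delete normally shows
    Application.DisplayAlerts = False

    ' Loop backwards because sheets are removed while iterating
    For i = ActiveWorkbook.Worksheets.Count To 1 Step -1
        Set ws = ActiveWorkbook.Worksheets(i)
        If Left$(ws.Name, Len(PREFIX)) = PREFIX Then
            ws.Visible = xlSheetVisible
            ws.Delete
        End If
    Next i

    Application.DisplayAlerts = True
End Sub
```

Run it on the saved copy from step 1, not on the original workbook, since a deleted sheet cannot be restored with Undo.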

4. Re-enable the TFS Add-in by reversing the process described above. Open the Workbook again. Notice that when you navigate to the worksheet containing TFS-bound table regions, the Team Ribbon does not activate the Refresh button because it no longer knows that your workbook is connected to TFS:

New TFS-Bound List

5. Make sure you do not delete the TFS-bound regions completely because if you do, your formulas will be damaged beyond repair. Keep at least one line from the old table and make sure you copy the name of the table to a text document somewhere – we will need it later:

Name of the Old List

6. Using New List button located in Team Ribbon, add another TFS-bound list that is connected to the same query in the destination Team Project and make sure column layout matches existing list. You will end up with two similar looking lists – one (with one remaining row) from old Team Project and one from the new Team Project:

New TFS List Added

7. Now use the old table name and new table name to adjust all formulas. Verify that formulas work as designed.

Once again, this might be easier to accomplish with Visual Basic. Knowing the old table name from step 5 and the new table name from step 6, run the following macro (it might take a while to execute):

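' Replaces every occurrence of originalListName with newListName
' in the formulas of all worksheets in the active workbook.
Sub ReplaceFormulas(ByVal originalListName As String, _
                    ByVal newListName As String)
    ' Skip cells that reject the update (e.g. on protected sheets)
    On Error Resume Next
    Dim w As Worksheet, i As Long, j As Long, _
        formulaValue As String
    For Each w In ActiveWorkbook.Worksheets
        For i = 1 To w.UsedRange.Rows.Count
            For j = 1 To w.UsedRange.Columns.Count
                ' Index relative to UsedRange, which may not start at A1
                formulaValue = w.UsedRange.Cells(i, j).Formula
                If InStr(1, formulaValue, originalListName, vbTextCompare) > 0 Then
                    formulaValue = Replace(formulaValue, _
                        originalListName, newListName, , , vbTextCompare)
                    w.UsedRange.Cells(i, j).Formula = formulaValue
                End If
            Next j
        Next i
    Next w
End Sub

Sub ReplaceAll()
    Dim calc As XlCalculation
    ' Turn off redraw and recalculation while formulas are rewritten
    Application.ScreenUpdating = False
    calc = Application.Calculation
    Application.Calculation = xlCalculationManual
    ' Substitute your own old (step 5) and new (step 6) table names
    ReplaceFormulas "VSTS_8eeef3b2_74af_457d_88e8_0fce8bc5d131", _
                    "VSTS_d3ffaeb7_2a52_4d3e_afd8_2d7334bb47bb"
    Application.Calculation = calc
    Application.ScreenUpdating = True
End Sub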

8. Remove the old tables. The easiest way to accomplish this is to first convert each table to a Range and then delete its rows from the Worksheet:

Convert Table to Range

9. Save modified Excel Workbook.

Note: If you have Pivot Table objects pointing to the old tables, you may need to augment ReplaceFormulas macro with updates to PivotTable objects – their Source Data needs to be updated.
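
As a starting point for that PivotTable update, here is a rough sketch. It is not part of the original workaround and I have only reasoned it through for the simple case: it assumes each Pivot Table is built on a worksheet table or range (source type xlDatabase), so that its SourceData is a plain string containing the old table name. The macro name RetargetPivotTables is my own invention.

```vba
Sub RetargetPivotTables(ByVal originalListName As String, _
                        ByVal newListName As String)
    ' Hypothetical companion to ReplaceFormulas: points worksheet-based
    ' Pivot Tables at the new table by giving them a fresh pivot cache.
    Dim w As Worksheet, pt As PivotTable, src As String

    For Each w In ActiveWorkbook.Worksheets
        For Each pt In w.PivotTables
            ' Only handle pivots fed by a worksheet table/range;
            ' external and OLAP sources are left untouched
            If pt.PivotCache.SourceType = xlDatabase Then
                src = CStr(pt.SourceData)
                If InStr(1, src, originalListName, vbTextCompare) > 0 Then
                    src = Replace(src, originalListName, newListName, _
                                  , , vbTextCompare)
                    pt.ChangePivotCache ActiveWorkbook.PivotCaches.Create( _
                        SourceType:=xlDatabase, SourceData:=src)
                    pt.RefreshTable
                End If
            End If
        Next pt
    Next w
End Sub
```

Call it with the same old and new table names you used for the formulas, and do so before removing the old tables in step 8, so each Pivot Table can still be read while it is being switched over.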

Hope that helps!

Karma? I think not

Recently I’ve started interacting with some people in the Seattle startup community. Honestly, I half expected to find a bunch of antisocial geeks who troll about their employers and exchange “check out my new killer social entertainment network” messages. To my amazement, I found a really vibrant community of smart individuals who willingly help each other with useful information, references, and ideas, and treat each other (mostly) with enormous respect.  These people have nothing to gain from giving out their advice or sharing their valuable ideas. On the contrary, somebody somewhere might benefit from this information to make their endeavor more successful, to get better funding, to hire better engineers, to meet interesting people.

So I started thinking about why this happens in this age of cutthroat competition. I don’t think it is Karma or “Pay it Forward”, though for some people out there that is a legitimate and commendable driver for this behavior. Instead, I think that it is a basic drive to look for and find better social interaction in our professional lives. Plus a lot of networking, of course. I firmly believe, based on my own experiences, that we strive to be accepted professionally, and these loosely coupled peer-to-peer communities satisfy that need. People treat each other as equals, assume the best instead of the worst, listen, share their advice and help because they like to do it. It gives them a great feeling of self-worth, especially when they get the same kind of treatment from others.  Notice that this is slightly (and in some cases drastically) different from the corporate world. First, no matter how senior your position is within your company, you are constrained by its reporting structure and by its culture. In some situations you are treated as a resource, and in some situations your opinion or ideas have a hard time reaching the right audience. In these startup communities, however, things are different because these constraints don’t exist. So meeting somebody for coffee and chatting with them about your and their ideas is a much simpler proposition.

So, in light of these observations I’ve made a resolution this year – to help my friends with their crazy ideas when they need my help, or advice, or a sounding board, or all three. Just to have a professional environment outside of work that is slightly different from what I have at work. Just to geek out with my buddies in my free time while feeling important. Just to be free for a few hours every week. Ha!

Symptoms of Sick Projects

I recently got a new project assignment at work and started thinking about the levels of uncertainty that exist in every software project. It’s been in the back of my mind a lot lately as I am trying to figure out what it is that we are trying to solve, how our new project is going to work, what we need to do technically and organizationally, and how I can make it bring value in the context of the larger division and even the company.

First, I know that uncertainty is inevitable and it is basically good. Without uncertainty, there would be no room for initiative, innovation, and risk taking.  However, too much uncertainty typically results from other bad symptoms that bring enough distraction to make me worry and question the project’s validity.  I’ve been trying to classify the most telling of these symptoms and find ways to treat them. I am purposely staying away from management styles, social dynamics, reporting hierarchies on a project or even the project architecture, though these can have plenty of problems as well.  I am thinking about basic, fundamental issues that interfere with the way we approach all technology projects: find what the problem is and proceed to create a technology that will solve it.

Here is what I could come up with:

The NASA Pen Project

This is a project where everybody assumes that the problem at hand can only be solved with a new technology and diligently and stubbornly tries to come up with such a solution. Anybody who dares to suggest that the problem can be solved by using an existing technology (potentially mixed with some skillful organizational changes) is immediately dismissed because… Nobody can tell why.  This creates a situation where the people who are actually doing the work can’t fathom why or how their work might be useful, because they all know that existing technology is just as good or even better. This uncertainty over the project’s fate causes quality issues, not to mention damage to team morale, because people think that their time and their work are being wasted. The only effective way I have found to resolve this is to bring data that compares the projected costs of developing and implementing new technology with the likely much lower costs of using existing tools to solve the problem, and to propose a set of organizational changes that will remove existing barriers.

The New Bicycle Project

This is the classic and rampant NIH symptom. Because of political pressure within an organization or because of artificial competition and meaningless disagreements between “big wigs” in multiple divisions, an emerging or even an existing technology is reinvented with various degrees of success over and over and over again. Everybody has fun working on the cool new thing on their own, but the company resources needed to solve the same problem are doubled or tripled. Nothing or very little is shared, teams hate each other, team management tries to outsmart their peers in the eyes of executives, and in the end nobody achieves the level of success they could have achieved had they worked together. A two-pronged attack might help here, I think: relentless pressure from the bottom (people demanding increased reuse and sharing of products already under way) and smart management from the top (timely executive decisions to combine or eliminate duplicate efforts).

The Emperor’s New Clothes Project

This is hysterical to observe from the outside but it is very painful for the people involved. This is a project where everybody seems to know and understand what the real problem is, but instead of solving it directly, the team seems to be busy creating technology to work around it. After a few of those workarounds, a technological monstrosity/hodge-podge emerges around the original problem at hand, and it is usually unmanageable, unmaintainable, does not make things better at all, and is very costly. Once again, this could be due to politics within an organization or due to the entrenched and seemingly unsolvable nature of the main problem (which sometimes is not technological at all). But until somebody has the guts to become that child who yelled that the emperor is naked, things will continue to be very difficult and the original problem will become even more entrenched because now there is a large investment in this pseudo-solution.

Go I Know Not Whither and Fetch I Know Not What Project

This is uncertainty squared – nobody seems to know what the problem is or was to begin with, nobody seems to understand why an existing solution, if any, is inadequate, and nobody can describe in concrete terms their vision for the new solution. But the plans are already approved, money allocated, and your team is on the hook to deliver… something. The solution likely lies in peeling the onion and asking a lot of questions from a lot of people. Start challenging every assumption to see how sure people are that there is a problem. Find out its exact nature to avoid building a space pen. Make sure you reuse existing bicycles. Call the emperor naked if you have to. Do whatever you have to to define “whither” and “what”.  And if you can’t make any headway, be aware that there is a chance that you got this fairy tale project because somebody wanted to remove you from the picture for a while or permanently…

Am I being too negative? Well, I don’t think so – I am an optimist. This is why I am doing my very best to prevent my new project from becoming one of these and I am beginning to like this challenge more and more.

Happy New Year!

Creating Sustainable Internal Startups

According to Wikipedia, sustainability is the capacity to endure.  In recent years, the term has been used almost exclusively in an ecological sense, but I am going to stick with the simple definition: capacity to endure.

Within every large self-respecting company, there is an R&D organization that targets long-term aspirations, works on crazy projects that may or may not pan out, makes lots of mistakes and sucks up a lot of company money. Executives accept that fact because entrenched R&D organizations provide the future foundation for the company by doing both fundamental and applied research that is relevant to the company’s aspirations, even if in the short term they have nothing to do with the problems at hand and the only thing that a company gets from its R&D folks next quarter is most likely a huge bill. I am not going to argue with this status quo because I think that, just like fundamental scientific research done at universities, corporate R&D is basically a good thing.

What I wanted to talk about is the so-called “Labs” projects that spring up from time to time in addition to primary research organizations.  Those guys act like startups – they are typically small, driven teams that have a brilliant idea (or so they think at least) and that are lucky enough to have at least one executive who is willing to support them initially and let them work on that idea while protecting them and supporting them with money from corporate coffers. Sooner or later, however, this goodwill always ends. Always. Why? Because the fundamental goal of every corporation is to make a profit and, as far as crazy research goes, they already have that going on in their R&D department.  I would estimate this grace period at maybe 2 or 3 years. After that, a Lab/Startup project has to show a real profit and have a significant impact on the business to make its executive sponsor look like a rock star.

If not, 9 times out of 10 it is going to be axed, reorganized or split, I guarantee you.  More often than not, Labs don’t work because they make one of the following mistakes:

  • They focus too much on internal needs first, provide “white glove treatment” to some of their internal customers and partners, neglecting other “less important” clients and choosing to not worry about basic viability and profitability of their product for the outside world.
  • Conversely, they focus too much on external customers, trying to generate industry buzz, giving out glorious interviews and writing superb articles in industry publications, but they never try to make their product accepted and used internally.
  • They neglect to formulate a long-term road map outfitted with both a quality business model and clear technological improvements.
  • Their mission includes grandiose aspirations to change internal corporate culture and deliver world peace, and forgets the primal need to make a gazillion dollars within the first 5 years of existence.
  • Their internal structure resembles a garage project full of geeks having way too much fun for way too long, forgetting about releasing as often as they can and holding people accountable to the scope and schedule commitments they make.
  • They get comfortable being shielded by their executive sponsor and forget about the fact that this executive is the only protection they have from packs of hungry wolves that have to deal with today’s problems and who bring home profit every quarter doing things that may be not so cutting edge.

So how do we create a sustainable internal startup that survives beyond those 2-3 years? By not doing these things. By making sure our strategy focuses on inside and outside needs. By working on business aspects of our idea as much as we work on technology.  By making our technology scale to our (prospective or real) customer needs early and well. By partnering with external organizations with which our company already has good relationships and asking, begging them to use our product and see if they like it. By making our executive sponsor look good every quarter because we landed another customer, or drastically increased scalability and performance, or shaved a significant chunk off our internal support costs, or shipped something really useful to our partners that saved or earned them real money.  By holding our development team’s feet to the fire all the time. By making our architects code and by making them help our developers understand their vision instead of isolating themselves in an “I had a brilliant idea. I am great. Now you do the work” tower.  By constantly checking and challenging every decision our management makes against our long-term survival and profitability goals and against the main goal of publicly releasing our product. By servicing as many customers as we can, even when it seems difficult. By relentlessly and obsessively eliminating every inefficiency and piece of manual work in the product and in the process. By not expanding the headcount unless we have shown enough profit to pay for additional employees. By assuming that this quarter might be our last.

In short – by being a bunch of hungry artists.

Triple Constraint of Happiness at Work

Remember the Triple Constraint of Project Management? Remember how horrible and out of sorts it feels when your management attempts to maximize all three sides of the triangle?

Well, I am going to attempt to make an analogy between Project Management constraints and your Happiness at Work constraints. When I say “happiness at work”, I mean a situation when you are mostly happy to go to work every morning.  I don’t have any illusions that things could be perfect, at least not when you’ve been working in your current role for some time. I realize that there are always going to be problems, but those problems shouldn’t give you heartburn every second of every day to make you leave – those are the problems you or somebody else on your team is methodically working to solve and the rest of you are either helping or at least not standing in the way.

So what are the constraints?  During my time in software development, I found that my personal happiness at work rests on three main constraints that are at the base of my professional Maslow’s Pyramid:

  • Product I am building
  • Team I am working with
  • Technology I am using

When all three are balanced (once again, they can’t be all perfect all the time) – I am happy. When one is persistently screwed up, I am beginning to worry. When two are out of sorts, I am headed for the exit, unless I see a way to fix the problem. Let me explain why.

Product I am building gives me and everyone around me purpose. Multiple factors contribute to good or bad feelings about the product. I like when I believe in the product, when I think it solves a real problem, when I find the problem space fascinating, when I feel connected to my users (this could be just an illusion, right?). Take this all away, and suddenly there is no purpose, no goal, no mountain to climb. It could be OK for a short while, but it gets on my nerves very quickly.

Team I work with defines my social environment at work. I like to work in a team that is based on mutual respect and accountability, driven by results and not too political, and who doesn’t?  I also tend to pay attention to processes that exist in a team. How do ideas get communicated up the chain? Are they dismissed out of hand, do you have to be a member of the “in” crowd to be heard? Is your management effective, accessible and open or are they just along for the ride? Do they listen to valid concerns and take action to make the team better? Do they defend the team or just themselves in battles that are worth fighting? Basically, I try to evaluate my team using one main criterion: can we conquer team dysfunction or not?

Technology I deal with on a daily basis is hugely important for me as well. Basically, I want to be able to answer “yes” when somebody asks me question #9 on the Joel Test. I have to spend my working days making technology do what I want, so it is only reasonable that I should be a pretty demanding customer. I don’t want the technology to waste my time – I want it to help me. I also want to enjoy working with it. I also want it to be marketable enough for me to learn. No matter how enjoyable and effective Smalltalk is, the reality check tells me that becoming a Smalltalk expert is probably not as good for my resume as knowing Java or C#. And this is coming from a person that loves Smalltalk. Same goes for internal tools that most companies have.  None of these tools are marketable (they are internal), so if they are clunky, old and hard to work with, I will hate them.

So what do I tend to do when any of these pillars becomes jeopardized? Well, it’s always the same two choices when it comes to your basic needs being threatened: fight or flight, except I always fight first. I stay loyal, I try to influence non-technical decisions about the product, I try to lend my time, expertise and advice to fix my dysfunctional team or I try to push for technological choices that I consider more promising, applicable and marketable.

However, if I can’t convince myself that at least two out of three are acceptable to me, there is nothing I can do to make them acceptable and the situation isn’t getting better, I think that it’s time to move on. As loyal as I am, there is always another team, another project, another challenge.

Because life is too short to spend it in a dead-end job you hate.