Sunday, July 6, 2008

Agile Project Management: Lessons Learned at Google

At QCon London 2007, Jeff Sutherland described how Google’s AdWords team approached implementing Scrum. He started by describing the similarities between the Toyota way and the Google way. At Toyota, the team gets together and first produces a working prototype. After examining the prototype and learning what failed and what worked, the team builds another one. Through constant improvement, excellence is built into both the team and the process. At Google, management was removed and the developers were empowered to take charge of their own products. They were in direct contact with the users and received continuous feedback, so they learned quickly when something failed and could move on to the next thing.

Next, Jeff described four different social structures:

1. Bureaucracy: Disciplined and coercive – rigid rule enforcement, extensive rules and regulations, hierarchical controls

2. Autocracy: Whimsical and coercive – top-down control, minimal rules and procedures, hierarchical controls

3. Leadership: Disciplined and empowering – empowered employees, rules and procedures as enabling tools, hierarchy supports organizational learning

4. Organic: Whimsical and empowering – empowered employees, minimal rules and procedures, little hierarchy

Then Jeff gave a brief overview of the AdWords product. It has millions of users and is available in many languages. Each release requires new training. The code base is about 500 KLOC and growing. There are five distributed teams, with multiple projects running across them. Because of the size and complexity, a group of managers meets regularly. They noticed that they did not have enough information to make basic decisions.

The ScrumMaster introduced structure gradually. He did not implement anything top down. Instead, he asked the engineers about their problems and then suggested possible ways to solve them. When he asked what their biggest problem was, they identified missing dates as a major issue, so he suggested a release burndown chart to track progress. He knew that with only one Scrum practice in place many other things would still not work, but he planned to address those gradually. After the team adopted the burndown chart, the next set of problems was QA not knowing how to test the features, duplication of work, and dependency creation. The ScrumMaster suggested daily meetings to address these issues. However, the burndown chart and daily meetings were not enough to meet dates. Velocity was decreasing week by week, but the tech leads still insisted that they would deliver on time. The problem was that the team was tracking finished tasks rather than finished features. This is a work-in-progress problem, so they started keeping the number of open tasks to a minimum.
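The difference between tracking finished tasks and finished features can be illustrated with a small sketch. The talk did not show any code; the `Feature` class, the point values, and the function names below are hypothetical and exist only to show why a task-based burndown can look healthier than a feature-based one.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    # Hypothetical record for one backlog item; not from the talk.
    name: str
    points: int       # estimated size of the whole feature
    tasks_total: int  # number of tasks the feature was split into
    tasks_done: int   # number of those tasks marked finished

    @property
    def done(self) -> bool:
        # A feature counts as done only when every task is finished.
        return self.tasks_done == self.tasks_total


def remaining_by_tasks(features):
    """Remaining points if partially finished features get partial credit."""
    return sum(f.points * (1 - f.tasks_done / f.tasks_total) for f in features)


def remaining_by_features(features):
    """Remaining points if only fully finished features get credit."""
    return sum(f.points for f in features if not f.done)


def in_progress(features):
    """Names of features that are started but not finished (work in progress)."""
    return [f.name for f in features if 0 < f.tasks_done < f.tasks_total]


backlog = [
    Feature("reporting", points=8, tasks_total=4, tasks_done=3),
    Feature("billing", points=5, tasks_total=5, tasks_done=5),
    Feature("localization", points=8, tasks_total=4, tasks_done=2),
]

print(remaining_by_tasks(backlog))     # optimistic, task-based view
print(remaining_by_features(backlog))  # feature-based view
print(in_progress(backlog))            # candidates for a work-in-progress limit
```

With this made-up backlog, the task-based view reports far less remaining work than the feature-based view, because two features are mostly done but not shippable. Keeping the in-progress list short is one way to read "keeping the number of open tasks to a minimum".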

The first retrospective on this partial implementation revealed that the burndown chart had helped and that there was more teamwork and more QA involvement in the process. However, there were still too many bugs, priorities were fuzzy, and the team was still blowing dates. The team identified the need to clearly prioritize the product backlog and to develop iteratively. They extended the definition of done to cover all development work, including testing. They also added a Sprint burndown, which would show failure a lot sooner.

Gradually, all Scrum practices were implemented so that the team could pass the Nokia test. Jeff defined the Nokia iteration test as having timeboxed iterations of less than 6 weeks, with tested and working software at the end of each iteration (unit and functional testing). Also, the iteration must start before the specification is complete.

Then there is the Nokia Scrum test, which means that you know who the product owner is, the product backlog is prioritized by business value, the estimates are created by the team, the team generates the burndown charts and knows its velocity, and there are no project managers disrupting the team’s work.
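The two checklists above can be written down as a simple self-assessment. This is only a sketch of the criteria as summarized in this post; the `TeamPractices` class, its field names, and the idea of combining both checks into one result are assumptions made for illustration, not something presented in the talk.

```python
from dataclasses import dataclass


@dataclass
class TeamPractices:
    # Hypothetical self-assessment record; field names are illustrative.
    iteration_weeks: float
    tested_working_software_each_iteration: bool
    iteration_starts_before_spec_complete: bool
    product_owner_known: bool
    backlog_prioritized_by_business_value: bool
    estimates_created_by_team: bool
    team_generates_burndown: bool
    team_knows_velocity: bool
    project_managers_disrupt_team: bool


def passes_iteration_test(t: TeamPractices) -> bool:
    """Nokia iteration test as summarized above."""
    return (
        t.iteration_weeks < 6
        and t.tested_working_software_each_iteration
        and t.iteration_starts_before_spec_complete
    )


def passes_scrum_test(t: TeamPractices) -> bool:
    """Nokia Scrum test as summarized above."""
    return (
        t.product_owner_known
        and t.backlog_prioritized_by_business_value
        and t.estimates_created_by_team
        and t.team_generates_burndown
        and t.team_knows_velocity
        and not t.project_managers_disrupt_team
    )


def passes_nokia_test(t: TeamPractices) -> bool:
    # Assumption: a team passes overall only when both checklists pass.
    return passes_iteration_test(t) and passes_scrum_test(t)


team = TeamPractices(
    iteration_weeks=2,
    tested_working_software_each_iteration=True,
    iteration_starts_before_spec_complete=True,
    product_owner_known=True,
    backlog_prioritized_by_business_value=True,
    estimates_created_by_team=True,
    team_generates_burndown=True,
    team_knows_velocity=False,
    project_managers_disrupt_team=False,
)

print(passes_iteration_test(team))
print(passes_scrum_test(team))
print(passes_nokia_test(team))
```

In this invented example the team passes the iteration test but not the Scrum test, because it does not yet know its velocity.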

The team kept on operating the same way even after the ScrumMaster went on vacation for 3 weeks.

This presentation is available on InfoQ at