Sunday, December 13, 2009

The Tyranny of the Plan

Mary Poppendieck gave a talk on lean software development at the UK Lean Conference. She started by describing the construction of the Empire State Building, which was finished exactly on time and 18% under budget:

· 9/22/29 – demolition started

· 1/22/30 – excavation started

· 3/17/30 – construction started

· 11/13/30 – exterior completed

· 5/1/31 – building opened

Mary next explained how the builders managed this before there were computers, Gantt charts or PERT charts. The builders focused on workflow. They did not work out every detail and then lay out a plan; there was no design when the contract was signed. They relied on deep experience under a fixed-price contract. They focused on the key constraint, which was not labor, steel or stone but material flow: getting the right materials to the site at the right time (500 trucks a day delivered materials, and there was no storage on site). The building was designed to be decoupled. There were 4 pacemakers, which were independently scheduled workflows, so cascading delays were avoided. The builders understood that it pays to invest money to save time (cash flow thinking): every day of delay cost 10K (about 120K today). The schedule was not derived from the details of the building design. They created the schedule first and then created the design to fit it. The building was designed around the constraints of the situation (2 acres, zoning ordinances, 35M capital, the laws of physics, and the May 1, 1931 deadline). Traditionally we start by figuring out what we are going to do, break it down into pieces (WBS), sum up the pieces, and there is the schedule. Here they started instead with the constraints and created a schedule that fit within them.

Mary summarized the lessons learned. Design the system to meet the constraints (budget, time); do not derive the constraints from the design. Decouple workflows: break dependencies from both an architectural and a scheduling point of view instead of organizing them on a schedule (PERT chart). Workflows are easier to control and more predictable than a schedule. For control and predictability, establish a reliable workflow instead of a schedule that must be followed.

Next Mary explained that there are 2 reasons we schedule. The 1st is to control when things will happen. However, detailed schedules are deterministic and do not allow for normal (common cause) variation. Machines break down and the weather changes, and a schedule has no way to deal with such variation unless we add slack. Attempting to remove common cause variation from a system will cause it to oscillate wildly. Flow systems, on the other hand, build in slack whenever they need it. Managing a level workflow is a lot easier than following a deterministic schedule.
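The point about deterministic schedules and slack can be illustrated with a small simulation. This is only a sketch under invented assumptions: the task estimates, the triangular distribution (each task takes between 80% and 160% of its estimate, most often exactly the estimate) and the 20% slack figure are all hypothetical and do not come from the talk.

```python
import random

def simulate_totals(estimates, runs=10_000, seed=1):
    """Simulate total project duration when every task varies around its estimate.

    Assumption (not from the talk): each task's actual duration follows a
    triangular distribution between 0.8x and 1.6x of its estimate.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        total = sum(rng.triangular(0.8 * e, 1.6 * e, e) for e in estimates)
        totals.append(total)
    return totals

def on_time_rate(totals, deadline):
    """Fraction of simulated runs that finish on or before the deadline."""
    return sum(t <= deadline for t in totals) / len(totals)

# Hypothetical task estimates in days.
estimates = [10, 5, 8, 12, 6]
plan = sum(estimates)  # the "summed up" deterministic schedule

totals = simulate_totals(estimates)
no_slack = on_time_rate(totals, plan)
with_slack = on_time_rate(totals, plan * 1.2)  # same work, 20% slack added

print(f"on time with no slack:  {no_slack:.0%}")
print(f"on time with 20% slack: {with_slack:.0%}")
```

Under these assumptions the plan with no slack is missed in a large share of runs, even though nobody did anything wrong; the variation is simply part of the system.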

The 2nd reason we schedule is to predict when things will happen. Schedules based on experience are reliable; schedules based on wishful thinking are not. Schedules summed up from a task breakdown are guesses (hypotheses about the future). When reality does not match the schedule, the schedule hypothesis is disproven.

Variance from plan (earned value) is seen as a performance failure. We should instead view it as the schedule’s fault and use it as a learning opportunity. Don’t set performance targets. A schedule is a guess, so measuring variance against the plan is not beneficial. Complex systems will always fail. A failure demonstrates a lack of understanding of the system. Failure is the system talking to you. Listen; it’s information. Take it as an opportunity to learn more and improve the system, and you will be able to harness that complex system gradually and more completely over time. If you instead go looking for the person who created the plan (blame), you will not get better at managing the system.

Next Mary explained how we can use pull scheduling to know when things are going to be done. For example, in the 1st 2 months, understand the customer’s interest and figure out the technical approach. At 5 months, review the proof of concept and decide the alpha release features. At 8 months, produce the alpha release and decide the beta release features. At 11 months, produce the beta release and decide the 1st baseline release. At 13 months, review the beta results and decide the 1st release plan. At 15 months, produce the 1st release. Base estimates on experience and data, not wishful thinking. Time-box, don’t scope-box. Do not ask how long this will take; instead ask what can be done by this date. Integrating events create cadence and pull.
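The milestone example above can be sketched in a few lines of Python. The month offsets and their descriptions come from the talk as summarized here; the function names, the start date, the backlog and the capacity figure are hypothetical and only serve to illustrate the idea of fixing the dates first and then asking what fits.

```python
from datetime import date

# Integrating events from the example: (months after start, what happens).
MILESTONES = [
    (2, "understand customer interest; choose technical approach"),
    (5, "review proof of concept; decide alpha features"),
    (8, "produce alpha release; decide beta features"),
    (11, "produce beta release; decide 1st baseline release"),
    (13, "review beta results; decide 1st release plan"),
    (15, "produce 1st release"),
]

def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamping the day to 28 so it always exists."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, min(start.day, 28))

def pull_schedule(start: date) -> list[tuple[date, str]]:
    """Fix the integrating events on the calendar before any task breakdown."""
    return [(add_months(start, m), what) for m, what in MILESTONES]

def fit_to_timebox(backlog, capacity):
    """Answer "what can be done by this date?" instead of "how long will it take?".

    `backlog` is a priority-ordered list of (feature, size); features are taken
    in order and skipped when they no longer fit the fixed capacity.
    """
    chosen, used = [], 0
    for name, size in backlog:
        if used + size <= capacity:
            chosen.append(name)
            used += size
    return chosen

# Hypothetical start date and backlog.
for when, what in pull_schedule(date(2010, 1, 4)):
    print(when.isoformat(), "-", what)

print(fit_to_timebox([("login", 3), ("reports", 4), ("export", 2)], capacity=5))
```

The dates never move in this sketch; only the scope chosen for each time box changes, which is the sense in which the schedule pulls the work.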

Next Mary described a reliable workflow: receive input, produce output and hand off; ask for and give feedback. It consists of:

· Output: desired customer outcomes.

· Pathway: sequence of activities with a clear workflow

· Connection: Test-driven handovers expose problems, feedback to supplier

· Method: clearly defined methods/standards, baseline for improvement

· Improvement: never add value to defective input; verify that your output meets the needs of your immediate customers; when a problem is exposed, find and fix the root cause.

Finally, Mary explained where plans come from. PERT was credited with the success of the Polaris project, which was completed in 3 years instead of 9 years. In reality, PERT was sold to Congress by its creator (Raborn) and used as a façade to keep Congress happy. It was designed to track an extremely aggressive schedule; cost was not important and was not managed by PERT. It was not used to manage the project in the early years. It was bypassed as unrealistic by technical officers and dismissed as worthless by contractors.

Polaris was successful because of:

· Quality of Leadership: The Technical Director maintained control over performance requirements and made and directed all key technical decisions. Requirements were designed to meet the schedule rather than the schedule being designed to meet the requirements.

· Focus on Deployment: The primary goal was an operational system as early as possible, with a focus on getting submarines in the water as soon as possible. This resulted in timeboxed iterations of technical advancement in 3 stages: simple, more complicated, complex.

· Decentralized, Competitive Organization: Technical people had control of their own decisions. A range of options was developed for all decisions. For every major subsystem, there were at least 3 contractors working on it. The best one won. It sounds expensive, but it is one of the best ways to keep a reliable schedule.

· Emphasis on Reliability: Stringent quality control with excessive testing. Redundant systems were designed in

· Esprit de corps: Personally fostered by technical director. Teamwork and commitment were encouraged and rewarded. Everybody engaged in the system.

This presentation is available on InfoQ at