Monday, January 31, 2011

Continuous Delivery

Jez Humble talks about delivering software fast and reliably at a DevOps conference. He refers to Mary and Tom Poppendieck's book Implementing Lean Software Development and asks: “How long would it take your organization to deploy a change that involved just one single line of code? Do you do this on a repeatable, reliable basis?” To get an idea of what that looks like, consider Flickr, which reports that it was last deployed x hours ago, including y changes by z people.

Jez recalls a situation where a first attempt to deploy took two weeks. He was working for an ISP with a team of 60 developers. They had set up file caching that worked well in the Windows development environment but failed when deployed onto the production Solaris cluster. The problem was that they had made assumptions that held on Windows but did not apply to Solaris.

Jez explains that to do real testing, we need to be in a production-like environment. In agile environments we are very good at doing analysis, development, testing, and demos as a team, but then we pass things over to QA and IT operations. These hand-offs are an anti-pattern. The team needs to include everyone involved in delivery, and deployment needs to be fully automated. If it takes 2 weeks to deploy, then it takes more than 2 weeks to get feedback.

Jez explains that releasing frequently is important for 3 main reasons:

1. Fast feedback: A tight cycle between thinking of an idea, releasing some software that represents it, getting feedback from users, and very quickly releasing new versions of the software. Example: Flickr started out in gaming, but its team then realized that people were using it for photo sharing.

2. Reduce risk: With a big bang release, the amount of change in each release is large. When releasing frequently, the delta is very small, so it is easy to figure out where a problem is and easy to roll back if needed.

3. Real project progress: New software is only done when it is released. Before being released it is not delivering value.

To achieve continuous delivery, start with value stream mapping. Talk with everyone involved in delivery, measure the time spent doing vs. waiting, and then create an automated system that embodies the process. Every time anyone makes a change, run the unit tests and run the acceptance tests in a production-like environment. Everyone has to collaborate to get changes into production as quickly as possible. Use cycle time as a metric for measuring the productivity of the team.
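
As a rough illustration of measuring doing vs. waiting time, here is a minimal Python sketch. The stage names and hours are made up for the example and do not come from the talk; the `Step` class and the helper functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stage in a hypothetical value stream map."""
    name: str
    doing_hours: float    # time someone is actively working on the change
    waiting_hours: float  # time the change sits idle at this stage


def cycle_time(steps):
    """Total elapsed hours for one change to pass through every stage."""
    return sum(s.doing_hours + s.waiting_hours for s in steps)


def doing_ratio(steps):
    """Fraction of the cycle time spent actually working on the change."""
    total = cycle_time(steps)
    return sum(s.doing_hours for s in steps) / total if total else 0.0


def biggest_wait(steps):
    """The stage with the most waiting time, a candidate bottleneck."""
    return max(steps, key=lambda s: s.waiting_hours)


# Made-up numbers, for illustration only.
value_stream = [
    Step("analysis", doing_hours=4, waiting_hours=20),
    Step("development", doing_hours=16, waiting_hours=8),
    Step("QA hand-off", doing_hours=8, waiting_hours=72),
    Step("deployment", doing_hours=2, waiting_hours=30),
]

print("cycle time (hours):", cycle_time(value_stream))
print("largest wait:", biggest_wait(value_stream).name)
```

With numbers like these, most of the cycle time is waiting rather than doing, which is the kind of finding that tells a team which hand-off to automate first.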

When you are set up for continuous delivery, releases are no longer tied to operational constraints. The business can decide when to release.

Jez refers to research done by Forrester suggesting that some businesses will run internally as if they were start-ups. Business units will act as VCs: they have funding and can provide it to the projects the business wants to invest in. They run each project as a product team, not a project team. The key is not to disband the team when the project goes live. We need to keep the team going as long as the product is in production.

Jez then goes over some principles for continuous delivery:
  • Repeatable, reliable process for releasing software. Releasing should be a push-button operation.
  • Almost everything should be automated. We cannot automate exploratory testing and user acceptance testing, but we can automate almost everything else.
  • Keep everything in version control.
  • If it hurts, do it more often. Pain will force you to change your process and automate things.
  • Build quality in. Everyone is responsible for testing. Developers, operations, etc.
  • Done means released. Done means delivering value. We are not delivering value until the software is released.
  • Everyone is responsible for delivery. Cross functional teams. Everyone should work together to fix bugs.
  • Continuous improvement. Focus on the biggest bottlenecks, improve those, then move on to the next ones, and so on.
Next, Jez covers practices:
  • Only build binaries once. If you rebuild for each environment, you can’t guarantee that the thing you are releasing is the same thing you tested. Separate configuration from the binaries.
  • Deploy the same way to every environment. Create a package and use Puppet to deploy it to every environment.
  • Smoke test your deployment. Check that everything you depend on is in place. Ping them!
  • Keep your environments similar. That is, make sure the process used to manage the environments is similar.
  • If anything fails, stop the line. Everyone should focus on fixing the problem.
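
The smoke-test practice above can be sketched in a few lines of Python. This is only an assumption about what "ping them" might look like: it checks that a TCP connection can be opened to each dependency. The function names are hypothetical, and a real smoke test would usually go further (for example, running a trivial query).

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def smoke_test(dependencies, timeout=2.0):
    """Return the names of the dependencies that could not be reached.

    `dependencies` maps a readable name to a (host, port) pair.
    An empty result means every dependency answered.
    """
    return [
        name
        for name, (host, port) in dependencies.items()
        if not is_reachable(host, port, timeout)
    ]
```

A deployment script could call `smoke_test` with the hosts and ports the application depends on, and stop the line if the returned list is not empty.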

Jez continues by briefly discussing continuous integration, testing, canary releasing, and data migration:

Continuous integration:
  • Continuous is more often than you think. Do not use branches for features. Always check in to the main branch. 
  • If you must branch (for example, when moving to a new architecture), then branch by abstraction and use a config option to choose which implementation to use.
  • Check in new features to the main line, but hide them or turn them off if they are not ready for production.

Testing:
  • Unit tests, integration tests, and system tests: Focus on technology and programming practice.
  • Functional acceptance tests: end-to-end system tests that verify business value.
  • Non-functional acceptance tests: performance, scaling …
  • Showcases, usability, exploratory testing: this is the manual stuff that testers should be spending their time on. Everything else should be automated.
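
To make branch by abstraction and feature hiding from the continuous integration notes more concrete, here is a minimal Python sketch. The `Search` example, the `search_impl` config key, and the `features` dictionary are all hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod


class Search(ABC):
    """The abstraction callers depend on, whichever implementation is active."""

    @abstractmethod
    def find(self, items, term):
        ...


class LegacySearch(Search):
    """Existing behaviour: case-sensitive substring match."""

    def find(self, items, term):
        return [item for item in items if term in item]


class NewSearch(Search):
    """Work in progress on the main line: case-insensitive match."""

    def find(self, items, term):
        return [item for item in items if term.lower() in item.lower()]


def make_search(config):
    """Pick the implementation from a config option; default to the old one."""
    if config.get("search_impl", "legacy") == "new":
        return NewSearch()
    return LegacySearch()


def enabled_features(config):
    """Feature toggle: unfinished features stay off unless switched on."""
    toggles = config.get("features", {})
    return sorted(name for name, on in toggles.items() if on)
```

Both implementations live on the main line and are built and tested on every check-in; only the configuration decides which one production uses.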

Canary releasing: Release the new version of the software to a small subset of nodes, then route a subset of users to those nodes. This helps with A/B testing and performance testing.
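
One possible way to route a subset of users to the canary nodes is to hash each user id into a stable bucket. The Python sketch below assumes string user ids and a percentage-based split; neither detail comes from the talk, and the function names are hypothetical.

```python
import hashlib


def bucket(user_id, buckets=100):
    """Map a user id to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets


def route(user_id, canary_percent):
    """Return 'canary' for roughly canary_percent of users, else 'stable'."""
    return "canary" if bucket(user_id) < canary_percent else "stable"
```

Because the bucket depends only on the user id, a given user keeps seeing the same version between requests, and raising `canary_percent` only ever moves more users onto the new nodes.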

Data migration: Data migration also needs to be part of the deployment process, and it needs to be automated. Data migration can be abstracted (using views and triggers) so that the database works with multiple versions of your application.
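
A minimal sketch of an automated migration step, using Python's built-in `sqlite3` module. The table names, the `schema_version` bookkeeping table, and the migration list are assumptions made for the example, not something described in the talk.

```python
import sqlite3

# Ordered, hypothetical migrations. Each one runs once and is recorded.
MIGRATIONS = [
    (1, "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE users RENAME TO accounts"),
    # Compatibility view so the previous application version, which still
    # reads from `users`, keeps working during the rollout.
    (3, "CREATE VIEW users AS SELECT id, name FROM accounts"),
]


def current_version(conn):
    """Return the highest applied migration number, or 0 for a new database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn):
    """Apply every migration newer than the recorded version."""
    version = current_version(conn)
    for number, statement in MIGRATIONS:
        if number > version:
            conn.execute(statement)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (number,)
            )
            version = number
    conn.commit()
    return version
```

Running `migrate` as part of every deployment makes the schema change repeatable, and running it twice is harmless. The view in this sketch only covers reads; supporting writes from the old application version would need triggers, as the talk suggests.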

Next, Jez explores objections to continuous delivery:
  • Locking down (the need for change control): The reason for it is that things go wrong. But visibility into what is coming and what is changing, plus the ability to roll back, reduces the need to lock things down.
  • Compliance: comply using automation rather than documentation.
  • Auditing: see who does what.
  • Make it easy to remediate outages
Finally, Jez emphasizes that people are the key. Get everyone together from the beginning, keep meeting, make it easy for everyone to see what’s happening, and keep improving continuously.

This presentation is now available on InfoQ.