Friday, March 11, 2011

Leveraging Agile Principles in IT Operations: 2 of 4

Working systems over comprehensive documentation

Agile Overview
Last week I discussed how you can reap the benefits of a structured framework and create opportunities for continuous improvement in your IT operations team by emphasizing individuals and interaction over processes and tools. This concept is based on the idea that the processes and tools that your IT operations team uses, including the organizational framework, should be driven by the team’s values rather than by a desire to police productivity.

This week I’m going to talk about the second precept of the Agile Manifesto: Working software over comprehensive documentation. Although software is integral to the work of a site operations team and members of the team are often required to code as part of their regular work, creating working software is not the primary function of an IT department, so we’re going to modify this precept to reflect the core competency of IT operations: Maintaining working systems over comprehensive documentation.

The Documentation Monster
If you want to provoke a negative emotional response in your IT operations team, tell them everyone needs to focus on documentation. It’s not fun. It’s hard to know where to start. It takes away from the time your team can devote to deploying, fixing and maintaining systems. It’s not generally the strongest skill in the IT professional’s toolkit.

On the other hand, documentation can save a significant amount of time when troubleshooting. It can foster knowledge transfer. It shortens the learning curve for new hires trying to find their way around your systems. It can be integral to a good disaster recovery plan.

Given the advantages and drawbacks to documenting your systems, how can you find a reasonable approach to creating the documentation?

Document Key Information
The key to this approach is understanding the difference between comprehensive documentation and valuable documentation. Comprehensive documentation tries to record everything; valuable documentation records what supports the needs of your team.

Lay the groundwork for creating a culture that embraces documentation by discussing the values of your group. What drivers would make documentation valuable to you? As a team, determine what system information you need to track. This will generally come down to four key ways the knowledge will be used:
  1. Reducing troubleshooting time
  2. Sharing knowledge
  3. Training new hires
  4. Supporting disaster recovery
Recognizing the value and purpose of documentation can support the culture of creating and maintaining it. Keep in mind that if a piece of information doesn’t strongly support one of the ways your group will use the documentation, it is extraneous. Extraneous information costs time and energy to track and adds nothing to the value the IT operations team delivers.

Maintaining Working Systems
It’s rare to have the luxury to build systems and documentation together from the ground up. Many start-ups are scrambling just to have working systems, much less document them. Most of us have to start with a system that is already in place.

Examine your system. Is it mature? Do you have instrumentation in place, including telemetry, monitoring and alerting, failure detection, and comprehensive, readable logs? Does it fail gracefully? Is it designed for recovery? Are the necessary high availability sub-systems in place? These are the foundation for maintaining working systems. Identify your gaps, prioritize them and create a plan for putting them in place.
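One way to make the gap review concrete is to keep the questions above as a small checklist and sort the open items by priority. The following is only a rough sketch: the `Check` class, the `gaps` function, and every status and priority value in `CHECKS` are made up for illustration, and your team would supply its own.

```python
from dataclasses import dataclass


@dataclass
class Check:
    name: str       # capability taken from the questions above
    in_place: bool  # hypothetical status; fill in for your own system
    priority: int   # hypothetical scale: 1 = most urgent


def gaps(checks):
    """Return the checks that are not yet in place, most urgent first."""
    return sorted((c for c in checks if not c.in_place), key=lambda c: c.priority)


# Illustrative values only; none of these come from a real system.
CHECKS = [
    Check("telemetry", True, 2),
    Check("monitoring and alerting", False, 1),
    Check("failure detection", False, 2),
    Check("comprehensive, readable logs", True, 3),
    Check("graceful failure", False, 3),
    Check("designed for recovery", True, 1),
    Check("high availability sub-systems", False, 4),
]

if __name__ == "__main__":
    for check in gaps(CHECKS):
        print(f"P{check.priority}: {check.name}")
```

The sorted list is a starting point for the plan, not the plan itself; the team still has to agree on what each priority means.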

Document as You Go
It’s not practical to drop the regular work of an IT operations department and go on a documentation spree for a week. For one thing, everyone will take the week off. For another, knowing where to start documenting is a daunting proposition.

Instead, document as you go. When a team member is working on a standard task, whether deploying, fixing or maintaining, use the opportunity to look at the system from a documentation standpoint as part of that task. Refer back to the ways your group will use the documentation as a guideline for creating it.

Another way to create documentation is to embrace failure. When your systems fail, use the opportunity to strengthen them, address the root causes, and document both the fix and the new, stable state. This can support future troubleshooting efforts while creating training material.
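A lightweight template helps make sure each failure write-up captures the same things. Here is a minimal sketch, assuming you keep write-ups as plain text; the `IncidentNote` class and its field names are hypothetical and simply mirror the points above (root cause, fix, and the new stable state).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class IncidentNote:
    """Hypothetical record of one failure and how it was resolved."""
    system: str
    occurred_on: date
    symptom: str
    root_cause: str
    fix: str
    stable_state: str  # how the system is configured after the fix

    def render(self) -> str:
        """Format the note as plain text for a wiki page or a shared file."""
        lines = [
            f"Incident: {self.system} ({self.occurred_on.isoformat()})",
            f"Symptom: {self.symptom}",
            f"Root cause: {self.root_cause}",
            f"Fix: {self.fix}",
            f"New stable state: {self.stable_state}",
        ]
        return "\n".join(lines)
```

Because the note records both the symptom and the fix, the same page can serve troubleshooting later and double as training material for new hires.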

Looking Ahead
In part 3 of this series I will examine how the third precept of the Agile Manifesto, customer collaboration over contract negotiation, supports an IT operations environment.

Jen Browne and Patrick Phillips
