Wednesday, February 2, 2011

Technical Debt is a Form of Waste

Technical debt. The term has been popular for a while and is used to refer to a lot of scenarios. It was originally coined by Ward Cunningham to help us recognize that quick and dirty coding sets us up for increased future development effort. It has expanded to include the operational support of code. In my experience, one of the most overlooked forms of technical debt is created when operational support requirements are excluded from the project life cycle.

In many companies, operational support is not considered part of the project life cycle. Operational and project work are often completely decoupled at the lower and middle levels of the IT department, only dovetailing because both divisions report to the CIO or VP of IT. However, this very component of the life cycle, operational support, can deliver the most value to the customer.

This disconnect is created when communication between IT and the rest of the business is limited. When the business defines objectives and timelines for a project with limited or no interaction with the operations team, the operations requirements that are needed to support various projects are presented in an abbreviated form or left out entirely. The project team might not even know that operations requirements exist until the code is almost ready to go live, if then.

I recently participated in an emotionally charged meeting that was scheduled to review the root cause of the failure of a client-facing system. The project team was angry with the operations team for not supporting their code in a highly available (HA) environment. The operations team was defensive, stating that they didn’t have the time or resources to address their backlog, including the HA environment.

The project team specified an HA environment in their runbook as part of their handoff to IT operations. The operations team acknowledged the HA system as a requirement to adequately support the live code, prioritized it, and assigned its creation to an operations resource. However, the HA system’s completion was not considered release-gating by the project team, so the project was declared complete as soon as the code was live.

The project team expected their live code to be considered part of the production system, with accompanying up-time SLAs. The IT operations group was working on the HA environment, but didn’t have dedicated resources on the project, so its completion was a long, drawn-out process. The same people working on HA were also doing regular maintenance tasks, working on systems for other projects, and putting out fires.

The key point of failure in this conflict was in not consciously determining the value of the various requirements in the project. It is important in this type of dispute for the business to determine where the customer value delivery exists. Most IT departments don’t consider customer value when negotiating this type of delivery. Without this determination, it is almost impossible to correctly determine where to focus resources.

Part of the miscommunication was created in the beginning of the project, when people from the operations team were not included in the project team. It was exacerbated when the code was permitted to launch without an HA environment. It really snowballed when the project team went on to the next project without fully delivering the value of the code in terms of the delivery life cycle. The operations team, with its own unique requirements, pushed code live without ensuring that the requirements were completed, and then repeated the pattern all over again.

After just a few project life cycles, it becomes clear that the decoupling of IT operations requirements from business requirements within consecutive projects creates a backlog for IT operations that can’t be surmounted without drastic remediation efforts. Furthermore, this lack of oversight often reveals a lack of understanding of value delivery life cycles. When domains are permitted to hand their delivery over the fence to the next group, they tend to value quantity over quality.

Fortunately, this problem has a simple solution. It’s not easy, because it requires being realistic about how many projects your company can complete in a given amount of time. If you are in the habit of ignoring IT operations requirements, you might be surprised at how much paying attention to them will slow your project completion rate.

You might have to adjust your understanding of the number of operations people you truly need to support your project work. You need to understand that the value you’re creating for your customer is not realized until what you have developed is fully implemented and the customer is finding the value in it. You have done nothing for value delivery if the product you’re delivering is not delivered in full.

In summary, pay attention to three key aspects to prevent a buildup of technical debt as a side-effect of project work:

  1. Make operations requirements release-gating.
  2. Don’t launch on temporary environments to get the product out.
  3. Deliver your product to deliver value, not presence.

Bring operations people into the project early and often in the life cycle. In fact, make them an integral part of the life cycle.
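As a rough illustration of the first point, a release check can treat operations requirements exactly like project requirements. This is a minimal sketch, not a prescribed tool: the `Requirement` and `Release` classes, the `owner` labels, and the example release name are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Requirement:
    name: str
    owner: str  # hypothetical label: "project" or "operations"
    done: bool = False


@dataclass
class Release:
    name: str
    requirements: list[Requirement] = field(default_factory=list)

    def blockers(self) -> list[str]:
        """Names of all unfinished requirements, regardless of owning team."""
        return [r.name for r in self.requirements if not r.done]

    def can_go_live(self) -> bool:
        """The release is gated until nothing is outstanding."""
        return not self.blockers()


# Hypothetical release mirroring the HA story: code is ready, HA is not.
release = Release(
    "client-facing-system",
    [
        Requirement("feature code complete", owner="project", done=True),
        Requirement("HA environment built", owner="operations", done=False),
    ],
)

if not release.can_go_live():
    print("Release blocked by:", ", ".join(release.blockers()))
```

The point of the sketch is that `can_go_live` never looks at `owner`: an unfinished operations requirement blocks the launch just as an unfinished feature would.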

Projects are not complete unless the systems on which they live are complete and delivering value to the customer. Anything not providing value to the customer is considered waste. Be the change agent in your organization in halting the creation of technical debt.

Jen Browne and Patrick Phillips

Tuesday, January 18, 2011

Lean Brings Value to IT Operations

Lean practices are becoming increasingly integrated with traditional business practices. Lean provides a way for companies to increase value, meaning that a company's clients are willing to pay for a greater percentage of that company's activities, which improves its bottom line.

At its heart, Lean provides an avenue for businesses to reduce waste. Waste is simply work which does not add value to an organization’s service or product. The reduction of waste can result in the recognition of tangible results such as lower transportation costs, greater individual productivity or reduced spending on inventory storage.

With documented, real-world results, it's no wonder that Lean has taken hold throughout modern business management. However, the use of Lean as a management tool is noticeably absent in one area: IT operations. I'm not pointing the finger at these teams of intelligent, motivated people. After working to bring the value of Lean to IT operations for the last decade, I understand their unique difficulties.

Even in companies that embrace Lean practices in IT development, the operations side of the house generally finds Lean to be a foreign topic. The resistance to Lean is easier to understand once one recognizes the basic nature of IT operations as an interrupt-driven entity.

Lean requires established processes as guidance, but many IT operations teams struggle to quantify their operational work. We fit occasional improvements in between putting out fires. We don't always have defined processes, established metrics, or a strong understanding of how individuals on the team accomplish their responsibilities as part of the whole. Even shops with firm frameworks, like ITIL or MOF, often display gaps in process mapping.

A good example of an IT operations group not understanding its own processes came to me from Niel Nickolaisen, a highly respected IT turn-around CIO and proponent of Lean IT. He described talking with an IT operations group about their process for moving changes into production:

"They had two processes: the normal process and the emergency process. The normal process was long, complex, and bureaucratic. The emergency process was simple and straightforward. So, everyone made sure that their changes were an emergency so that they could follow the simple, straightforward process. I asked them how they had determined the few steps in the emergency process. They said that they had analyzed the normal process, identified the essential process steps, and included those in the emergency process. I told them to make the emergency process the only process. The only difference between normal and emergency should be timing. The proposed changes that follow the normal process are reviewed once a week. The emergency changes require gathering the reviewers RIGHT NOW to review the proposed change. Otherwise they are identical. And, since they both only include the essential process steps, there is no reason to create an emergency to bypass process complexity."
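The "one process, two timings" idea can be sketched in a few lines of code. This is only an illustration under stated assumptions: the step names in `ESSENTIAL_STEPS`, the `ChangeRequest` class, and the choice of Wednesday as the weekly review day are hypothetical, since the story does not say what the team's actual steps or schedule were.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical essential steps; the story does not list the real ones.
ESSENTIAL_STEPS = ("describe change", "peer review", "rollback plan", "approve")


@dataclass(frozen=True)
class ChangeRequest:
    title: str
    emergency: bool = False


def steps_for(change: ChangeRequest) -> tuple:
    """Every change follows the same essential steps, emergency or not."""
    return ESSENTIAL_STEPS


def review_time(change: ChangeRequest, submitted: datetime,
                weekly_review_weekday: int = 2) -> datetime:
    """Timing is the only difference between normal and emergency changes.

    Emergency changes are reviewed immediately. Normal changes wait for the
    next weekly review (assumed here to be Wednesday, weekday index 2).
    """
    if change.emergency:
        return submitted
    # Days until the next review day, always strictly in the future.
    days_ahead = (weekly_review_weekday - submitted.weekday()) % 7 or 7
    return submitted + timedelta(days=days_ahead)


normal = ChangeRequest("rotate log files")
urgent = ChangeRequest("patch failing load balancer", emergency=True)
submitted = datetime(2011, 1, 18, 9, 0)  # a Tuesday

print(steps_for(normal) == steps_for(urgent))
print(review_time(normal, submitted), review_time(urgent, submitted))
```

Because `steps_for` ignores the `emergency` flag entirely, declaring an emergency buys a faster review but no shortcut through the process.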

The value of creating a Lean process for this team seems obvious, in retrospect, but the waste created by the added complexity of the normal process existed for years until the process was identified and refined. Add up a few of these "obvious" process improvements, examine the resulting waste, and it becomes apparent that there is great potential for bringing the value of Lean to IT operations.

Here’s how you can start thinking about Lean in your IT operations. Today, familiarize yourself with Lean and its uses within IT. Use a mind map tool to fully understand your organization’s work, the number of services it delivers, and where you are putting your collective effort.

Next month, identify a target process from the mind map and map the value stream for that process. This will allow you to begin to organize the foundations for Lean. With your identified process you can start a Lean pilot in your operations group.

Next year, begin to expand Lean's application and the breadth of tools being deployed, including Agile, Self-Service, etc. Based on your early experiences with applying Lean, your organization will begin to see additional processes where you can direct your Lean focus.

Jen Browne and Patrick Phillips
