
Managing Risk and the Shadow Backlog

As an approach to developing software, agile has had a profound impact.  There are many great architectures, languages, libraries, frameworks, etc. that have helped software engineering get better. But in my experience almost none of these has been as important as the agile process.

Agile has many advantages, but one of the most important and least articulated (especially to executive leadership, who should care the most) is risk management.  Unfortunately, this aspect of agile can also be misunderstood by its practitioners: the development and test team.

So why does agile reduce risk?  Because the agile process creates transparency.  Nothing creates more risk than hiding information.  Nobody can make a decision to reduce or avoid risk without good information.  This is not to suggest that teams using waterfall approaches are intentionally hiding the truth about their project's progress.  Rather, the problem is that it is simply very difficult to be truthful.  In waterfall it is hard for the team to assess actual progress because there is no feedback until the end of the project.  In other words, the team cannot really tell how far they have to go because there is so much uncertainty about how much work remains. In contrast to agile, where completed work is continuously being delivered, the waterfall approach "guesses" what is coming in future stages (and even in the active stage).  Leadership, of course, wants answers, so teams are highly motivated to provide assessments and estimates of their status and future work.  So...they provide very bad information.  This is bad risk management.

Agile allows the team (and leadership) to begin to understand the merit of the idea (and the actual effort) much earlier.  If a waterfall project lasts one year, the leadership team won't really know if the idea has merit until they have paid for a year of work. And that assumes the best case: the design and development team perfectly understood what needed to be built and perfectly estimated the time and cost.  However, the reality is that most product development (agile OR waterfall) is very dynamic.  The team learns more about what it should do and how it should do it as the work continues.  In agile, leadership might learn in a few months that the idea isn't going to work OR that the work will actually be much more expensive than originally believed. At this point the team can make a more informed decision: kill the project or invest more.  That is good risk management.

Some agile teams unintentionally subvert the risk management objective by misunderstanding its importance.  I refer to this subversion as the "shadow backlog".

Several behaviors can create a "shadow backlog", but bug handling is the most common cause.  It shows up when teams decide to defer fixing bugs while they work on stories.  This often happens because leadership is intensely watching the progress of stories and ignoring bugs.  If the team wants to make leadership happy they will work on the stories, not the bugs: "we'll find time to work on the bugs later".  In the context of this problem there are two broad categories of bugs:

  • Bugs the product owner has already decided need to be fixed before the product can be released (we'll call these "required") and
  • Bugs the product owner has decided can compete with other functionality.

Not all bugs should be fixed. Repeat: Not all bugs should be fixed. In fact, if a team ships a product with 0 open bugs it would suggest they haven't tested well enough or they spent too much money on fixing defects. ALL software is released with thousands of bugs. Most, we would hope, the team hasn't discovered yet but many are still open because the product owner has decided other functionality was more important than fixing those bugs. Ultimately bugs are just little features. Bugs should be managed in the backlog just like stories. As such it is totally OK to let bugs drift to the bottom of the backlog if they are in the second category (not required).

"Required" bugs are different.  Again, "required" bugs are problems the product owner already knows MUST be fixed before the product can be released.  These bugs should be fixed as soon as possible for at least two reasons:
  1. Familiarity.  It is well accepted that the earlier a bug is found and fixed, the lower the cost. Part of this cost can be the result of a bad user experience, but usually it is simply measured through process overhead (the continuum from actively developing the feature to fixing a production issue) and developer efficiency.  The latter is almost exclusively about familiarity. Having the same developer who created the bug fix it shortly after development is probably on the order of 10x more efficient than having a different developer fix it long after it was discovered.  This isn't even counting the cost of red herring analysis, repeated "re-discovery" of the bug before it is fixed, and the additional risk of creating new bugs because the developer doesn't understand the consequences of the change.
  2. Risk management. Deferring bugs that must be fixed adds to a shadow backlog.  This shadow backlog creates bad information.  The team is communicating that "we only have 2 sprints of stories left" when, in fact, there might be many more sprints of bug fixes to follow (hard to estimate until a developer triages the bugs).

Another shadow backlog creator is the "overachiever". In this scenario the awesome developer/test team takes it upon themselves to add features as they do their work. This is often innocent: "We realized all of our views need a logoff button while we were doing the 'logoff button on dashboard' story so we are going to add this right now". The problem with the overachiever approach is that it cuts out the backlog prioritization process. NO application should ever get every feature desired by its product ownership. That would surely result in wasted money. The pressure of time and cost is an important guide for software development. Teams should implement the highest priority work. The overachiever reduces the capacity of the team by adding un-prioritized work. Perhaps there is another feature the product owner wants, but the team just lost x hours of total capacity to adding the logoff button to every view. Again, this is a risk management issue. When work that was not prioritized is introduced, the backlog and future capacity become harder to judge.

The last shadow backlog scenario in this little drama is the "deferred story". This is similar to the required bug problem. The team understands that the product is missing some important features that are "a must" for delivery into production.  Sometimes this is functionality.  Often it is something more system oriented: analytics, monitoring, performance features, etc. Sometimes it is simply some horribly fractured code which desperately needs refactoring. This work must be done before the app can be released. Pretending it is not there simply misstates the progress of the team. More work in the shadow backlog. Again, bad risk management.
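
To make the gap concrete, here is a rough sketch of the arithmetic. Everything in it is made up for illustration: the item names, the point values, the velocity of 20 points per sprint, and the assumption that shadow work has been estimated at all (in practice it often hasn't). It simply compares the number of sprints a team would report from its visible backlog with the number it actually needs once the must-have work hiding outside the backlog is counted.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkItem:
    title: str
    points: int                  # hypothetical estimate, in story points
    required_for_release: bool   # must be done before shipping
    in_backlog: bool             # visible and prioritized in the real backlog


def sprints_remaining(items, velocity):
    """Sprints needed to finish the given items at a fixed velocity."""
    total_points = sum(item.points for item in items)
    return math.ceil(total_points / velocity)


# Hypothetical numbers: one visible bucket of stories, three shadow items.
items = [
    WorkItem("Remaining stories", 40, True, True),
    WorkItem("Required bugs deferred for later", 30, True, False),
    WorkItem("Monitoring and analytics", 20, True, False),
    WorkItem("Refactoring needed before release", 10, True, False),
]
velocity = 20  # assumed points completed per sprint

# What the team tells leadership: only the visible backlog counts.
reported = sprints_remaining([i for i in items if i.in_backlog], velocity)

# What is actually true: everything required for release counts.
actual = sprints_remaining([i for i in items if i.required_for_release], velocity)

print(f"Reported: {reported} sprints, actual: {actual} sprints, "
      f"hidden in the shadow backlog: {actual - reported} sprints")
```

With these invented numbers the team honestly reports 2 sprints of stories left while 5 sprints of required work remain. Moving the shadow items into the real backlog doesn't shrink the work, but it lets the product owner see it and decide what to do about it.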

It is often easy to think that risk management is somebody else's concern, but the reality is that it affects everything:  stress, chaos, hurried quality, cost, etc.  We all care about managing risk.



Comments

  1. It is interesting sometimes to watch teams where management does not value refactoring work or other technical debt. They don't see the drag that technical debt can have on a project and would rather prioritize features instead. How do developers participate in the prioritization process so that important things that don't add new features can bubble to the top of the list?

  2. I hear that concern from dev teams regularly but I usually discover that nobody has actually communicated the concept of tech debt to the product owner. My personal experience is that the product owner is quite sympathetic if they understand this is a normal part of the agile process.
