Pflogging

the never-ending quest for pragmatic solutions, useful plans, flawless execution, and designs that endure

Confronting uncertainty

Submitted by Bryan Pflug on Thu, 04/01/2010 - 22:34
  • Execution discipline

We are surrounded by uncertainty. We rush to the airport, through unpredictable traffic, only to discover that the regularly scheduled flight we arrived early for has been delayed indefinitely. We watch last year's championship team fail to even make the playoffs this year. We invest in graduate school for our children, only to discover that they may be as likely to go bankrupt as get rich from the experience. Along the way, we may notice how frequently we are affected by uncertainty in time and outcomes. This insight may motivate us to attempt to answer a basic traveler's question: Should our journey or our destination be more important to us?

If we feel the need to achieve both goals, we may recognize that two options present themselves - better provisioning for our travels, or destinations more suited to our preparations. Should we find ourselves in situations where reaching our destinations is a necessary condition of our survival, we may have little choice among these paths. But striving to perform beyond our comfort levels is likely to push up against an inbred resistance to commitments, which results from a natural tension between our desire to please and our fear of failure. In such situations, research indicates that people prefer inertia over interventions (whether delivered through shoves or nudges). Simply put, our behavior when we have insurance is different than when we don't have it.

There are many sources for the uncertainty which places us in this dilemma; not many of these sources are under our control, even when we embrace best practices, local innovations, or tradition. When we study throughput across enough individuals over a sufficiently long timeframe, we will likely recognize that their outcomes fall within a normal distribution with a surprisingly wide range.

Consider 10 independent and concurrent activities, each of which must be completed on time, and a follow-on task which depends upon all of these preceding tasks being completed before it can be done. If you think of each of these outcomes as a binary coin toss, heads you win, and tails you don't, you realize that the only way you win on an estimate with no safety margin is to toss 10 heads in a row - the probability of which is (1/2)**10, or less than a tenth of a percent. And that is when each outcome is as likely to beat its estimate as to miss it, as it would be in a normal distribution.
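The coin-toss arithmetic can be checked in a few lines. This is only a sketch: the function name and the default 50% chance per task are illustrative assumptions, and the tasks are assumed to be fully independent.

```python
def prob_all_on_time(n_tasks: int, p_on_time: float = 0.5) -> float:
    """Chance that every one of n independent tasks finishes on time.

    Assumes independence: the joint probability is the product of the
    individual probabilities, here all equal to p_on_time.
    """
    return p_on_time ** n_tasks


if __name__ == "__main__":
    p = prob_all_on_time(10)
    # 1 chance in 1024 that all ten parallel tasks land on time
    print(f"{p:.4%}")
```

Raising p_on_time shows how much individual safety margin it takes to help: even at a 90% chance per task, ten independent tasks all finish on time only about 35% of the time.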

If only things were that rosy. You see, project outcomes aren't normally distributed, but have long tails to the right. Yet business planning typically expects things to be reduced to a single answer, defining when we'll be done. Thus, unless there are opportunities to alter the amount of work to do in the period allocated to do the work, or a paradigm shift in how to plan or accomplish the work is achieved, the numbers simply work against your ability to achieve predictable outcomes... the only question becomes how to book the losses. Of course, if these activities are not parallel, but sequential, delays accumulate, and things can get worse in a hurry.
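To see what a long right tail does to a single-point estimate, here is a small simulation. It is a sketch under stated assumptions: task durations are drawn from a lognormal distribution (chosen only because it is a convenient right-skewed shape, not because the data above confirm it), the estimate is the median, and the 10-day estimate and the spread parameter sigma are made-up values.

```python
import random


def skew_summary(estimate=10.0, sigma=0.5, trials=20_000, seed=1):
    """Simulate one task many times against a median-based estimate.

    Returns (share of runs that were late,
             average days late per run,
             average days early per run).
    """
    rng = random.Random(seed)
    late_runs = 0
    days_late = 0.0
    days_early = 0.0
    for _ in range(trials):
        # lognormvariate(0, sigma) has median 1.0, so the estimate is
        # beaten about half the time (assumed model, not measured data).
        duration = estimate * rng.lognormvariate(0.0, sigma)
        if duration > estimate:
            late_runs += 1
            days_late += duration - estimate
        else:
            days_early += estimate - duration
    return late_runs / trials, days_late / trials, days_early / trials


if __name__ == "__main__":
    share_late, avg_late, avg_early = skew_summary()
    print(f"late in {share_late:.0%} of runs")
    print(f"average days lost: {avg_late:.2f}, average days gained: {avg_early:.2f}")
```

Under these assumptions the task is late only about half the time, yet the days lost outweigh the days gained, so a chain of such tasks drifts late on average even when every individual estimate looks fair.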

Engineering development is especially subject to these factors, since development activities are primarily journeys of discovery. Our ability to perform such work within fixed schedules is typically determined by hidden constraints and unknown tolerances. Just as it takes only one accident anywhere along our commute to make us late for work, the impacts of these discoveries on our engineering activities accumulate over time. Since the causes and effects are often separated in time, it is difficult to recognize the underlying factors which caused this uncertainty in the first place, unless we are able to recognize and act upon it quickly and effectively. Yet if we do act, the consequence is never experienced, and may thus appear insignificant. And once such effects are aggregated within a top-level schedule, the associated delays are either offset by other tasks which were completed faster than expected, or become difficult to isolate from other factors. As a result, we lose the opportunity to recognize the patterns of behavior which trigger uncertainty. And through this, we also fail to learn from those patterns.

Steve McConnell and others have described the impacts of such factors on software projects as the Cone of Uncertainty. His funnel representation of this uncertainty, as depicted on the right, is intended to highlight two things. First, it reinforces that in the absence of an organized approach to planning and execution, we will not consistently manage the cost or value we can deliver for our projects, unless we are very lucky. Trouble will find us, and when it does, we will be forced to react to it as it occurs. The impacts of such trouble (as depicted in the light blue cloud in the figure on the right) can easily affect project outcomes by a factor of 4 or even more. When such growth in cost and schedule occurs across many projects, it is an indication that the planning and estimating processes and approaches being used on those projects are inadequate. Second, while uncertainty can never be entirely eliminated, it should be reduced to acceptable limits as quickly as possible (as depicted by the blue lines in the graphic).

The further a project has progressed, the fewer the opportunities there are to address these sources of uncertainty. As a result, understanding the sources of this uncertainty, and acting on them, can be a significant advantage for projects. A white paper from Liquid Planner summarizes why managing this uncertainty is crucial to making better decisions on projects:

Ill-defined tasks are full of uncertainty. When you defer these tasks until late in the project, whatever risk buffer you have remaining is likely too small to accommodate the uncertainty. But because uncertainty is virtually impossible to detect when using single-point estimates, the project manager is often completely unaware that the schedule is very likely to slip, even if common sense says it is so. Since, in this scenario, you are unaware that you need to adjust your schedule to retain sufficient risk buffer, you end up with little or no buffer, Murphy strikes, and the project is late.

For my own projects, I believe that the biggest sources of such variation have arisen from:

  1. Having to abandon plans or designs after the project's requirements or scope has changed
  2. Not taking into account the variations in performance across different team members
  3. Failure to recognize or mitigate risk
  4. Relying on technology which was unproven, unfamiliar, or still evolving
  5. Not having the right information, tools, or resources when we needed them to perform the work
  6. Making rushed (and sub-optimum) decisions based upon available, but incomplete information
  7. Allowing communications breakdowns to occur that led to significant rework
  8. Being careless in implementing one or more steps of an operation
  9. Overlooking one or more details while performing a complex task
  10. Implementing a more robust solution than was necessary

Unfortunately, we usually have nothing but our own faulty memories and biases to discover such sources, since the environments which we typically use to perform our work do not allow us to capture or account for such factors. Further, it is easy for us to become victims. It is easy for us to blame others for nearly everything on this list. For example, item 3 above can just as easily be expressed as "Encountering more adverse conditions than we had planned for". Similarly, item 5 could be written as "Not getting good enough requirements". But such expressions place organizations in a reactive, rather than proactive, world view that is quite difficult to escape from.

Until we accomplish accurate and unbiased assessments of our own performance and its causes, a requirement to define confidence levels for future performance usually leads us to view our past experiences through rose-colored glasses. This was one of the ideas behind the Personal Software Process. If such information could be recorded at its source, as our work is performed, it would be much more reliable, accessible, and usable for such purposes. At some level, with enough discipline, and with data collected over a long enough period of time, we should be able to come to understand the relative importance of such factors on the accuracy of our schedule predictions, and use that information to improve our predictions going forward. After all, remarkably good predictions have been teased out of unpredictable events as diverse as Olympic medal performance and political upsets. Yet we rarely have the luxury of having as much time or data as we would like in making such predictions. Given this, we must come clean about what we know and what we don't, and accept the fact that even though we may believe we are capable of beating the odds (the Lake Wobegon Effect), we will not be able to do it consistently.

We avoid acknowledging this uncertainty in discussions with our leaders and our customers. They expect us to make accurate predictions about future events, and then deliver on these commitments. Yet we are rarely successful at this. In response, we too often seek and find ways to fudge our predictions and our results - perhaps by blurring responsibilities for parts of our assignments, by re-shuffling and re-phasing our plans, by warning about all the other things that should be fixed instead, or by focusing exclusively on the things under scrutiny at the time. And none of these tactics is satisfying or meaningful when the business situation we are in is demanding and unforgiving.

We traditionally respond to this uncertainty by trying to add a safety margin to our estimates. But we find we cannot add much of a safety margin, because our leaders believe that the effort we make on projects tends to expand to fill such time (Parkinson's Law). If we have extra time, we use it to make our work as good as it possibly can be, rather than give it back to the project to mitigate the problems others may encounter. Sometimes, we even put off some of our work until the last minute (the student syndrome). As a result, leaders do everything they can to squeeze out every bit of contingency they sense is in our schedule. Yet research indicates that individual developers are more likely to overcommit than they are to pad their estimates, due to a desire to please their customers. After following this path several times, we usually find ourselves in a conflict between adding too much safety, and adding too little. As we gain experience, we probably come to realize that delivering on our commitments is only possible through combinations of discipline, flexibility, risk management, and good fortune.

The insurance we add to our projects to insure ourselves against this uncertainty is not license for us to extend our estimates arbitrarily, or avoid focusing on our goals. The amount of insurance we end up with is a compromise between opposing views that may all be legitimate under different scenarios. But the effects of overestimating and underestimating do not have offsetting impacts on project outcomes. Steve McConnell points out that underestimates have a greater negative impact on project outcomes than the benefits overestimating is assumed to offer, since underestimates so often result in nonlinear penalties which arise from associated planning errors, accumulation of technical debt, and utilization of high-risk practices.

Rather than attempting to make precise, single-point predictions about ambiguous future events, it is better to first reach an agreement on priorities for deliverables, and establish a target confidence level for schedule predictability. With this information, each outcome can then be elaborated into the specific actions that will produce the optimum result in the short and long term. For each of these actions, we should then identify both an upper and lower bound for what it will take to perform that work, based upon evaluations by the responsible individuals of their experience with similar efforts in the past. The purpose of estimating this range is to integrate and communicate the team's understanding of the uncertainty potential for all future tasks. Thinking about and communicating this uncertainty accomplishes several things. It reinforces that this uncertainty exists. It provides us with an opportunity to budget for this uncertainty, when aggregated over multiple activities. Finally, the information produced by this analysis can be used to identify the best opportunities to reduce this uncertainty, and thus shorten the overall duration of the project. Not all of this uncertainty can be reduced, but the opportunities to tackle it can be prioritized (and should be paid for through funds expressly budgeted for these purposes), so the easiest ones with the greatest impacts can be actively and successfully pursued. Discovering these opportunities is extremely useful in increasing the likelihood that tasks complete early, rather than late. And the information necessary to discover these opportunities is only available to us if we honestly and transparently confront this uncertainty in a structured manner.
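One way to budget for ranges across several activities is sketched below. The approach is an assumption of this sketch, not a method prescribed here: it treats each range as symmetric around its midpoint, assumes the tasks vary independently, and combines the half-widths by root-sum-square rather than simply adding them. The task names and day counts are hypothetical.

```python
import math

# Hypothetical (low, high) estimates in working days for sequential tasks
TASK_RANGES = {
    "requirements": (6, 10),
    "design": (8, 20),
    "prototype": (5, 9),
}


def aggregate_ranges(ranges):
    """Combine per-task (low, high) ranges into one overall range.

    Midpoints add directly. Half-widths are combined by root-sum-square,
    which assumes the tasks vary independently of each other.
    """
    expected = sum((low + high) / 2 for low, high in ranges)
    spread = math.sqrt(sum(((high - low) / 2) ** 2 for low, high in ranges))
    return expected - spread, expected + spread


def rank_by_uncertainty(task_ranges):
    """Task names ordered from widest range to narrowest."""
    return sorted(task_ranges, key=lambda name: task_ranges[name][1] - task_ranges[name][0], reverse=True)


if __name__ == "__main__":
    low, high = aggregate_ranges(TASK_RANGES.values())
    print(f"overall range: {low:.1f} to {high:.1f} days")
    print("reduce uncertainty first in:", rank_by_uncertainty(TASK_RANGES)[0])
```

The combined range is narrower than the sum of the individual worst cases, which is the budgeting opportunity; the ranking points at the task where reducing uncertainty would pay off most. If outcomes are skewed to the right, as argued earlier, this symmetric treatment will understate the risk.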

As an example, consider an engineering development effort that has the probability distributions depicted in the chart on the right. One of the early tasks required in this hypothetical effort might be to elicit and reach agreement on the critical requirements for a key feature of the system under development. Such a task might be expected to take between 6 and 10 working days to complete, with a confidence level of 80%. If communicated to all stakeholders, such information would signal that in 4 out of 5 similar efforts, prior projects have accomplished similar work within 10 days.

As related tasks are planned, and their dependencies on this requirements development determined, the collective activities should be sequenced so that the most likely chain of durations is developed and represented, to provide protection from such variability propagating within and between tasks. One often mistakenly believes that by sequencing successive tasks using a 50% likely completion time from a prior event, and aligning those to the earliest start date for the successor task, the resulting task chain will then have a 50% chance of completing on time. What happens instead is that delays cascade: if time lost in one task is never recovered by its successors, the probability of final success falls to 1/(2**n), where n is the number of tasks in the chain. Thus, if one has estimated 4 tasks, each with only a 50% chance of completing according to schedule, the likelihood of completion on time is reduced to 1/16, or under 10%. Instead of such optimistic planning, a set of safety time buffers, or task float times, should be identified and placed along the longest path of planned activities using the critical path method.
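Finding the longest path that those buffers should protect can be sketched as follows. The task network, names, and durations are hypothetical, and this is a bare-bones illustration of the critical path idea rather than a full scheduling tool.

```python
from functools import lru_cache

# Hypothetical task network: name -> (duration in days, predecessor names)
TASKS = {
    "requirements": (10, ()),
    "design": (15, ("requirements",)),
    "test_plan": (5, ("requirements",)),
    "build": (20, ("design",)),
    "verify": (8, ("build", "test_plan")),
}


def critical_path(tasks):
    """Return (task names on the longest path, total duration in days)."""

    @lru_cache(maxsize=None)
    def earliest_finish(name):
        duration, predecessors = tasks[name]
        # A task cannot start until its slowest predecessor has finished.
        return duration + max((earliest_finish(p) for p in predecessors), default=0)

    # Start from the task that finishes last, then walk back through
    # whichever predecessor finishes latest at each step.
    current = max(tasks, key=earliest_finish)
    total = earliest_finish(current)
    path = [current]
    while tasks[current][1]:
        current = max(tasks[current][1], key=earliest_finish)
        path.append(current)
    path.reverse()
    return path, total


if __name__ == "__main__":
    path, total = critical_path(TASKS)
    print(" -> ".join(path), f"({total} days)")
```

In this made-up network, test_plan is off the longest path and has float to spare, so any safety buffer belongs on the requirements, design, build, verify chain.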

Once such uncertainty in an activity's time performance is identified, further reviews might indicate that this uncertainty was primarily due to concerns about whether a shared vision had been embraced by the subject matter experts who were participating in the requirements development activities, and the uncertainty of how long it would take to integrate their viewpoints. That information, and similar estimate ranges for subsequent tasks, might lead you to conclude that there was only a 50% confidence level in realizing the targeted delivery date, 5 weeks in the future. That might lead to a recommendation for an alternative schedule postulating delivery in 9 weeks.

Based upon this information, the project manager may choose to negotiate an additional safety margin and commit to delivery within a 10 week schedule. Alternatively, the project manager could choose a more aggressive (and riskier) target of 8 weeks. However, it would be foolish to stick with the 5 week schedule, as such a stretch will likely just cause shortcuts to be taken en route, which will then increase rework and become the source of expensive technical debt that will be difficult to address in the long run. Regardless of which of these alternatives is selected, the information most critical to decision-making is which activities have the greatest potential to produce such uncertainty. From that information, mitigation opportunities can be prioritized and selected which have the greatest chance of reducing this uncertainty before the task actually begins. In this way, alternative approaches can be identified, evaluated, and validated 'just in time', in order to utilize as little of the project's 'insurance' as necessary.
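Choosing between candidate delivery dates amounts to reading durations off a distribution at different confidence levels. The sketch below shows the mechanics only; the simulated project (five sequential tasks, each with a lognormal duration and a median of one week) is an assumption made for illustration and is not meant to reproduce the 5, 8, 9, or 10 week figures above.

```python
import math
import random


def duration_at_confidence(samples, confidence):
    """Smallest sampled duration that at least `confidence` of the samples do not exceed."""
    ordered = sorted(samples)
    index = max(0, math.ceil(confidence * len(ordered)) - 1)
    return ordered[index]


def simulate_totals(n_tasks=5, sigma=0.6, trials=10_000, seed=7):
    """Total duration in weeks of n sequential tasks, over many simulated runs.

    Each task duration is lognormal with a median of 1 week (assumed model).
    """
    rng = random.Random(seed)
    return [sum(rng.lognormvariate(0.0, sigma) for _ in range(n_tasks)) for _ in range(trials)]


if __name__ == "__main__":
    totals = simulate_totals()
    for confidence in (0.5, 0.8, 0.9):
        weeks = duration_at_confidence(totals, confidence)
        print(f"{confidence:.0%} confidence: {weeks:.1f} weeks")
```

The gap between the 50% figure and the 80% or 90% figure is the 'insurance' being negotiated, and the tasks that contribute most to that gap are the ones worth mitigating first.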

Engineers have a professional responsibility to include uncertainty in their estimates, so that we can maximize the percentage of promises which we keep. The IEEE-CS/ACM Software Engineering Code of Ethics requires that software engineers "Ensure realistic quantitative estimates of cost, scheduling, personnel, quality, and outcomes on any project on which they work or propose to work, and provide an uncertainty assessment of these estimates." Firms and contracting agencies have similar challenges; in "Can we afford our own future", Deloitte recommends that government agencies require "all program cost estimates... be reported as a range of likely costs that include the associated levels of risk and uncertainty." The first step in solving such problems is for us to admit we have the problem in the first place.
