06 January 2009

Risk Management for Software Projects

Large software projects are risky because

  1. They involve people.
  2. They can be complex.
  3. They depend on people and companies outside your direct control, such as OS makers and 3rd party library vendors.
  4. There is little flexibility to make big changes at the end of big software projects. This was described elegantly in The Mythical Man-Month.

Risk Management Strategies

  1. Understand your project's final deliverable very well, including your customer's needs and quality requirements. 
  2. Manage for an appropriate level of risk, not zero risk, which is unachievable in any case. Learn which risks give the best returns and which lead to the worst problems. 

    1. Take the costs of managing risks into account; don't spend more time measuring, avoiding and mitigating risk than you get back from doing so. 
    2. The appropriate level of risk depends on the benefit of your project succeeding, the cost of it failing and the probability of it failing.
  3. Base schedule estimates on a reasonable expectation of things going wrong from time to time. An honest schedule allows you to eliminate non-essential features at the start of a project and improve its chance of success. It also wins you the respect of seasoned developers, sets a tone of realism, and provides the foundation for a successful project.

    1. If in doubt, pad your estimated schedule based on the accuracy of your previous schedule estimates, e.g. if your last few projects took 1.5 times as long as you expected, multiply your current schedule estimate by 1.5. This is a way of admitting the limitations of your ability to estimate schedules. As you get better at estimating schedules you will be able to pad less. Meanwhile it is critical that you understand and compensate for your limitations. Padding is necessary.
    2. Take a minimax approach and minimize your project's reasonable worst case schedule. The reasonable worst case depends on your project's circumstances but it will be something like the 90th percentile, 95th percentile or other high percentile in the distribution of possible schedules.
  4. When your project gets started, keep track of your real schedule. Be good at discovering reality. 

    1. Work with today's real schedule and expectations. This is similar to the popular Agile methodology. Even if your company does not use Agile, once development starts you need to deal with the current reality of your project. If you can regain your original schedule, that's great, but for as long as the plan and the reality differ, it is the reality you must manage.
    2. Devise good metrics and use them to compute a real schedule. E.g. the number of open bugs may be an accurate metric that reflects the true state of the project, but it probably doesn't predict the ship date accurately, since bug fixes (like any code change) introduce bugs. Code churn will almost certainly be a better predictor of ship date, but code churn is meaningful only as long as bug fixes are being driven to the final (shipping) bug fix number.
    3. Remember that projects need to converge. Doing extra work at the end of the project invariably leads to slippage. 
    4. Don't shortchange any critical upstream development activities (Steve McConnell #20). If something is going to have to be done then do it at the most efficient time. Design is much more effective if it is done before coding. Testing is more effective if code is written with testing in mind. Etc. All incomplete critical activities are risks.
  5. Minimize the number of risks in a project. 

    1. Try to avoid introducing multiple new technologies in a single project.
    2. If you cannot avoid one or more major risks (e.g. new technology, new supplier, new market) in a project then avoid non-essential risks, e.g. if you have to introduce a new technology for the project but you can delay the new supplier until the next project then do so. If your company is large enough to support multiple simultaneous projects then spread the risks between projects.
    3. People are part of the risk. If you have to take a risk then don't add to the risk with inexpert or untried developers or developers who are subject to external pressures. Likewise this is a bad time to use newly formed teams, teams whose members don't back each other up or teams going through major issues.
    4. Infrastructure and organization are part of the risk. If you have to take a risk then support the risk-taking team with your organization's best infrastructure, including IT, HR, facilities, etc. Shield the risk-taking team from distractions such as re-organizations, moving to new premises, learning a new email system, heavy personnel review processes, non-critical training etc.
  6. Break down major risks. Break down complex tasks into several smaller tasks so that one failure won't bring the whole large task down. 

    1. In a series of tasks with measurable completion milestones, failures to achieve the milestones will become apparent early while there is still time to recover.
    2. Breaking a big task down into a set of concurrent tasks can either increase risk by adding coordination risks or decrease risk if some of the sub-tasks can fail without causing the whole task to fail. If the coordination risk can be minimized and sub-task failures are tolerable then this is a good strategy.
  7. Move risky items to the start of the project. If something goes wrong with the risky items this gives you time to address the problem while code and designs can be changed without high risk. Performing this step rigorously will distinguish well risk-managed projects from other projects. 
  8. Keep a top N (say N=10) risks list. Risks are everywhere and eternal vigilance is required. However not all people have this mindset. A top N risks list is an easy-to-grasp way of communicating risks to a group.
  9. Follow good software development and management practices. Avoid the classic mistakes because they can be easily avoided by reading the list (which I wish I had read before I made most of them). Many standard software engineering principles minimize risk, so don't unlearn them when you work on your first commercial product. In particular don't stop using good development practices when your project is under pressure.
  10. Expect the unexpected.

    1. Never ever forget Murphy's Law. Expect to make mistakes. 
    2. Effective risk management requires sensitivity to risk. If you are not emotionally wired in this way then you will need to learn how to think this way.
    3. Keep in mind that while the past at best provides an imprecise guide to the future (see the item on schedule padding), at worst it provides no indication of the future at all. This is exemplified by Taleb's Turkey: Imagine that you're a turkey. You've eaten well and lived in safety every day of your life. Everything in your experience tells you that tomorrow will be no different. Then Thanksgiving arrives.
    4. As advised above, adapt to the current reality, even if it differs from your plan.
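
Strategies 2 and 8 above can be combined into a small calculation. The sketch below is a minimal illustration in Python, not a prescribed method. It assumes each risk can be given a rough probability and a schedule cost in weeks, and it ranks risks by probability times cost (often called risk exposure). The `Risk` class, the `top_risks` function, and the sample risks and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # assumed chance the risk materializes, 0.0 to 1.0
    cost_weeks: float   # assumed schedule cost in weeks if it does

    @property
    def exposure(self) -> float:
        # Assumption: exposure = probability * cost, a common convention.
        return self.probability * self.cost_weeks


def top_risks(risks, n=10):
    """Return the n risks with the highest exposure, largest first."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)[:n]


# Hypothetical risk register with made-up numbers.
risks = [
    Risk("New rendering library misses performance target", 0.3, 12),
    Risk("Third-party vendor ships late", 0.5, 4),
    Risk("Key developer unavailable", 0.1, 8),
]

for risk in top_risks(risks, n=2):
    print(f"{risk.exposure:4.1f} weeks  {risk.name}")
```

The numbers will always be rough guesses. The point of the ranking is to give the group a short, shared list to argue about, not to produce a precise forecast.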

Summary of Risk-Based Schedule Prediction

  1. Knowns. Plan these rigorously.
  2. Known unknowns. Pad the schedule for these, e.g. predicted schedule + 2 standard deviations.
  3. Unknown unknowns. These require eternal vigilance and adapting to the current reality.
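
The padding rule for known unknowns can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the history of actual-to-estimated schedule ratios is made up, the function name `padded_estimates` is hypothetical, and applying "mean + 2 standard deviations" to that ratio is only one possible reading of the rule.

```python
import statistics


def padded_estimates(raw_estimate_weeks, past_ratios):
    """Pad a raw estimate using past (actual / estimated) schedule ratios.

    Needs at least two past ratios, because the sample standard
    deviation is undefined for a single data point.
    """
    mean_ratio = statistics.mean(past_ratios)
    std_ratio = statistics.stdev(past_ratios)  # sample standard deviation
    return {
        # Typical outcome: scale by how late past projects ran on average.
        "expected": raw_estimate_weeks * mean_ratio,
        # Reasonable worst case: mean ratio plus 2 standard deviations.
        "reasonable_worst_case": raw_estimate_weeks * (mean_ratio + 2 * std_ratio),
    }


# Hypothetical history: the last four projects took 1.2x, 1.5x, 1.8x
# and 1.5x as long as originally estimated.
past_ratios = [1.2, 1.5, 1.8, 1.5]

result = padded_estimates(20, past_ratios)
print(f"expected: {result['expected']:.1f} weeks")
print(f"reasonable worst case: {result['reasonable_worst_case']:.1f} weeks")
```

With this made-up history, a raw 20 week estimate becomes about 30 weeks expected and just under 40 weeks as a reasonable worst case. With only a handful of past projects the standard deviation is itself a rough guess, so treat the result as a planning aid rather than a prediction.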

What Makes Risk Sensitive Managers Different

Risk management is not just part of software development project management like design, scheduling and presentation skills. A truly risk-sensitive approach to project management requires explicitly managing by risk. This means that if your scheduling and prioritization are not based on realistic risk estimates then you are not managing risk well. For example:
  1. If, in a project involving 100 engineers, a group of 3 engineers is working on a completely new piece of code that implements difficult algorithms for a critical deliverable and every other engineer is making incremental changes to the new code, then a risk sensitive manager would focus on this group of 3 engineers.
  2. If a project introduced a new software technology, brought in a new hardware supplier and entered a new untested market then a risk sensitive manager would try to break it into 3 projects, each with only one risk.
  3. MOST IMPORTANT EXAMPLE. A risk manager assumes that a) many things that could go wrong will go wrong and b) there is limited scope to correct mistakes at the end of a project. Therefore a risk sensitive manager will trim features at the start of a project to give the project a reasonable chance of success.
Risk sensitive managers manage like this even if doing so is at odds with other management techniques.
  1. In example 1, the risk sensitive manager would be focusing on the risky group of 3, even if it meant PRDs coming in late and the low-risk 97 engineers getting less than optimal help. The risk sensitive manager would work hard to move work from the group of 3 to other people in the 100, and negotiate with other clients of the group of 3 to lighten their load.
  2. In example 2, the risk sensitive manager would lobby high and low through his/her company to avoid the dangerous confluence of risks.
  3. Example 3 illustrates what separates risk sensitive managers from average software managers: the absence of wishful thinking in decision making. In #13 in the previous link Steve McConnell says: "Wishful thinking isn't just optimism. It's closing your eyes and hoping something works when you have no reasonable basis for thinking it will. Wishful thinking at the beginning of a project leads to big blowups at the end of a project. It undermines meaningful planning and may be at the root of more software problems than all other causes combined."
And good risk managers always attack uncertainty.

Further Reading

Risk Management bared down to one Question
Why MS Project Sucks for Software Development
