This method is essentially a combination of the Delphi estimation method, which I have used before, and an agile estimation method. Check out my friend Travis’ post on Cost Estimation for more on what prompted me to try something different.
Planning poker is the usual method my team uses to come up with project estimates, and it is excellent for getting everyone on the same page. It works well, but I like to try new things to see what works best. I firmly believe that continuous experimentation is the only way to discover things that work better than what you are doing now.
One of my primary goals for this experiment is to find a method which avoids the phenomenon of anchoring as much as possible.
Let’s see what it looks like.
Step One: Deliverables Defined
Make sure that you have your work breakdown structure defined properly. If you’ve read my book, you know that the work breakdown structure is extremely important to me, regardless of whether you are working on a waterfall project or an agile/lean project.
If you don’t start out with a good definition of your scope, then everything else down the line is for naught. Garbage in, garbage out.
Step Two: Relative Sizing Delphi
The first round I used for my teams this time relied on the relative sizing method that many agile teams use when trying to come up with estimates for their projects.
So there was a range of T-shirt sizes from XXS to XXL, and everything in between, for my team to select from.
Whenever you use any kind of relative sizing model, you need to make sure that your team is comfortable with the scale first. Be sure to explain that the sizes are completely relative, which means that one person’s L is different from another person’s L. And that’s okay.
The primary goal of this first round is to make sure that we have a good discussion and that all appropriate questions are raised about what exactly it is that we’re supposed to be doing. It’s really a refinement of the work breakdown structure process.
Step Three: Evaluate Relative Results
After you’ve gotten the relative sizes, enter them all into a spreadsheet so that you can see them side by side. You’ll also have a list of questions from Step Two to go track down the answers to.
In some cases there will be a lot of agreement and consistency in the relative estimates; in other cases there will be wide variances. With relative sizing you must remember that one person’s L is going to be equivalent to someone else’s M, since no calibration of the estimates has taken place. That’s fine. It’s when some people think a feature or item is an XS and others think it’s an L, XL, or XXL that we need to revisit the discussion. As with planning poker, I facilitate discussion by asking questions when there is high variance.
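To make the variance check concrete, here is a minimal sketch of how you might flag items that need more discussion. The size scale matches the one above; the threshold, function name, and sample data are my own illustration, not part of the method as described.

```python
# Hypothetical sketch: flag items whose T-shirt size estimates spread
# too far apart on the relative scale. Threshold is illustrative.
SIZES = ["XXS", "XS", "S", "M", "L", "XL", "XXL"]
RANK = {size: i for i, size in enumerate(SIZES)}

def needs_discussion(estimates, max_spread=2):
    """Return True when estimates span more than max_spread size steps."""
    ranks = [RANK[e] for e in estimates]
    return max(ranks) - min(ranks) > max_spread

# An S-vs-M split is expected noise; an XS-vs-XL split is a 4-step
# spread, so that item goes back to the group for discussion.
print(needs_discussion(["S", "M", "S"]))
print(needs_discussion(["XS", "L", "XL"]))
```

The exact threshold is a judgment call; the point is simply to automate spotting the XS-versus-XL cases rather than eyeballing the spreadsheet.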
Step Four: Absolute Estimates
In this step we do two important things.
First, we discuss the results of the relative sizing step, without any reference to who estimated what. Remember, we are trying to eliminate anchoring. The focus here is to further refine the understanding of the scope, and have the questions answered from the previous session.
Second, we do a new Delphi round, this time using hours. I like to use effort estimates, and I base them on an ever-increasing scale to reflect that larger estimates are more uncertain. When an estimate falls between two values on the scale, tell the team to always round up. The hours I use are 5, 8, 13, 20, 40, and 80 – anything larger than 80 hours can probably be broken down more discretely, and you can iterate that estimate.
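The "always round up" rule on that scale can be sketched in a few lines. This is just an illustration of the rounding behavior described above; the function name is mine.

```python
# Snap a raw hour estimate up to the next value on the increasing
# scale from the text: 5, 8, 13, 20, 40, 80.
import bisect

SCALE = [5, 8, 13, 20, 40, 80]

def round_up(hours):
    """Round an in-between estimate up to the next bucket on the scale."""
    if hours > SCALE[-1]:
        raise ValueError("over 80 hours: break the item down and re-estimate")
    # bisect_left finds the first scale value >= hours
    return SCALE[bisect.bisect_left(SCALE, hours)]

print(round_up(10))   # between 8 and 13, so it rounds up to 13
print(round_up(20))   # already on the scale, so it stays at 20
```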
Step Five: Analysis and Final Results
Finally, I put all the effort estimates side-by-side again. I look at the optimistic and pessimistic values, take the median for likely, and calculate PERT on each feature/item.
Now there is some judgment involved. I look for cases where the person(s) with the most domain knowledge on a particular item differ widely from the PERT-calculated value. The specialty and experience of the estimators is an important factor: if one person is truly a subject matter expert on a particular item and the rest of the team is not, I value the SME’s estimate more.
This assessment happens for each line item, and in the end I have a rollup I can use, along with standard deviation and other metrics that not only back up the estimates but also provide input for risk or reserve needs. I add things like documentation, meeting time, and general overhead not directly related to implementation; testing may be derived from implementation effort and historical numbers, or estimated separately as its own set of activities.
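The Step Five math can be sketched using the standard three-point PERT formulas, expected = (O + 4M + P) / 6 and sigma = (P - O) / 6, with optimistic as the minimum, likely as the median (as in the text), and pessimistic as the maximum of the team's estimates. The sample items and hours below are made up for illustration.

```python
# Sketch of the Step Five analysis: per-item PERT expected value and
# standard deviation, plus a simple rollup. Data is illustrative only.
import statistics

def pert(estimates):
    """Return (expected, sigma) from a list of per-person hour estimates."""
    o = min(estimates)                    # optimistic
    m = statistics.median(estimates)      # most likely (median, as in the text)
    p = max(estimates)                    # pessimistic
    expected = (o + 4 * m + p) / 6
    sigma = (p - o) / 6
    return expected, sigma

items = {
    "login feature": [8, 13, 13, 40],
    "report export": [20, 20, 40, 80],
}

total = 0.0
for name, hours in items.items():
    expected, sigma = pert(hours)
    total += expected
    print(f"{name}: {expected:.1f}h expected, sigma {sigma:.1f}h")
print(f"rollup: {total:.1f}h")
```

In practice you would do this over every line item in the spreadsheet, and the per-item sigmas feed the risk and reserve discussion mentioned above.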
We’ll see how this experiment turns out. What do you think about this approach?
[photo by Improve It]