An addendum to my earlier post on the Cone of Uncertainty. You shouldn't read this, it's too long and boring.

Today the mailman brought me my used copy of Boehm's Software Engineering Economics (1981). This is the "original" source for the CoU, and I'd been feeling silly about not owning a copy, given how often it is cited.

Anyway, I eagerly turned to page 311 which I knew from an earlier quick skim of a borrowed copy was where the "cone" was discussed. (Boehm doesn't give it a name.)

And found a footnote that I missed the first time around: "These ranges have been determined subjectively, and are intended to represent 80% confidence limits, that is 'within a factor of four on either side, 80% of the time'". (Emphasis mine.)
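To make the footnote's arithmetic concrete, here is a minimal sketch of what "within a factor of four on either side" means for a single estimate. The function name and the 100 person-month figure are made up for illustration; nothing here comes from Boehm's book beyond the factor itself.

```python
def factor_range(estimate, factor=4.0):
    """Return the (low, high) bounds 'within a factor of `factor` on either side'."""
    # "On either side" is multiplicative: divide for the low bound,
    # multiply for the high bound.
    return (estimate / factor, estimate * factor)


# Hypothetical early estimate of 100 person-months.
low, high = factor_range(100.0)
print(f"Claimed 80% range: {low:g} to {high:g} person-months")
```

Note how lopsided the range is in absolute terms: the upper bound sits 300 person-months above the estimate, the lower bound only 75 below it.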

This puzzled me, because I'd always thought that the Cone was drawn from empirical data. But no. It's strictly Boehm's gut feel.

And then I chanced across this bit from a bibliography on estimation from the Construx website: "Laranjeira, Luiz. 'Software Size Estimation of Object-Oriented Systems,' IEEE Transactions on Software Engineering, May 1990. This paper provided a theoretical research foundation for the empirical observation of the Cone of Uncertainty."

What empirical observation?

Curious, I turned to the 1990 paper. Its first three words are "the software crisis" - for a software engineering mythbuster, that's a very inauspicious start, the "crisis" being itself a software engineering myth of epic proportions - possibly the founding myth of software engineering.

The fun part is this bit, on page 5 of the paper:


>> Boehm studied the uncertainty in software project cost estimates as a function of the life cycle phase of a product. The graphic in Fig. 2 shows the result of this study, which was empirically validated [3, Section 21.1] <<

The reference in brackets is to the 1981 book - in fact precisely to the section I'd just read moments before. Laranjeira, too, takes Boehm's "subjective" results to be empirical! (And "validated", even.)

He then proceeds to do something that I found quite amazing: he interprets Boehm's curve mathematically - as a symmetrical exponential decay curve - and, given this interpretation plus some highly dubious assumptions about object-oriented programming, works out a table of how much up-front OO design one needs to do before narrowing down the "cone" to a desired level of certainty about the schedule. Of course this is all castles in the air: no evidence as foundation.

Even funnier is this bit from McConnell's 1996 book "Rapid Development":

>> Research by Luiz Laranjeira suggests that the accuracy of the software estimate depends on the level of refinement of the software's definition (Laranjeira 1990) <<

This doesn't come right out and call Laranjeira's paper "empirical", but that is strongly implied if you don't know the details. In fact the paper "suggests" nothing of the kind; it quite straightforwardly assumes it, and then goes on to attempt to derive something novel and interesting from it. (At least a couple of later papers that I've come across tear Laranjeira's apart for "gross" mathematical errors, so it's not even clear that the attempt is at all successful.)

So, to recap: Boehm in 1981 is merely stating an opinion - but he draws a graph to illustrate it. At least two people, McConnell and Laranjeira, fall into the trap of taking Boehm's graph as empirically derived. And someone who came across McConnell's later description of Laranjeira's "research" should be forgiven for assuming it refers to empirical research, i.e. with actual data backing it.

So what do I make of all this?

First, that there is a "telephone game" flavor to this whole thing that is reminiscent of patterns I've seen elsewhere, such as the 10x myth. The technical term for it is "information cascade" (http://en.wikipedia.org/wiki/Information_cascade), where people take as true information that they should be suspicious of, not because they have any good reason to believe it but because they have seen others appear to believe it. This is, for obvious reasons, not conducive to good science.

Second, the distinction between empirical and conceptual science may not be clear enough in software engineering. Mostly that domain has consisted of the latter: conceptual work. There is a recent trend toward demanding a lot more empirical science, but IMHO this is something of a knee-jerk reaction to the vices of old, and may end up doing more harm than good: the problem is that software engineering seems bent on appropriating methods from medicine to cloak itself in an aura of legitimacy, rather than working out for itself methods that will reliably find insight.

Third, people should stop quoting the "Cone of Uncertainty" as if it were something meaningful. It's not. It's just a picture which says no more than "the future is uncertain", which we already know; but saying it with a picture conveys misleading connotations of authority and precision.

If you have things to say about software estimates, think them through for yourself, then say them in your own words. Don't rely on borrowed authority.
9 comments
 
I'm sad the cone is not empirical. I'm glad you took the steps to find the evidence. I'm glad you shared the reality.

The concept of the Cone of Uncertainty still feels anecdotally correct. I will have to drop the math in future descriptions of it.

 
OK Laurent, you can count me as falling into the trap as well...
 
A cone-like shape, perhaps with some power law, is a good metaphor for the kind of chaotic influences on a software project. The parameters are closely related to how similar a project is to anything else the team has worked on, how stable the tools and technologies are, how controllable the third-party dependencies are, etc. ad infinitum... but at least it's some shared 'metric' we can be hanged with when it goes tits up.
 
The underlying assumption with the cone appears to be that the closer we are to a future event, the more (relevant) data we have about it and hence the more we should be able to quantify or understand it against our current position. The assumption appears reasonable (except for 'Black Swan' events), at least for systems that are not unduly influenced by uncontrollable external factors (markets, weather, people :) ). I wonder if the idea originated from the idea of causality that relativity introduced (http://en.wikipedia.org/wiki/Light_cone)?
 
Hi Gus! What's weird about the Boehm "cone" is that it's inverted: it narrows toward the future instead of widening.

Cones that widen toward a more uncertain future are an intuitive depiction for natural events. There's a "cone of uncertainty" in some writings about real options, which is derived from a binomial lattice (think of a road that keeps forking, and at each fork you turn left or right at random).
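For the curious, here is a rough sketch of that forking-road picture. It is only an illustration under my own assumptions (fair left/right turns, an 80% band chosen to echo Boehm's footnote, made-up function names), not something taken from the real options literature.

```python
from math import comb


def lattice_distribution(steps):
    """Exact probability of each end position after `steps` random forks."""
    # With k right turns out of `steps`, the position is k - (steps - k).
    return {2 * k - steps: comb(steps, k) / 2 ** steps for k in range(steps + 1)}


def central_interval(steps, coverage=0.8):
    """Smallest symmetric interval (-w, w) holding at least `coverage` probability."""
    dist = lattice_distribution(steps)
    for w in range(steps + 1):
        mass = sum(p for pos, p in dist.items() if abs(pos) <= w)
        if mass >= coverage:
            return (-w, w)
    return (-steps, steps)


# The further ahead we look (more forks), the wider the band gets.
for steps in (1, 4, 16, 64):
    print(steps, central_interval(steps))
```

The band grows with the number of forks, which is the "natural" cone: wider the further out you look.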

Projects are not natural events: they're a result of us "steering" the future in unnatural directions. Perhaps that's where an inverted cone might make sense as a metaphor. But it doesn't strike me as intuitive at all.
 
Hmmm... I've not seen the original, but I thought the cone represented the uncertainty over time. As Gus states, when time passes and we're closer to an expected event (or to the original expected date, anyway), then our uncertainty is less. Is that not what Boehm depicts in his book?

Looking at http://en.wikipedia.org/wiki/Cone_of_Uncertainty that's certainly not the depiction used for hurricane tracks.

BTW, that article also refers to Gorey, J.M. (1958), "Estimate Types", AACE Bulletin, November 1958, and Bauman, H. Carl (1958), "Accuracy Considerations for Capital Cost Estimation", Industrial & Engineering Chemistry, April 1958.
 
Hi George, I'm trying to get my hands on that Bauman article. I'm not convinced that Boehm was in any way inspired by that article though; it's a relatively recent addition to the WP page and as far as I can tell was never cited by Boehm.
 
Hi Laurent!

Thanks for the great research job.

I agree with George. Many subjects/events are somewhat unknown on project day 1. Some of them can be almost completely unknown. As the project evolves, these subjects/events tend to become better known. Some of them become completely known. Even though some new unknowns can show up, the tendency is for uncertainty to diminish. On the project's last day, it is expected that no unknowns persist.

The cone shape only reminds us that on project day zero (planning), with a huge amount of uncertainty, we should apply a large error margin to our estimates. As the project goes on, and the uncertainties come down, we can estimate with more confidence (a smaller error margin).

Paraphrasing you, the cone is just a picture which says no more than "the future is uncertain", but the uncertainty decreases during the project life-cycle, and this should be considered for estimations.