An addendum to my earlier post on the Cone of Uncertainty. You shouldn't read this, it's too long and boring.

Today the mailman brought me my used copy of Boehm's Software Engineering Economics (1981). This is the "original" source for the CoU, and I'd been feeling silly about not owning a copy, given how often it is cited.

Anyway, I eagerly turned to page 311 which I knew from an earlier quick skim of a borrowed copy was where the "cone" was discussed. (Boehm doesn't give it a name.)

And found a footnote that I missed the first time around: "These ranges have been *determined subjectively*, and are intended to represent 80% confidence limits, that is 'within a factor of four on either side, 80% of the time'". (Emphasis mine.)

This puzzled me, because I'd always thought that the Cone was drawn from empirical data. But no. It's strictly Boehm's gut feel.
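To make that concrete (this is my arithmetic gloss on the footnote, not Boehm's own wording): if A is the effort a project eventually turns out to require and E is an estimate made at the feasibility stage, the claim amounts to something like

    P(E/4 <= A <= 4*E) ≈ 0.8

with the factor shrinking at later phases as the cone narrows. Both the 4 and the 80% are, by Boehm's own footnote, judgment calls rather than measurements.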

And then I chanced across this bit from a bibliography on estimation from the Construx website: "Laranjeira, Luiz. 'Software Size Estimation of Object-Oriented Systems,' IEEE Transactions on Software Engineering, May 1990. This paper provided a theoretical research foundation for the empirical observation of the Cone of Uncertainty."

What empirical observation?

Curious, I turned to the 1990 paper. Its first three words are "the software crisis" - to a software engineering mythbuster, a very inauspicious start: the "crisis" is itself a software engineering myth of epic proportions, possibly the founding myth of software engineering.

The fun part is this bit, on page 5 of the paper:

>> Boehm studied the uncertainty in software project cost estimates as a function of the life cycle phase of a product. The graphic in Fig. 2 shows the result of this study, which was empirically validated [3, Section 21.1] <<

The reference in brackets is to the 1981 book - in fact precisely to the section I'd just read moments before. Laranjeira, too, takes Boehm's "subjective" results to be empirical! (And "validated", even.)

He then proceeds to do something that I found quite amazing: he interprets Boehm's curve mathematically - as a symmetrical exponential decay curve - and, given this interpretation plus some highly dubious assumptions about object-oriented programming, works out a table of how much up-front OO design one needs to do before narrowing down the "cone" to a desired level of certainty about the schedule. Of course this is all castles in the air: no evidence as foundation.
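To give a flavor of the exercise, here is a toy reconstruction - mine, not Laranjeira's actual equations; the only number taken from the source is Boehm's subjective starting factor of 4, everything else is an illustrative assumption. Model the uncertainty factor as decaying exponentially from 4x at the start of the life cycle to 1x at delivery, then invert the curve to ask how far through the life cycle you must be before the factor drops to a given target:

    import math

    # Toy model (illustrative assumptions, not Laranjeira's math):
    # the uncertainty factor decays exponentially from 4x at the start
    # of the life cycle (t = 0) down to 1x at delivery (t = 1).
    START_FACTOR = 4.0               # Boehm's subjective "factor of four"
    DECAY = math.log(START_FACTOR)   # chosen so the factor is exactly 1.0 at t = 1

    def uncertainty_factor(t):
        """Uncertainty factor after completing fraction t of the life cycle."""
        return START_FACTOR * math.exp(-DECAY * t)

    def fraction_needed(target_factor):
        """Invert the decay: fraction of the life cycle needed to reach target_factor."""
        return math.log(START_FACTOR / target_factor) / DECAY

    for target in (2.0, 1.5, 1.25, 1.1):
        print("factor %.2f: about %.0f%% of the life cycle" % (target, 100 * fraction_needed(target)))

The output has the same general shape as Laranjeira's table - you have to be most of the way through before the range gets tight - but every number in it is only as good as the assumed curve, which is exactly the problem.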

Even funnier is this bit from McConnell's 1996 book "Rapid Development":

>> Research by Luiz Laranjeira suggests that the accuracy of the software estimate depends on the level of refinement of the software's definition (Laranjeira 1990) <<

This doesn't come right out and call Laranjeira's paper "empirical", but to a reader who doesn't know the details, that is strongly implied. Yet the paper "suggests" nothing of the kind; it quite straightforwardly assumes it, and then attempts to derive something novel and interesting from that assumption. (At least a couple of later papers I've come across tear Laranjeira's analysis apart for "gross" mathematical errors, so it's not even clear the attempt succeeds.)

So, to recap: Boehm in 1981 is merely stating an opinion - but he draws a graph to illustrate it. At least two people, McConnell and Laranjeira, fall into the trap of taking Boehm's graph as empirically derived. And someone who came across McConnell's later description of Laranjeira's "research" should be forgiven for assuming it refers to empirical research, i.e. with actual data backing it.

So what do I make of all this?

First, there is a "telephone game" flavor to this whole thing that is reminiscent of patterns I've seen elsewhere, such as the 10x myth. The technical term for it is "information cascade" (http://en.wikipedia.org/wiki/Information_cascade): people accept as true information they ought to be suspicious of, not because they have any good reason to believe it, but because they have seen others appear to believe it. This is, for obvious reasons, not conducive to good science.

Second, the distinction between empirical and conceptual science may not be clear enough in software engineering. So far the field has consisted mostly of the latter: conceptual work. There is a recent trend toward demanding much more empirical science, but IMHO this is something of a knee-jerk reaction to the vices of old, and it may end up doing more harm than good: software engineering seems bent on appropriating methods from medicine to cloak itself in an aura of legitimacy, rather than working out for itself methods that will reliably find insight.

Third, people should stop quoting the "Cone of Uncertainty" as if it were something meaningful. It's not. It's just a picture which says no more than "the future is uncertain", which we already know; but saying it with a picture conveys misleading connotations of authority and precision.

If you have things to say about software estimates, think them through for yourself, then say them in your own words. Don't rely on borrowed authority.