Leprechauns and Stemmatics - a tutorial
I want to write a little bit about my methods. In the process, we'll unmask a new Leprechaun. We will also see that Leprechaun-hunting is easy - you can do it too. It's a matter of attitude.
Stemmatics, a branch of something called "textual criticism", is the study of text transmission. Here is a longer description, courtesy of Gregory Mayer:
Stemmaticists carefully study texts, and attempt to determine which copies were made from which other copies. The copying is usually traced by the introduction of small copyist's errors, which are then perpetuated in any copies descended from the one in which the error arose. Tracing these small errors allows one to trace the history of the manuscripts.
This is basically what I do, using Google. A comprehensive index of nearly the entire Internet, reaching very far back in history, searchable in an instant - I don't know any stemmaticists personally, but Google strikes me as the Promised Land of stemmatics.
What I do is start with a claim that strikes me as dubious. Today I was looking over various instances of the "46% of features never used" claim, and I noticed it was frequently accompanied by something like the following:
The U.S. Department of Defense (DoD), when following a waterfall lifecycle, experienced a 75% failure rate.
(This is from http://techdistrict.kirkk.com/2010/02/10/agile-the-new-era/ - there are many, many other citations to this work, "Jarzombek 1999".)
Exercise: find more citations, and note the context in which they occur.
My first reflex in such cases is to use the Google "search by date" feature to try to locate the earliest citation I can find. In this case, I soon found a 2002 CrossTalk article: http://sunnyday.mit.edu/16.355/leishman.html
Exercise: use the Google "search by date" feature yourself, using the search terms "Aerospace, Jarzombek, 1999" and try to replicate my results.
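If you'd rather script this step than click through the web interface, Google's Custom Search JSON API supports date-restricted queries. Below is a minimal sketch, not part of the original tutorial: the API key and search engine ID are placeholders you would have to create yourself, and the rest is just one way of wiring up the documented "sort=date" restriction.

```python
# A minimal sketch of scripted, date-restricted Google searching via the
# Custom Search JSON API. API_KEY and ENGINE_ID are placeholders: you
# must create both in the Google developer console before this will run.
import requests

API_KEY = "YOUR_API_KEY"
ENGINE_ID = "YOUR_ENGINE_ID"

def dated_search(query, start, end):
    """Return result items for query, restricted to pages Google dates
    between start and end (both YYYYMMDD strings)."""
    params = {
        "key": API_KEY,
        "cx": ENGINE_ID,
        "q": query,
        "sort": f"date:r:{start}:{end}",  # documented date-range restriction
    }
    resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
    resp.raise_for_status()
    return resp.json().get("items", [])

# Hunt for citations Google dates before 2003:
for item in dated_search("Aerospace Jarzombek 1999", "19990101", "20021231"):
    print(item["link"], "-", item["title"])
```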
At the 5th Annual Joint Aerospace Weapons Systems Support, Sensors, and Simulation Symposium in 1999, the results of a study of 1995 Department of Defense (DoD) software spending were presented. A summary of that study is shown in Figure 1. As indicated, of $35.7 billion spent by the DoD for software, only 2 percent of the software was able to be used as delivered. The vast majority, 75 percent, of the software was either never used or was cancelled prior to delivery. The remaining 23 percent of the software was used following modification.
Now, of course, $36Bn is a huge sum, and most certainly a representative sample. These numbers gave me pause.
My second reflex is to locate the original source - the cited text itself. On this occasion, though, as on many others, the original article is nowhere to be found. The "Joint Aerospace Weapons Systems Support, Sensors, and Simulation Symposium" or "JAWS S3" seems to have been "a thing", as the saying goes - but Google appears to have no trace of it.
Most people would give up there. As I said above, Leprechaun hunting is a matter of attitude. All you have to do is not give up, ever.
Exercise (optional): find the email address of one of the people citing Jarzombek; email them, politely and courteously asking whether they still have a paper or electronic copy and would mind sending you a PDF or a scan. This is an advanced practice. I have done this on several occasions, instead of giving up. I haven't done it in this particular case, but you might be enlightened by the responses (or lack thereof).
Stemmatics to the rescue: my third reflex is to locate an important part of the claim and see if I can find an occurrence of that using Google.
Zeroing in on the pie chart in the CrossTalk article, I tried searching on the categories: for instance "Software used, but extensively reworked or abandoned" sounded promising.
Why? Because this phrase is a complex disjunction, unlikely to be independently reinvented by two authors. The expression "software used after changes", by comparison, could turn up in many places.
Exercise: try this Google search for yourself.
Unfortunately, searching for the full phrase turns up nothing new: the 2002 CrossTalk article, plus two later (2007 and 2009) copies.
Now can we finally give up? Of course not. We forge on. Stemmatics again: the original text is likely to have been somewhat distorted. This particular phrase still sounds promising, but we can try to take a guess at what it might have been distorted from.
Exercise: come up with your own variants on the phrase and Google them. Note your results.
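One way to be systematic about this: generate the variants mechanically, by dropping one word at a time and swapping in near-synonyms, then Google each result as a quoted phrase. The sketch below is my own toy approach, not a method the tutorial prescribes, and the synonym list is a guess.

```python
# A toy variant generator for quoted-phrase searches: strip punctuation,
# drop each word in turn, and substitute listed near-synonyms one at a time.
def variants(phrase, synonyms=None):
    synonyms = synonyms or {}
    words = phrase.replace(",", "").split()
    out = {" ".join(words)}
    # Dropping a word models a copyist omitting one ("or abandoned", etc.).
    for i in range(len(words)):
        out.add(" ".join(words[:i] + words[i + 1:]))
    # Substituting a near-synonym models a copyist paraphrasing.
    for i, word in enumerate(words):
        for alt in synonyms.get(word, []):
            out.add(" ".join(words[:i] + [alt] + words[i + 1:]))
    return sorted(out)

for v in variants(
    "software used, but extensively reworked or abandoned",
    synonyms={"abandoned": ["later abandoned"], "reworked": ["rewritten"]},
):
    print(f'"{v}"')  # paste each line, quotes included, into Google
```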
My next move was to try dropping the "or abandoned". What I found filled me with shock and perverse joy. Googling for "software used but extensively reworked" leads among other things to this: http://bit.ly/XQWNCv
This is a 1979 report by the "Comptroller General of the United States", who heads the General Accounting Office (GAO). It is a very official document, hosted on a legal-documents website, and quite unlikely to be a fake.
Page 11 is the smoking gun: a pie chart looking very much like the one in the 2002 article. The labels for the various categories match up almost exactly:
- "Software that could be used after changes" vs "Software used after changes"
- "Software that could be used as delivered" vs "Software used as delivered"
- "Software delivered but never successfully used" vs "Software delivered, but not successfully used"
- "Software used but extensively reworked or later abandoned" vs "Software used, but extensively reworked or abandoned"
- "Software delivered but never successfully used" - identical.
- "Software paid for but not delivered" - identical.
Stemmatics strongly suggests, even at this stage, that the (so far unseen) 1995-1999 text is a mutated copy of the 1979 text.
Exercise: estimate the probability that the 1995-1999 text was independently rewritten by a different author.
What's even more interesting is the numbers. The GAO data was obtained as follows:
We examined nine cases in detail, which included visits to both agency and contractor sites, examining documents, and interviewing those persons involved who could still be reached.
The total dollar amount of the projects involved was $6.8M - more than three orders of magnitude smaller than the $36Bn claimed for the 1995 study, even after adjusting for inflation.
Yet the percentages match up almost exactly, as the quick check after this list confirms:
- $119,000 out of $6.8M is 2%
- $198,000 out of $6.8M is 3%
- $3.2M out of $6.8M is 47% (compared to 46%)
- $1.95M out of $6.8M is 29%
- $1.3M out of $6.8M is 19% (compared to 20%)
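The arithmetic is easy to reproduce in a few lines of Python. The pairing of dollar amounts with category labels below follows my reading of the GAO pie chart, so treat it as an assumption.

```python
# Recomputing the 1979 GAO percentages from the report's dollar figures.
# The label-to-amount pairing is my reading of the GAO pie chart.
total = 6_800_000
amounts = {
    "used as delivered": 119_000,
    "could be used after changes": 198_000,
    "delivered but never successfully used": 3_200_000,
    "paid for but not delivered": 1_950_000,
    "used but extensively reworked or later abandoned": 1_300_000,
}
for label, dollars in amounts.items():
    print(f"{label}: {dollars / total:.0%}")
# Prints 2%, 3%, 47%, 29%, 19% - matching the list above.
```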
Again, a quick probabilistic assessment: it is virtually impossible for a 1995 study of thirty-odd billion dollars' worth of projects to turn up the exact same numbers (within 1% tolerance) as a 1979 study of seven million dollars' worth, totally independently - and also, by coincidence, use the exact same categories to classify projects. (Some time after writing the preceding paragraph, I wrote a Python simulation, based on reasonable assumptions, to quantify "virtually impossible": such a coincidence would happen in fewer than one in a million random runs.)
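The simulation itself isn't shown, but it's easy to imagine what one might look like. Here is a minimal sketch under assumptions of my own choosing: each simulated "study" splits its budget into five shares drawn uniformly from the simplex, and we count how often all five land within one percentage point of the 1979 figures. The actual model behind the one-in-a-million figure may well differ.

```python
# A minimal sketch of the kind of simulation described above. The original
# code and assumptions aren't shown; drawing shares uniformly over the
# simplex (Dirichlet(1, ..., 1)) is my own modeling choice.
import random

GAO_1979 = [0.02, 0.03, 0.47, 0.29, 0.19]  # the five 1979 proportions
TOLERANCE = 0.01                            # "within 1% tolerance"

def random_shares(n=5):
    """One random split of a budget into n shares, uniform over the
    simplex, via the spacings between sorted uniform cut points."""
    cuts = sorted(random.random() for _ in range(n - 1))
    points = [0.0] + cuts + [1.0]
    return [b - a for a, b in zip(points, points[1:])]

def matches(shares):
    # Category-for-category comparison, i.e. same ordering of categories.
    return all(abs(s - g) <= TOLERANCE for s, g in zip(shares, GAO_1979))

RUNS = 10_000_000  # takes a few minutes in pure Python
hits = sum(matches(random_shares()) for _ in range(RUNS))
print(f"{hits} hits in {RUNS:,} runs (rate {hits / RUNS:.2e})")
```

Note that this models only the numbers matching; an independent study would also have to hit on the same five category labels, which makes the coincidence rarer still.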
Conclusion? Even though I have never seen, and probably will never see, the Jarzombek "study", I know it cannot be true.
This doesn't mean, by the way, that I'm accusing Stanley Jarzombek of making stuff up: I couldn't say anything definite on that until I saw the original text. It seems at least equally plausible that someone who actually read Jarzombek made a copying mistake, and somehow confused a 1995 survey on billions of dollars' worth of projects with the 1979 survey, perhaps cited by Jarzombek.
What matters is that the Jarzombek citation is a Leprechaun - totally bogus. Another one bites the dust.