Laurent Bossavit
905 followers
Laurent's posts

Post has shared content
Ruby Bloom recently posted about the significance of +Eliezer Yudkowsky's Less Wrong Sequences on his thinking. I felt compelled to do the same.

Several people have explicitly told me that I'm one of the most rational people they know. I can also think of at least one case where I was complimented by someone who was politically "my sworn enemy", who said something along the lines of "I do grant that your arguments for your position are good, it's just everyone else on your side...", which I take as some evidence of me being able to maintain at least some semblance of sanity even when talking about politics.

(Seeing what I've written above, I cringe a little, since "I'm so rational" sounds so much like an over-the-top, arrogant boast. I certainly have plenty of my own biases, as does everyone who is human. Imagining yourself to be perfectly rational is a pretty good way of ensuring that you won't be, so I'd never claim to be exceptional based only on my self-judgment. But this is what several people have explicitly told me, independently of each other, sometimes staking part of their own reputation on it by saying so in public.)

However.

Before reading the Sequences, I was very definitely not that. I was what the Sequences would call "a clever arguer" - someone who was good at coming up with arguments for their own favored position, and didn't really feel all that compelled to care about the truth.

The single biggest impact of the Sequences that I can think of is this: before reading them, and Eliezer's other writings, I didn't really think that beliefs had to be supported by evidence.

Sure, on some level I acknowledged that you can't just believe anything you can find a clever argument for. But I do also remember thinking something like "yeah, I know that everyone thinks that their position is the correct one just because it's theirs, but at the same time I just know that my position is correct just because it's mine, and everyone else having that certainty for contradictory beliefs doesn't change that, you know?".

This wasn't a reductio ad absurdum, it was my genuine position. I had a clear emotional certainty of being right about something, a certainty which wasn't really supported by any evidence and which didn't need to be. The feeling of certainty was enough by itself; the only thing that mattered was finding the evidence to (selectively) present to others in order to persuade them. Which it likely wouldn't, since they'd have their own feelings of certainty, similarly blind to most evidence. But they might at least be forced to concede the argument in public.

It was the Sequences that first changed that. It was reading them that made me actually realize, on an emotional level, that correct beliefs actually required evidence. That this wasn't just a game of social convention, but a law of the universe as iron-clad as the laws of physics. That if I caught myself arguing for a position where I was making arguments that I knew to be weak, the correct thing to do wasn't to hope that my opponents wouldn't spot the weaknesses, but rather to just abandon those weak arguments myself. And then to question whether I even should believe that position, having realized that my arguments were weak.

I can't say that the Sequences alone were enough to take me all the way to where I am now. But they made me more receptive to other people pointing out when I was biased, or incorrect. More humble, more willing to take differing positions into account. And as people pointed out more problems in my thinking, I gradually learned to correct some of those problems, internalizing the feedback.

Again, I don't want to claim that I'd be entirely rational. That'd just be stupid. But to the extent that I'm more rational than average, it all got started with the Sequences.

Ruby wrote:

> I was thinking through some challenges and I noticed the sheer density of rationality concepts taught in the Sequences which I was using: "motivated cognition", "reversed stupidity is not intelligence", "don't waste energy on thoughts which won't have been useful in universes where you win" (possibly not in the Sequences), "condition on all the evidence you have". These are fundamental concepts, core lessons which shape my thinking constantly. I am a better reasoner, a clearer thinker, and I get closer to the truth because of the Sequences. In my gut, I feel like the version of me who never read the Sequences is epistemically equivalent to a crystal-toting anti-vaxxer (probably not true, but that's how it feels) who I'd struggle to have a conversation with.

> And my mind still boggles that the Sequences were written by a single person. A single person is responsible for so much of how I think, the concepts I employ, how I view the world and try to affect it. If this seems scary, realise that I'd much rather have my thinking shaped by one sane person than a dozen mad ones. In fact, it's more scary to think that had Eliezer not written the Sequences, I might be that anti-vaxxer equivalent version of me.

I feel very similarly. I have slightly more difficulty pointing to specific concepts from the Sequences that I employ in my daily thinking, because they've become so deeply integrated into my thinking that I'm no longer explicitly aware of them; but I do remember a period in which they were still in the process of being integrated, and when I explicitly noticed myself using them.

Thank you, Eliezer.

(There's a collected and edited version of the Sequences available in ebook form: https://smile.amazon.com/Rationality-AI-Zombies-Eliezer-Yudkowsky-ebook/dp/B00ULP6EW2/ . I would recommend trying to read it one article at a time, one per day: that's how I originally read the Sequences, one article a day as they were being written. That way, they would gradually seep their way into my thoughts over an extended period of time, letting me apply them in various situations. I wouldn't expect just binge-reading the book in one go to have the same impact, even though it would likely still be of some use.)

Post has attachment
Degrees of intellectual dishonesty

In the previous post, I said something along the lines of wanting to crawl into a hole when I encounter bullshit masquerading as empirical support for a claim, such as "defects cost more to fix the later you fix them".

It's fair to wonder why I should feel shame for my profession, and just whom I feel ashamed for. So let's drill a little deeper, and dig into cases.

Before we do that, a disclaimer: I am not in the habit of judging people. In what follows, I only mean to condemn behaviours. Also, I gathered most of the examples by random selection from the larger results of a Google search. I'm not picking on anyone in particular.

The originator of this most recent Leprechaun is Roger S Pressman, author of the 1987 book "Software Engineering: a Practitioner's Approach", now in its 8th edition and being sold as "the world's leading textbook in software engineering".

Here is in extenso the relevant passage (I quote from the 5th edition, but have no reason to think it changed in any way from the 1st.)

To illustrate the cost impact of early error detection, we consider a series of relative costs that are based on actual cost data collected for large software projects [IBM81]. Assume that an error uncovered during design will cost 1.0 monetary unit to correct. Relative to this cost, the same error uncovered just before testing commences will cost 6.5 units; during testing, 15 units; and after release, between 60 and 100 units.

This [IBM81] is expanded, in the References section of the book, into a citation: “Implementing Software Inspections,” course notes, IBM Systems Sciences Institute, IBM Corporation, 1981.

Am I embarrassed for Pressman, that is, do I think he's being intellectually dishonest? Yes, but at worst mildly so.

It's bothersome that for the first edition Pressman had no better source to point to than "course notes" - that is, material presented in a commercial training course, and as such not part of the "constitutive forum" of the software engineering discipline.

We can't be very harsh on 1987-Pressman, as software engineering was back then a discipline in its infancy; but it becomes increasingly problematic as edition after edition of this "bible" lets the claim stand without increasing the quality of the backing.

Moving on, consider this 1995 article: http://sci-hub.cc/10.1007/BF00402646

"Costs and benefits of early defect detection: experiences from developing client server and host applications", Van Megen et al.

This article doesn't refer to the cost increase factors. It says only this:

"To analyse the costs of early and late defect removal one has to consider the meaning and effect of late detection. IBM developed a defect amplification model (IBM, 1981)."

The citation is as follows:

"IBM (1981) Implementing Software Inspections, course notes (IBM Systems Sciences Institute, IBM Corporation) (summarised in Pressman 1992.)"

This is the exact same citation as Pressman's, with the added "back link" to the intermediate source. The "chain of data custody" is intact. I give Van Megen et al. a complete pass as far as their use of Pressman is concerned.

Let's look at a blog post by my colleague Johanna Rothman: http://www.jrothman.com/articles/2000/10/what-does-it-cost-you-to-fix-a-defect-and-why-should-you-care/

Johanna refers, quite honestly, to "hypothetical examples". This means "I made up this data", and she's being up front about it. She says:

"According to Pressman, the expected cost to fix defects increases during the product’s lifecycle. [...] even though the cost ratios don’t match the generally accepted ratios according to Pressman, one trend is clear: The later in the project you fix the defects, the more it costs to fix the defects."

I'm almost totally OK with that. It bothers me a bit that one would say "one trend is clear" about data that was just made up; we could have made the trend go the other way, too. But the article is fairly clear that we are looking at a hypothetical example based on data that only has a "theoretical" basis.

The citation:

Pressman, Roger S., Software Engineering, A Practitioner’s Approach, 3rd Edition, McGraw Hill, New York, 1992. p.559.

This is fine. It's a complete citation with page number, still rather easy to check.

I am starting to feel queasy with this 2007 StickyMinds article by Joe Marasco: https://www.stickyminds.com/article/what-cost-requirement-error

"The cost to fix a software defect varies according to how far along you are in the cycle, according to authors Roger S. Pressman and Robert B. Grady. These costs are presented in a relative manner, as shown in figure 1."

What Grady? Who's that? Exactly what work is being cited here? There's no way to tell, because no citation is given. Also, the data is presented as fact, and a chart ("Figure 1") is provided that was not present in the original.

This is shady. Not quite outright dishonest, but I'd be hard pressed to describe it more generously than as "inaccurate and misleading".

A different kind of shady is this paper by April Ritscher at Microsoft. http://www.uploads.pnsqc.org/2010/papers/Ritscher_Incorporating_User_Scenarios_in_Test_Design.pdf

The problem here is a (relatively mild) case of plagiarism. The words "the cost to fix software defects varies according to how far along you are in the cycle" are lifted straight from the Marasco article, with the "according to" clause in a different order. But the article doesn't give Marasco credit for those words.

There's also the distinct possibility that Ritscher never actually read "Pressman and Grady". Do I have proof of that? No, but it is a theorem of sorts that you can figure out the lineage of texts by "commonality of error". If you copy an accurate citation without having read the original, nobody's the wiser. But why would you go to the trouble of reproducing the same mistake that some random person made if you had actually read the original source?

So we're entering the domain of intellectual laziness here. (Again, to stave off the Fundamental Attribution Error: I am not calling the person intellectually lazy; I am judging the behaviour. The most industrious among us get intellectually lazy on occasion, that's why the profession of tester exists.)

Next is this 2008 article by Mukesh Soni: https://www.isixsigma.com/industries/software-it/defect-prevention-reducing-costs-and-enhancing-quality/

"The Systems Sciences Institute at IBM has reported that the cost to fix an error found after product release was four to five times as much as one uncovered during design, and up to 100 times more than one identified in the maintenance phase (Figure 1)."

We find the same level of deceit in a 2008 thesis, "A Model and Implementation of a Security Plug-in for the Software Life Cycle" by Shanai Ardi. http://www.diva-portal.org/smash/get/diva2:17553/FULLTEXT01.pdf

"According to IBM Systems Science Institute, fixing software defects in the testing and maintenance phases of software development increases the cost by factors of 15 and 60, respectively, compared to the cost of fixing them during design phase [50]."

The citation is missing, but that's not really what's important here. We've crossed over into the land of bullshit. Both authors presumably found the claim in the same place everyone else found it: Pressman. (If you're tempted to argue "they might have found it somewhere else", you're forgetting my earlier point about "commonality of error". The only thing the "IBM Systems Science Institute" is known for is Pressman quoting them; it was a training outfit that stopped doing business under that name in the late 1970's.)

But instead of attributing the claim to "IBM, as summarized by Pressman", which is only drawing attention to the weakness of the chain of data custody in the first place, it sounds a lot more authoritative to delete the middle link.

I could go on and on, so instead I'll stop at one which I think takes the cake: "ZDLC for the Early Stages of the Software Development Life Cycle", 2014: http://sci-hub.cc/10.1109/DCABES.2014.5#

"In 2001, Boehm and Basili claimed that the cost of fixing a software defect in a production environment can be as high as 100 times the cost of fixing the same defect in the requirements phase. In 2009, researchers at the IBM Systems Science Institute state that the ratio is more likely to be 200 to 1 [7], as shown in Figure 2".

The entire sentence starting "In 2009" is a layer cake of fabrication upon mendacity upon confabulation, but it gets worse with the citation.

Citation [7] is this: "Reducing rework through effective requirements management", a 2009 white paper from IBM Rational. Available here: http://www.edn.com/Pdf/ViewPdf?contentItemId=4210043

Yes, at the century scale IBM Rational is a contemporary of the defunct IBM Systems Science Institute, but that's a little like attributing a Victor Hugo quote to Napoleon.

While Figure 2 comes straight out of the IBM paper, the reference to "IBM Systems Science Institute" comes out of thin air. And in any case the data does not come from "researchers at IBM", since the IBM paper attributes the data to Boehm and Papaccio's classic paper "Understanding and Controlling Software Costs", which was published not in 2009 but in 1988. (Both of them worked at Defense consultancy TRW.)

We've left mere "bullshit" some miles behind here. This isn't a blog post, this is an official peer-reviewed conference with proceedings published by the IEEE, and yet right on the first page we run into stuff that a competent reviewer would have red-flagged several times. (I'm glad I let my IEEE membership lapse a while ago.)

Garden-variety plagiarism and bullshit (of which we are not in short supply) make me feel icky about being associated with "software engineering", but I want to distance myself from that last kind of stuff as strongly as I possibly can. I cannot be content to merely ignore academic software engineering, as most software developers do anyway; I believe I have an active duty to disavow it.

Post has attachment
This is just how embarrassed I am for my entire profession

So here I was idly looking at Twitter, when Scott Nickell innocently poked me, regarding one more instance of the old "cost of defects" chestnut:

"I can't tell if this "Systems Sciences Institute at IBM" thing is a new study, or just the same-old." https://dzone.com/articles/the-cost-of-poor-software-quality-infographic

I was feeling lazy, so I encouraged Scott to apply the usual Leprechaun hunting process: "Here's how you could tell: Google exact phrase for a portion of the article citing it, then note the publication dates of hits." (Try it yourself: https://www.google.com/search?q=%22cost+to+fix+an+error+found+after+product+release%22&oq=%22cost+to+fix+an+error+found+after+product+release%22&aqs=chrome..69i57.7165j0j4&sourceid=chrome&ie=UTF-8)
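
If you want to reproduce that step yourself, here is a minimal sketch (Python and the exact phrase are my choices for illustration, not part of the exchange with Scott): it simply rebuilds the exact-phrase query URL linked above, so you can eyeball the publication dates of the hits.

    # Minimal sketch of the first "Leprechaun hunting" step: build an
    # exact-phrase Google query for a suspicious sentence, then inspect
    # the dates of the results by hand.
    import urllib.parse

    phrase = "cost to fix an error found after product release"
    url = "https://www.google.com/search?q=" + urllib.parse.quote_plus(f'"{phrase}"')
    print(url)  # open this in a browser and note the publication dates of the hits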

Scott replied after a few minutes: "Well, at a quick glance, I traced it as far as a blog post from about 2008. That's enough to make me confident it's nothing new."

But somehow I felt we shouldn't stop there. I laid off Twitter for a moment and had a quick look at Google Scholar: https://scholar.google.fr/scholar?hl=en&q=%22Systems+Sciences+Institute%22+cost+fixing+defects&btnG=&as_sdt=1%2C5&as_sdtp=

Notice anything? Strangely enough, the "Systems Science Institute" is only ever cited for one "result": the aforementioned bogus numbers about the cost of defects.

My curiosity piqued, I tried looking for any contemporary evidence of the existence of this "Systems Science Institute" at IBM, and could find none. The IBM web site's search box returns zero hits for that name, for instance.

I was eventually able to track down, in a 2009 obituary for the IBM Systems Journal, some evidence for the existence of something called "Systems Research Institute" at IBM: http://smartphonestechnologyandbusinessapps.blogspot.fr/2009/06/rip-ibm-systems-journal-1962-2009.html

Meanwhile, Scott helpfully prodded me into looking at result #5 on the Google Scholar list, which mentions those results as being "summarized in Pressman 1992". I know that book - I've run into it a lot, so I own an ebook copy now: "Software Engineering, a Practitioner's Approach".

Looking it up, Pressman cites IBM as follows: "_Implementing Software Inspections._ course notes, IBM Systems Sciences Institute, IBM Corporation, 1981"

Wait a minute: course notes?

What's worse, here's how Pressman introduces the data on cost of defects (emphasis mine): "To illustrate the cost impact of early error detection, we consider a series of relative costs that are based on actual cost data collected for large software projects [IBM81]."

Pressman adds in a footnote: "Although these data are more than 20 years old, they remain applicable in a modern context." Apparently many people in 2016 still believe, with Pressman, that 35-year-old data remain relevant to a context that has seen such upheavals as the personal computer and the Internet.

But the thing that sticks with me is "course notes". This is essentially an admission that this so-called data was recalled from memory (and quite possibly poorly recalled, as the Systems Research/Sciences approximation suggests).

So here we have the telephone game again - some IBM instructor gave a course in 1981, Pressman wrote up numbers "based on" the numbers from that course a few years later, everyone else quoted Pressman as gospel and most of them deleted the somewhat inconvenient "course notes". It became "a report from IBM", and appears as such for instance in the book "Agile Testing" by my colleagues Lisa Crispin and Janet Gregory.

Citing "reports" from non-existent "institutes" isn't even the worst offense to common sense committed on a routine basis in my profession - it's just the latest example to make me want to crawl into a hole.

Not for the first time, I get this feeling that everyone in this profession is making it up as they go along, and the entire edifice of "software engineering" (as a supposed academic discipline) is the Emperor's brand new clothes.

Maybe we all need to become little kids again before it can get any better?

But talented programmers DO exist!

Below is my reply to a reader of Leprechauns, who said they liked the book but thought I was in the wrong on "10x programmers" - they'd actually met one.

It would be silly to deny the existence of talent. And it would be just as silly to lump the world into such broad categories that we couldn't distinguish between concepts as widely separated as "talent" on the one hand, and "productivity" on the other.

Some people are talented. They approach their art with a style which is uniquely and recognizably theirs; part of the trace they leave upon the world is that their art is forever changed after them; everything that follows gets compared to what they did.

Some people are "productive", in the vulgar sense of there being many works attributed to them. (We may prefer the word "prolific" here.)

Some people are talented but not productive: Kubrick comes to mind. Some are productive, and can be called talented, but not everything they did shows the same talent: I'd put Woody Allen in that category. Few shine both long and bright.

There are programmers who are both talented in the above sense and "productive" in the vulgar sense that many works can be attributed to them. Fabrice Bellard is one example. (Perhaps not all shine as bright as the talented people we can name in other arts, possibly because programming is still only on its way to becoming a major art: few people study the works of Fabrice Bellard the way people study the works of Mozart. Few people, alas, study the work of any programmer - perhaps least of all programmers themselves.)

With all of the above I have no problem.

Where I start having a problem is when the above senses of "talented" or "productive" become lumped in with a second sense of "productive": the sense in which you can measure the productivity of industrial apparatus, or of industrial systems in whole or in part, as in the phrase "the productivity of a worker". We have to decide what we are talking about - industrial economics, or the works of creative individuals.

It would be silly to say that Kubrick is 10x or 2x or 0.5x the filmmaker that Allen is. This is not the sense of "productive" that lends itself to comparison on a numerical scale.

Every time someone points to a "study" supposedly supporting the concept of highly productive programmers, they turn out to be supporting a notion of measuring some equivalent of the number of lines of code written per unit time; that is, the narrowly economic sense of "productivity". This might be a valid construct, but it should not be lumped in together with the other sense in which some talented individuals are "productive" - that is, "prolific".

And lumping them together is precisely what "10x programmer" discourse encourages. It presupposes that you can hire a talented programmer to work on what you want done, and that they will turn out ten times the "amount of work" (fungible work, not individual works) that a run-of-the-mill programmer will.

This is silly, because these talented programmers, if you ask them to work on your thing, will tell you what Kubrick or Allen would have said if you'd asked them to produce a movie on commission. They would have told you, perhaps even politely, to stuff it.

Further, the "10x programmer" concept presupposes that the production of one can be compared to the production of another, on a single scale, in precisely the sense that Kubrick and Allen's works cannot be compared.

This is silly, because a program is not a bunch of lines of code cranked out, machine-like; it is a socio-technical object existing within a broader context. To be valuable it must be used, to be used it must be distributed, users somehow trained, and so on. You can no more numerically compare the contribution of different programmers to different programs than you can numerically compare Nicole Kidman's "productivity" in Eyes Wide Shut to Scarlett Johansson's in Scoop.

I hope this clarifies why I do not feel that acknowledging the existence of talented or prolific individuals is incompatible with my critique of the concept of "10x programmer", and the mythology that has grown around that concept.

I don't feel that dismantling that mythology belittles the work of talented programmers; my inclination would be to magnify that work - by highlighting their creative individuality.

Post has attachment
Forecasting the Future of Employment

(Follow-up on https://plus.google.com/u/1/+LaurentBossavit/posts/is8vMdyXbuU)

So how would a superforecaster think about issues like the risk of job loss to computerisation?

The first thing I would do is fix the timeframe and pin down the exact meaning of the claim. The objective is to remove ambiguity, at the cost of accepting that the resulting question may no longer be exactly what we started with.

The Oxford study computes a .99 probability that "Telemarketers" will be replaced by automated technology. "That sounds plausible," you might be thinking. After all, robocalling is on the rise, and seemingly inexorable.

As we saw in the previous instalment, the first thing we need to do is specify a time frame. Instead of "the next decade or two", let's go with "by 2025". Instead of the vague "replaced by technology", let's stipulate that the question will be answered Yes if and only if a reliable source indicates a tenfold reduction in that part of the workforce.

(This is generous towards the Oxford study. Jobs disappear for reasons other than automation, such as going to cheaper countries, and we would count those as a win for the automation study. That is one of the ways in which we'll accept a slightly different question than we started out with for the sake of precision.)

We will even pin down what reliable source. Since the study relies on BLS statistics, we'll use the BLS page: http://www.bls.gov/oes/current/oes419041.htm

So, our revised claim is:

There is a 99% probability that by 2025, the BLS will report fewer than 23,452 people employed in the "Telemarketer" category.

What do we mean exactly by "99% probability"? It means that out of every 100 judgements you express at that level of confidence, you expect to be wrong about once, on average.

Let's put it another way. If the BLS reports more than 23K telemarketers in the US in 2025, you will pay me $100. If the BLS reports fewer (or stops reporting the category altogether), I will pay you $1 and one cent. (All sums adjusted for inflation.)

Mathematically, the expected value of this bet is essentially zero - if your 99% is correct. If you are estimating the probability conservatively (rounding down from 99.9%, say) then this is a winning bet for you. If you are overestimating the probability, then this is a good bet for me.
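
For anyone who wants to check that arithmetic, here it is written out as a back-of-the-envelope sketch of mine, using the dollar amounts from the bet above:

    # Expected value of the bet, seen from the side of the person who
    # accepts the study's 99% figure (amounts as stated above).
    p = 0.99                  # claimed probability that the telemarketer count collapses
    win, lose = 1.01, 100.0   # receive $1.01 if it does, pay $100 if it doesn't
    expected_value = p * win - (1 - p) * lose
    print(expected_value)     # ~ -0.0001, i.e. essentially zero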

Would you bet $100 to $1 that the number of "Insurance Underwriters", estimated by the Oxford study to be at 99% risk of being automated away, will go down to about ten thousand from today's count of 106,300 by 2025? The BLS itself projects an outlook of a 6% decrease by 2022; you would be betting against the BLS, which presumably knows what it's talking about.

A 99% probability in this kind of domain strikes me as waaaaay overconfident. There are a dozen of these listed below; this implies a roughly 90% probability (.99 raised to the 12th power, about 0.89 - a quick arithmetic check follows the list) that all of them will have been lost to automation by 2025. This means you should be willing to take a bet to pay me $10K if any of these jobs are still shown by the BLS to employ more than 10% of the current counts, and I'll pay you a round $1000 if and only if all of these jobs are gone:

- Data Entry Keyers
- Library Technicians
- New Accounts Clerks
- Photographic Process Workers and Processing Machine Operators
- Tax Preparers
- Cargo and Freight Agents
- Watch Repairers
- Insurance Underwriters
- Mathematical Technicians
- Sewers, Hand
- Title Examiners, Abstractors, and Searchers
- Telemarketers
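
Here is the arithmetic check promised above - a back-of-the-envelope calculation of mine, treating the study's 99% figures as independent (a simplifying assumption of my own, not something the study claims):

    # Joint probability that all twelve listed occupations are automated away,
    # if each is independently 99% likely to be.
    p_each = 0.99
    n_occupations = 12
    p_all_gone = p_each ** n_occupations
    print(round(p_all_gone, 3))   # ~0.886, roughly a nine-in-ten chance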

Would you take that bet?

Would you still take that bet, assuming you said "yes" the first time around, after I told you that hand sewers ("Sewers, Hand") saw their numbers only halved between 2005 and 2015?

(If I were making an actual forecast, I'd certainly look at this kind of trend - I'd take the assumption that the next decade is likely to be much like the past decade as a starting point, and adjust according to current information. It's kind of weird that the authors of the Oxford study didn't even mention this kind of simple cross-check against their algorithmic model, as far as I can tell.)

I'm willing to put my money where my mouth is, by the way. I'm not a gambler, or willing to keep track of a bunch of bets, so I'd only take the bet once... but I would take it to show I'm serious about this kind of thing.

Post has attachment
The Future of Employment?

Here's a study which has been making the rounds recently. It seems destined for Leprechaun status.

Its claimed bottom line: "47% of US employment is at risk of computerisation in the next two decades".

So, earlier today, when I saw yet another tweet about this without qualifying language or critical examination, I pushed back. This was met with almost a textbook case of the Leprechaun Objection: "It was the "best" [study] we saw, but would love to hear of better ones! Any references?"

The usual answer applies: there surely is a "best" study out there on the ecology of leprechauns. But leprechauns still don't exist.

Before I go into some specific criticism of the study - or more accurately, of how the abstract of the study, and hence the news, frame its conclusions - I would like you to pause and think for a few minutes about two questions.

First, what does it mean to you that a given job is "at risk of being computerised in the next few decades"?

Second, if it was up to you to measure the probability that a given job would be computerised in that time frame, how would you go about it?

The latter question is a matter of forecasting. This is the topic of Phil Tetlock's latest book, Superforecasting (warmly recommended). The "super" in the title refers to the fact that people can be trained, apparently, to be much better at forecasting (accurately assessing the probability of specific future events) than the rest of the population. Also, some personality traits seem to predict "super" forecasting skills.

I happened to be among the top 2% of the participants in Tetlock's studies, hence a "superforecaster". I mention all this to establish that I know a thing or two about forecasting, and spent some time thinking seriously about "probabilities" and what the word means.

Now, back to the Oxford study.

The "two decades" thing is largely made up. The study is about occupations that "are potentially automatable over some unspecified number of years". The text goes on to say "perhaps a decade or two", but only by way of illustrating this vague timeframe. It could also be a century or two.

This is one of the things I learned about forecasting - unless you're specifying a well-defined time frame, it's close to impossible to assess the accuracy of forecasts.

But now the meat of the thing - what they mean by "probability". It turns out that the study didn't measure probability at all. The study was based on a set of subjective, binary assignments by the researchers of whether a job was "computerisable". They coded these jobs as 0 (can't be computerised) or 1 (is certain to be computerised).

But the study also admits: "We thus acknowledge that it is by no means certain that a job is computerisable given our labelling." So... the study coded as 0 and 1 judgments that were both subjective and uncertain. (This breaks yet another tenet of good forecasting: there are no blacks and whites - no 0s and 1s - but only shades of gray.)
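
To make the "shades of gray" point concrete, here is a small illustration of my own (not anything the study does): under a proper scoring rule such as the Brier score, a forecaster who turns out to be right 70% of the time scores noticeably better by reporting 0.7 than by rounding up to a hard 1.

    # Why coding uncertain judgements as hard 0s and 1s is a bad forecasting
    # habit: lower Brier score means a better-calibrated forecast.
    def brier(forecast, outcome):
        return (forecast - outcome) ** 2

    outcomes = [1] * 7 + [0] * 3               # right 7 times out of 10 comparable calls
    hard = [brier(1.0, o) for o in outcomes]   # "certain to be computerisable"
    soft = [brier(0.7, o) for o in outcomes]   # reporting the honest 70% instead
    print(sum(hard) / len(hard))   # 0.30
    print(sum(soft) / len(soft))   # 0.21 - the honest probability wins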

The study started from subjective assessments of the research team over a small sample of occupations, and asked whether these assessments correlated in any way with "official" characteristics of the jobs in question, such as the job's requirements for manual, cognitive or social skills.

The study authors call these characteristics "objective" because they were given to them by the Bureau of Labor Statistics. It would be more honest to say that these characteristics were also "subjective" - just tainted by someone else's subjectivity rather than the researchers' own.

What the study did apparently demonstrate (I might yet look more closely into that part, the math-heavy part) is that the subjective assessments correlated rather well with the job characteristics; that is, once you know how much a given job relies on manual, social and cognitive skills respectively, you can reasonably well predict whether the researchers will think it is computerisable.

To which my reaction is: "Well, d'oh!".

The term "computerisable" reflects the anxieties of the age. We have seen jobs disappear and others be created. We also have our subjective but socially informed notions of what jobs require what kinds of skills. It's interesting, but not overly surprising, that these two sets of prejudices match up with each other.

But really, we haven't learned much about "what will happen in the next decade or two". And the study's predictions are so vague that I forecast a very low probability that they will ever be properly tested. To a very high likelihood (take it from a superforecaster) they will remain empty punditry.

Post has attachment
Musings on the Cone of Uncertainty

A comment on a previous post objects to my comparing the weather version of the "cone of uncertainty" to the well-known one in software development that I've attempted to debunk for a while: "As well-managed projects progress, the number of variables is reduced [...] weather is a poor analogy [because] weather systems are full of uncontrolled, poorly understood variables, without progression towards a 'finish point'".

Does the notion of a "well-managed project" have any kind of predictive validity? Or is it something we assess after the fact?

It's easy to observe an effort that has little residual uncertainty, for instance because it's shipped to production or has become a commercial success, then turn around and see in its past a "well-managed project" or a nicely shaped cone of uncertainty. But this might well be due to survival bias and selective attention.

The question I'm asking is: if we attempt to draw cones while a project is ongoing, are there any project characteristics that let us reliably anticipate a steady narrowing of the uncertainties?

Software projects also are "full of uncontrolled, poorly understood variables". We call them "people". Each of these people has his or her own "finish point" that they are striving for, and they're often poorly aligned.

Irrespective of how well the analogy with the weather holds up, the weather cone is at least drawn in the direction that makes more sense to me: with the experienced present as a known point and the uncertain future as a range of possibilities.

The way the Boehm-McConnell cone is drawn has narrative fallacy written all over it. Its future finish point is really someone's present, when the project is delivered successfully and they look back at what a wild ride it has been. It's always going to look like a cone because the further back into their own past they look, the harder it was back then to imagine reaching this particular "finish point".

Two years ago, I could not possibly have imagined that I would end up, today, working for the French government helping them transform project management practice towards Agile. Even a few months ago the prospect felt like a weird gamble. Yet here I am, doing that. My own sense of purpose dictates that I construct some kind of retrospective consistency: I must yield to the temptation of reinterpreting the past few years as inexorably leading up to that point in my life.

Tempting and even useful as that view is, it's still false. That's what the Cone feels like, to me.

Post has attachment
Destroying the entire US economy

I've been trying to wrap my head around a concept in economics, and you should know that I've no background in economics. Like, at all. My econ teacher back in high school would have been unanimously voted Worst Teacher if we'd had an election; he was so bad we skipped his classes with total impunity.

Anyway, my question was: when you hear "X costs the economy N billions of dollars per year", what specifically do you take that to mean? X is variously given as "disengaged employees", "preventable heart disease", "software bugs", and so on. It's entirely possible that claims of that sort make sense for some X, and not for others.

Does it mean, for instance, "in the absence of X there would be Y $Bn more wealth to share around"? That doesn't quite compute for me, because (in some of the cases I gave, such as software bugs) those Y billions are salaries or fees paid out to people, so are in the economy.

Someone on Twitter suggested it means "people/companies/the gov't spend Y but it doesn't produce useful returns". But then what specifically is meant by not saying that, and saying instead it's "a cost to the economy"?

Alternately, can anyone provide an example of an X that was eliminated and we were able to measure the costs of X recovered to the economy?

Being who I am, my hunch was that "X costs the economy Y" is actually a snowclone, meaning "X is bad" for any value of Y, otherwise empirically meaningless. What you do when you find a snowclone is look for examples, and I was able to find plenty.

What I found was interesting. I tabulated the results in a spreadsheet. If you sum all "costs to the US economy" you get a number larger than the economy is to start with.

Of course there's no sensible reason to count down from the total, and subtract these costs. It's obvious that a bunch of these are counterfactual: "if we stopped X it would add Y dollars to the overall bottom line". But just as obviously that framing is far less potent, because it makes clear its estimate is counterfactual, speculative, uncertain; whereas a "cost" is implied to be a solid figure.

There's no sensible reason for calling these things "costs" either; I don't go around bemoaning the cost I suffered by not becoming President of France, and thus not being able to get paid $100K for speaking at a conference - a total "loss" to me, so far, of over $6M.

Anyway, if you happen to make up a number for what your favorite problem costs the economy, I've got a handy spreadsheet of items to compare it to. For instance, it's more urgent to address "routine weather variability" than software bugs. You can all relax about using PHP or whatever.

Here's the link, for your convenience or amusement.

Post has attachment
The Myth of the Myth of the Myth of 10x

Alan, over at Tooth of the Weasel, has a blog post on "The Myth of the Myth of 10x", defending the old idea of "10x programmers". (It's not recent, but it's new to me; I was pointed to it recently after Steve McConnell commented on it to say, basically, "Hell yeah.")

As anyone who knows me a bit can tell you, I don't think the 10x concept has any credibility. But I'm open to new data and reasoning on the topic.

Also, Alan's post gave me a good opportunity to write up, as I like to do, a bit of old history that few people are aware of. So even if you can't be bothered with the whole "someone's wrong on the Internet" thing, read on for that juicy tidbit at least.

Alan's reasoning, as far as I could tell, appears to be "the 10x concept is not a myth because I define it differently from the way it was defined in the studies that are claimed to support the 10x concept".

I don't think this works. What would work for me: Alan's describing someone he's actually met (or has reliable information about), who fits his definition of "someone whose aptitudes allow them to deliver significantly higher output and quality", and some explanation of why they are that way. That would still be anecdotal evidence, but better than no evidence at all.

In a blog post a few years back (http://www.construx.com/10x_Software_Development/Chief_Programmer_Team_Update/) Steve has described what some might call "the original 10x programmer", Harlan Mills. According to Steve, "Harlan Mills personally wrote 83,000 lines of production code in one year" on a project for the New York Times in the early 70s.

I think this qualifies Mills as a "very prolific" programmer. One issue with that descriptor is that, as Alan acknowledged, "prolific" isn't the same as "productive" (and it's one of the tragedies of our profession that we consistently fail to distinguish the two). We all know people who churn out reams of code that turns out to be worthless.

It turns out Mills was one of those people.

At least he was on the particular project Steve describes as "one of the most successful projects of its time". By the way, you don't have to claim that "all programmers are about the same" to make a counter claim to the 10x concept; you can for instance merely point out that if programmers are extremely inconsistent in their performance, that would explain the data in the 10x studies just as well.

Maybe Mills was a 10x on some other project, but my research suggests he wasn't a 10x in Alan's sense of "significantly higher output and quality" on the Times project.

Stuart Shapiro, in his 1997 article "Splitting the Difference", described the same project somewhat differently:

"As evidence, the authors pointed to the development of an information bank for the New York Times, a project characterized by high productivity and very low error rates. Questions were raised, however, concerning the extent to which the circumstances surrounding the project were in fact typical. Moreover, it seems the system eventually proved unsatisfactory and was replaced some years later by a less ambitious system."

Source: http://sunnyday.mit.edu/16.355/shapiro-history.pdf

Shapiro is quoting from a much, much older article that appeared in Datamation in May 1977, "Data for Rent" by Laton McCartney:

"Unfortunately for The Times, the IBM designed system didn't prove to be the answer either. 'They touted us on top down structured programming', says Gordon H. Runner, a VP with The Information Bank, 'but what they delivered was not what they promised.' When the FSD system proved unsatisfactory, the TImes got rid of its IBM 370/148 and brought in a 360/67 and a DEC PDP-11/70. Further, Runner and his staff designed a system that was less ambitious than its predecessor but feasible and less costly. [...] 'With the new approach we're not trying to bite off the state of the art,' Runner explains. 'We're trying to deliver a product.'"

(The PDF for the Datamation article isn't available online, but I'm happy to provide it upon request.)

I find it ironic and funny that "the original 10x programmer" left behind such a bitter taste in his customer's mouth. It reminds me of the ultimate fate of the Chrysler C3 project that was the poster boy for Extreme Programming.

Our profession has long been driven by fad and fashion, with its history written not by the beneficiaries or victims of the projects on which we try new approaches, but by the people most biased to paint those projects and approaches in a good light. Our only way out of this rut is to cultivate a habit of critical thinking.

(I've written a lot more about the 10x myth, and my reasoning for branding it a myth, in my book: https://leanpub.com/leprechauns - if you found the above informative, check it out for more of that.)

Post has shared content
I received notification yesterday that two of my abstracts were accepted to the Toward a Science of Consciousness 2015 conference:

* Sentient companions predicted and modeled into existence: explaining the tulpa phenomenon. Accepted as a contributed poster. http://kajsotala.fi/Papers/Tulpa.pdf

Takes a stab at trying to explain so-called "tulpas", or intentionally created imaginary friends, based on some of the things we know about the brain's cognitive architecture.

* Coalescing Minds and Personal Identity. Accepted as contributed paper; co-authored with Harri Valpola. http://kajsotala.fi/Papers/CoalescingPersonalIdentity.pdf

Summarizes our earlier paper, Coalescing Minds (2012), which argued that it would in principle not require enormous technological breakthroughs to connect two minds together and possibly even have them merge. Then says a few words about the personal identity implications. Due to the word limit, we could only briefly summarize those implications: will have to cover the details in the actual talk.