Thoughts on @bengoldacre’s article this morning: I think Ben makes a very good point in noting that unemployment statistics are subject to sampling error, and that in many months the change in unemployment is, for this reason, not statistically distinguishable from zero.
However, I think Ben somewhat overstates this particular point. It’s not right to say, as he does in his final paragraph, that these statistics “tell us nothing”. We can’t dismiss all data that fails to pass the (entirely arbitrary) test of statistical significance. Non-statistically significant data contribute to our knowledge, but we shouldn’t put too much weight on them. (Ben knows this, of course: he pulled back a bit from this in a tweet to me, saying that what he really wanted was for economics reporters to acknowledge the lack of statistical significance, which would indeed be nice to see.)
But I liked Ben's final point, and he could have made more of it. There’s a lot of economic data out there that is noisy but for which sampling error bars are not appropriate. GDP data, for instance, is initially reported based on partial data (and not a random sample) and we gradually close in on a view of what GDP actually was. Sometimes we revise our views dramatically: Greece’s GDP (actual GDP, not GDP growth rate) jumped suddenly by 25 per cent late in 2006 thanks to a statistical revision, although I’ll admit Greece’s data is hardly the best in the world.
And what of the data that get far more airtime than every other economic variable put together – the movements of the stock markets? There is no sampling error here at all: the reporters really are able to tell us, minute by minute if they wish to, exactly what the stock markets are doing. As a consequence they do exactly that, producing reports that are full of sound and fury, signifying... well not nothing, but not a lot either. Stock market movements are extremely noisy, big money bets on the future of the economy. They can be reported with absolute precision. That doesn’t mean they should be.
His last paragraph is the most interesting, I think. Why do we always have to use data as ammunition in political debate? There seems to be no getting away from it: the left chucks it at the right, the right chucks it back with interesting additions, and then there's the 'third team' of statistically significant scientist types with a global vision who berate both the left and the right for not taking things seriously and doing something to really fix whatever the problem is. But moving away from our team or tribal mentalities seems very difficult; it is very hard not to belong to some group and defend it. Why do certain topics (environment, immigration, capital) seem to 'belong' to the left or the right? If we really want to improve our societies, we need to look at the relationships between data, discourse, identities and world views.
Cowperthwaite had it right. Too much emphasis on statistics is an excuse for central planners to meddle. Worked OK for Hong Kong! Remembering the TED talk on trial and error here...

I love the approach that relies on humility and principles, not on statistics or models.
There's also corroborating information in other labour market statistics - JSA numbers, vacancies, earnings growth - which form part of the conditioning set. Good to get away from idea that 'certainty' about economic data is achievable, though, as indeed about other types of data....
The fact that statistical significance is never reported contributes to people and non-economic journalists thinking that economists are akin to traditional scientists. They think economists should be able to predict GDP growth, inflation, etc to two decimal places, whereas in reality precision is rare.
Precision is rare in some 'traditional' sciences too, though.
All science has uncertainty inherent in the system. The major fault is that we teach our kids that science is certain, which is plainly false. We teach them that the more decimals a number has, the more precise it is. So is it any wonder that when the unemployment numbers come out to the hundredths place, people think we can actually get that kind of precision? The kids then grow up thinking science can do things that it can't do.
@Diane True, I shouldn't have been so general. I'll avoid science comparisons. My point stands though, I often see (Irish) columnists/presenters say something along the lines of "well the economists said growth would be 2% this year but it turned out to be 1.75%, these guys are obviously useless".

@Justin That's it, there's a perception that economists (and other scientists) can do things that they simply can't. There doesn't seem to be an easy way to correct this. It damages perceptions of the usefulness of economics, etc.
Reporters' diligence in reporting sampling error waxes and wanes depending on whether that sampling error makes the story more interesting or less. Thus, if Candidate A is polling at 47% and Candidate B at 52% with a margin of +/-5%, it'll be reported that they're tied. If unemployment is up 0.1% with a sampling error of 0.3%, it'll be reported as the end of the world and surely a precursor to breadlines and Hoovervilles.
+Daniel Davis It's a problem in many of the science fields as well. With the current push of science science science (which I think is a good thing) people forget that science doesn't and probably will never have an answer to everything and even then, with science the answers change as new information comes to light. People seem to forget that there are a multitude of things we don't even know we are ignorant about. Think about the Atom. How many times in human history did we think we knew what the smallest stuff was? Oh then we realized we were ignorant of the nucleus, then how long til we realized there were neutrinos, bosons, etc....
+David Oser The problem with Economics is that it is wrapped in a veil of Mathematics which people use to try and give it scientific plausibility. It's Scientism. F.A. Hayek knew all about it, which is why they don't teach Hayek in most Economics classes.
Absolutely about Hayek/mathematics. Hayek and the Austrian school are clearly making a comeback (in part through the increasing popularity of Ron Paul). An overemphasis on statistics doesn't necessarily mesh well with "doing the right thing". For this we need values, moral thinking, and principle.

And the sovereignty of the individual, liberty vs the state, does have its appeal compared to a current corporatocracy which is failing. No wonder Keynes has been more popular in official economic circles: for a long time it has suppressed these ideas, maintained a status quo and protected the elites.
BTW Statistics are awesome. But I wouldn't trust "official" statistics when they drive so much policy making, because they are often adjusted according to what policy makers want to do, and what they want to see. Far better to have "open source" stats - if this is even possible. Shadow stats does a good job of highlighting the massaging of stats for political ends.
Of course the whole field of econometrics - a large share of what the economics profession works on all day - is intimately concerned with these issues. So economists do know about it; but that's distinct from whether journalists, or politicians, have any interest in showing the error bars.

On this particular story, I suspect the ONS is being quite literal in reporting the confidence intervals. A bit more imagination can reduce the error substantially - but at a cost.

If you make a straight comparison of the unemployment number now (which was plus or minus 81k) with the equivalent unemployment number last year (assuming this is also plus or minus about 80k) you get a confidence interval of 111k, as the ONS says. However, if you were to include the quarterly data in between - which essentially give you three more observations - you could build a rolling 12-month average unemployment figure which would have smaller error bars. You could then calculate how this average changes from period to period.

This method reduces the resolution of the data - you can't directly compare quarterly figures because each quarter's rolling average shares nine months of data with the previous quarter's - but gives you more accuracy for annual comparisons. So one could calculate the change in unemployment from 2009-10 to 2010-11 with much greater confidence. I haven't worked out the figures, but as a rule of thumb, four times the data cuts the error bars in half - that is, errors of around 50k.
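
The arithmetic behind those figures can be sketched roughly as follows. This assumes the estimates are independent and normally distributed, which is not quite true of the real survey (the samples overlap from quarter to quarter), and the function names are made up for illustration.

```python
import math

def margin_of_difference(margin_a, margin_b):
    """Margin of error for the difference of two independent estimates."""
    return math.hypot(margin_a, margin_b)

def margin_of_average(margin, n_periods):
    """Margin of error for the mean of n independent estimates with equal margins."""
    return margin / math.sqrt(n_periods)

# Figures in thousands, taken from the comment above.
this_year, last_year = 81, 80

# Straight comparison of one figure with the same figure a year earlier.
direct = margin_of_difference(this_year, last_year)
print(f"direct year-on-year comparison: +/-{direct:.0f}k")

# Comparison of two 12-month averages, each built from four quarterly figures.
avg_now = margin_of_average(this_year, 4)
avg_then = margin_of_average(last_year, 4)
rolling = margin_of_difference(avg_now, avg_then)
print(f"comparison of 12-month averages: +/-{rolling:.0f}k")
```

Under these assumptions the direct comparison comes out at roughly 114k, close to the 111k the ONS quotes, and the comparison of averages at roughly 57k, in line with the "around 50k" rule of thumb.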

As Tim points out, you could also change the confidence interval and accept (say) 70% certainty instead of 95%.
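
As a rough sketch of what that trade-off buys, assuming a normal approximation and treating the 111k figure as a two-sided 95% margin (the helper name is hypothetical):

```python
from statistics import NormalDist

def rescale_margin(margin, from_level=0.95, to_level=0.70):
    """Convert a margin of error between two-sided confidence levels (normal approximation)."""
    std_normal = NormalDist()
    z_from = std_normal.inv_cdf(0.5 + from_level / 2)
    z_to = std_normal.inv_cdf(0.5 + to_level / 2)
    return margin * z_to / z_from

# Figure in thousands: the annual-change margin quoted above.
margin_70 = rescale_margin(111)
print(f"+/-111k at 95% is roughly +/-{margin_70:.0f}k at 70%")
```

On those assumptions, accepting 70% certainty shrinks the error bars from 111k to a little under 60k.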

Combining both these factors, you can be fairly confident that unemployment has crept up over the last year, though not by much. What you can't say is anything much about the specific changes between May and August 2011. It would be lovely to be able to look at policy changes in April 2011 and see the results in the unemployment figures for August, but without a much more expensive data collection exercise, we will never get that kind of insight.

Incidentally, if you look back into 2009 you get annual changes of 600,000 and upwards - far bigger than the error bars - and even some quarterly changes of over 200,000. So there are times when the changes really are clear and significant, even at short timescales.
Even when people reporting stories know there are margins of error involved, the reporting often isn't great; just look at reports of "movements" in the party standings in opinion polls which often greatly overstate the significance of changes within the margin of error. However, I suspect the much bigger issue in the case of economics statistics such as the unemployment ones is that relatively few people realise they involve any survey data at all.
+David Oser : "Use of mathematics per se does not make economics more 'scientific' and can obfuscate poor underlying explanations or theories"

True. But people sometimes don't realise that mathematics is a necessary, even if not sufficient, condition for economics to be scientific. In practice, to test economic propositions to the degree necessary to determine whether they are valid predictive rules, we do need to make them mathematically rigorous.
Indeed, any empirical subject will need to be 'mathematical' at some point in order to confront hypotheses with the statistical evidence in a reasonably rigorous way.
Hi there. As ever, my specific topical examples are chosen as an excuse to explain a piece of science or stats, in this case, sampling error. However I do think that the media routinely overstate the real-world significance of numbers that are in reality the result of random variation and statistical noise, and that my example was very reasonable. A few quick thoughts over lunch..

You say:

"It’s not right to say, as he does in his final paragraph, that these statistics “tell us nothing”. We can’t dismiss all data that fails to pass the (entirely arbitrary) test of statistical significance. Non-statistically significant data contribute to our knowledge, but we shouldn’t put too much weight on them."

I don't want to dismiss data, I want it clearly explained. Here's what I said: "I don’t know what’s happening to the economy: it’s probably not great. But these specific numbers tell us nothing"

I think that's perfectly sensible. I don’t think all imaginable numerical data on unemployment is uninformative, but the specific numbers quoted on unemployment in this BBC story were indeed uninformative. A quarterly change of 38,000 when the 95% confidence interval is ±87,000, running from -49,000 to 125,000, well, that specific number really is telling you very little indeed, it's highly likely to be due to chance, noise. An annual change of 32,000, ±111,000, well, that's even more likely to be chance. If you put these figures in the context of lots of other numbers of course you can start to build a picture (but then we are vulnerable to cherry picking, and seeing faces in the clouds, unless we have pre-specified ways of combining the data from all those numbers: I think this is a major, wider issue).
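
For anyone who wants to check the intervals quoted above, here is a minimal sketch. It assumes only that the quoted margin is symmetric around the estimate; the function names are illustrative.

```python
def confidence_interval(estimate, margin):
    """Lower and upper bounds of a symmetric interval around an estimate."""
    return estimate - margin, estimate + margin

def includes_zero(estimate, margin):
    """True if 'no change at all' lies inside the interval."""
    low, high = confidence_interval(estimate, margin)
    return low <= 0 <= high

# Change in unemployment and its 95% margin, as quoted above.
changes = {"quarterly": (38_000, 87_000), "annual": (32_000, 111_000)}
for label, (estimate, margin) in changes.items():
    low, high = confidence_interval(estimate, margin)
    print(f"{label}: {low:+,} to {high:+,}; includes zero: {includes_zero(estimate, margin)}")
```

Both intervals straddle zero, which is the sense in which these specific numbers cannot distinguish a rise from a fall.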

But the BBC piece that I hooked my explanation onto did not give this context. So I was surprised (and vaguely disappointed) to see Robert Peston's comment defending the BBC piece, in which he was factually incorrect: !/Peston/status/104876582581190656
"BBC story would have said "may not be a trend" which is what reader needs to know".

Robert Peston cannot have read the BBC piece that he is defending before posting this, because it does not include the caveat he says it does, and so I think his comment is unhelpfully dismissive. The BBC piece did not say "may not be a trend": it would have been great if it did, and it would be great if similar BBC pieces did so in future.

You also say, Tim: "We can’t dismiss all data that fails to pass the (entirely arbitrary) test of statistical significance."

p=0.05 is indeed an arbitrary cut-off for statistical significance, but it’s important not to handwave over what that actually means. When people say statistical significance is arbitrary, what they mean is, if something was 0.055 or 0.045 it would be a mistake to draw too much from it being one side or the other of the 0.05 line in the sand. But this absolutely doesn't mean that wanting any kind of statistical significance at all is arbitrary. What do you estimate the p-value of an annual change of 32,000, ±111,000 would be? I don't think it would be very near anyone's very flexible cut off for significance. Maybe it would be, if you want to do the maths and produce the p-value we can take a view!
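
One rough way to do that sum, assuming the estimate is approximately normally distributed and that ±111,000 is a two-sided 95% interval (both are assumptions, and the function name is made up for illustration):

```python
from statistics import NormalDist

def two_sided_p_value(estimate, margin_95):
    """Approximate p-value for 'true change is zero', given a 95% margin of error."""
    std_normal = NormalDist()
    z_95 = std_normal.inv_cdf(0.975)       # about 1.96
    standard_error = margin_95 / z_95      # back out the standard error
    z = abs(estimate) / standard_error
    return 2 * (1 - std_normal.cdf(z))

p = two_sided_p_value(32_000, 111_000)
print(f"p = {p:.2f}")
```

On those assumptions the p-value comes out at about 0.57, a long way from any conventional cut-off.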

Lastly, you and (to a greater extent) various others have suggested that full counts (such as the claimant count) are not amenable to statistical testing, or vulnerable to random variation, because they give numbers for the whole country, not a sample, and so are not subject to sampling error. But this is a mistake. A 1% change in a variable that is very noisy over time, and a 1% change in a historically stable variable, are importantly different phenomena and statistically testable.
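
A crude sketch of the idea, using two made-up series (not real claimant count data) and a simple comparison of the latest change against the spread of past changes; a proper analysis would use a time-series model:

```python
from statistics import stdev

def change_z_score(history, latest_change):
    """Latest change measured in standard deviations of past period-to-period changes."""
    past_changes = [b - a for a, b in zip(history, history[1:])]
    return latest_change / stdev(past_changes)

# Invented series, both hovering around 100, for illustration only.
stable = [100.0, 100.1, 99.9, 100.0, 100.1, 100.0, 99.9, 100.0]
noisy = [100.0, 103.0, 98.0, 101.5, 97.0, 102.0, 99.0, 100.0]

latest = 1.0  # a 1% move on a base of 100
z_stable = change_z_score(stable, latest)
z_noisy = change_z_score(noisy, latest)
print(f"stable series: {z_stable:.1f} standard deviations")
print(f"noisy series: {z_noisy:.1f} standard deviations")
```

The same 1% move is several standard deviations out for the stable series and well within the usual wobble for the noisy one, even though neither series involves any sampling.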

In terms of what could be improved:

Obviously I don't want people to cease reporting numbers such as the Labour Force Survey figures. I simply think that they should give numerical and statistical context.

David Smith, Economics Editor of the Sunday Times, seems to say on Twitter that journalists must withhold from readers the fact that figures are unreliable and prone to statistical error, simply because if readers were told about this reality, it would undermine the stories.
!/dsmitheconomics/status/104868450987552769 "@TimHarford @bengoldacre Touching Ben's discovered statistics we use are imprecise. But can't see "not statistically significant" in stories"
!/dsmitheconomics/status/104877576065007616 "bengoldacre Not sure it would inform better. You'd have to apply it to most stories using stats and risk saying: "Don't bother reading.""

This is not my view of readers: I think it understates people’s intelligence, and what they want from newspapers. Even if I’m wrong about readers, David Smith's comments still raise concerns about the choices journalists make over what information to give to readers, and what drives those choices.

I don’t think people are stupid, and I think they deserve access to good quality information that is clearly explained!
I think many economic journalists simply mix up the terms "precision" and "accuracy" :)

These terms have very different meanings: inaccurate figures are often stated precisely, implying accuracy!
