Recent Big-Data Struggles Are ‘Birthing Pains,’ Researchers Say

"This month Mr. [David] Lazer [a professor of political science and computer science at Northeastern University] published a new Science article that seemed to dump a bucket of cold water on such data-mining excitement. The paper dissected the failures of Google Flu Trends, a flu-monitoring system that became a Big Data poster child."
. . . . 
"The reaction, from some, boiled down to this: Aha! Big Data has been overhyped. It’s bunk.

"Not so, says Mr. Lazer, who remains 'hugely' bullish on Big Data. 'I would be quite distressed if this resulted in less resources being invested in Big Data,' he says in an interview. Mr. Lazer calls the episode 'a good moment for Big Data, because it reflects the fact that there’s some degree of maturing. Saying "Big Data" isn’t enough. You gotta be about doing Big Data right.'"
. . . .
"'Most big data that have received popular attention,' Mr. Lazer wrote in Science, 'are not the output of instruments designed to produce valid and reliable data amenable for scientific analysis.'"
. . . .
"Past researchers had been 'systematically overoptimistic' in the claims they made about machines’ ability to infer political orientation. When standard techniques were tested on the 'normal' population of Twitter users, methods that had reported greater than 90 percent accuracy achieved barely 65 percent."
. . . .
"The emerging problems highlight another challenge: bridging the 'Grand Canyon,' as Mr. Lazer calls it, between 'social scientists who aren’t computationally talented and computer scientists who aren’t social-scientifically talented.' As universities are set up now, he says, 'it would be very weird' for a computer scientist to teach courses to social-science doctoral students, or for a social scientist to teach research methods to information-science students. Both, he says, should be happening."
. . . .
"'We’re witnessing the birth of a new kind of social science,' he says. Current struggles are 'the birthing pains of that process.'"

