
"What’s to know about the credibility of empirical economics?", Ioannidis & Doucouliagos 2013

"In this review, we examine the major parameters that are expected to affect the credibility of empirical economics: sample size, magnitude of pursued effects, number and pre-selection of tested relationships, flexibility and lack of standardization in designs, definitions, outcomes and analyses, financial and other interests and prejudices, and the multiplicity and fragmentation of efforts. We summarize and discuss the empirical evidence on the lack of a robust reproducibility culture in economics and business research, the prevalence of potential publication and other selective reporting biases, and other failures and biases in the market of scientific information. Overall, the credibility of the economics literature is likely to be modest or even low.

Over the years, investigators have identified several problems that affect the credibility of empirical economics research. These include but are not limited to: publication bias (DeLong and Lang, 1992); overstating the level of statistical significance (Caudill and Holcombe, 1999); mistaking statistical significance for economic significance (Ziliak and McCloskey, 2004); growing positive-outcome bias (Fanelli, 2012); fraud and dishonesty (Bailey et al., 2001; List et al., 2001); funding and promotion inefficiencies (Oswald, 2007); unsupported claims and false beliefs (Levy and Peart, 2012); editors' reluctance to publicize plagiarism (Enders and Hoover, 2004); and a potentially distorting refereeing process (Frey, 2003).


2.1 Sample Size
Many empirical economics studies are of relatively modest sample size, although sample sizes are increasing in many subfields. Depending on the unit of observation, sample sizes are sometimes unavoidably limited (e.g. macroeconomic studies with ecological analyses at the country-level). In other subfields where analyses are performed at the level of individuals, households or stocks, sample sizes can be large or even very large.
2.2 Magnitude of Effect Size
Most of the focus in empirical economics has been on statistical significance rather than the practical (economic) significance (Ziliak and McCloskey, 2004). Some effects are large, and many are small. Rather upsettingly, meta-analyses are finding that effect sizes in economics appear to be declining over time. For some economic phenomena, effects may be tiny or even barely distinguishable from the null hypothesis. For example, some mainstream theories imply that predictive effects for stock market behaviour and stock values may be very small or non-existent.

Randomized experimentation is often considered time-consuming, costly, or, in many cases, even impossible. Thus, most empirical studies are observational and association research, which, by default, is likely to have low to very low credibility. ...Much economics research depends on regression modelling that has a notorious flexibility in model construction (Leamer, 1983).
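Leamer's point about the flexibility of regression modelling can be made concrete with a back-of-the-envelope calculation: if a researcher tries k specifications and, as a heavy simplification, treats them as independent tests at the 5% level, the chance of at least one nominally significant result under a true null grows quickly. A minimal sketch (illustrative only; real specifications are correlated, so independence is an assumption, but the qualitative point stands):

```python
def prob_false_positive(k, alpha=0.05):
    """Chance that at least one of k independent specifications is
    nominally significant at level alpha when the true effect is zero."""
    return 1 - (1 - alpha) ** k

# Trying 20 specifications and reporting only the "best" one raises
# the effective false-positive rate from 5% to roughly 64%.
```

With 100 specifications the probability of at least one spurious "finding" exceeds 99%, which is why unreported specification searches undermine nominal significance levels.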

There are probably far fewer researchers working in economics than in the life or physical sciences. For example, as of 12/26/2012, Microsoft Academic Search lists 512,895 author names under Economics & Business, compared with 6,010,966 in Medicine and 1,847,184 in Physics. ...Even retrospective meta-analysis is far less common in economics than in life sciences. Many economic sub-fields and specific areas of research have very few researchers actively involved, perhaps leading to some information monopolies or inbreeding (Ioannidis, 2012b). However, while collaboration is important to the advancement of science, so is competition. Doucouliagos and Stanley (2013) show that competition between rival economics researchers actually increases the credibility of research by reducing publication bias: the greater the theoretical contests, the less distorted economics research and empirical economic inference.

There is relatively little replication performed in empirical economics, business, and marketing research (Hubbard and Vetter, 1992; Evanschitzky et al., 2007; Hubbard and Armstrong, 1994; Evanschitzky and Armstrong, 2010). Most replication efforts are conceptual rather than strict replications. Moreover, there is a high rate of replications that fail to support original findings, ranging from 20% to 65%, depending on the field and journal (see Hubbard and Vetter, 1992; Evanschitzky and Armstrong, 2010).

In their famous study, Dewald et al. (1986) found that errors in empirical papers were common, although without necessarily invalidating the main conclusions of the studies. Most errors are inadvertent or due to suboptimal research practices and deficient quality control, but falsification may also be occasionally involved (Fanelli, 2009). Tödter (2009) tested Benford's Law on data from two economics journals and found violations in about one-quarter of the papers, consistent with falsification. Bailey et al. (2001) conduct a survey of the most prolific researchers in accounting and report that 4% of the respondents confessed to research falsification. List et al. (2001) find a similar rate of falsification among economists and they also find that 7–10% of economists surveyed confessed to taking credit for graduate students' work or giving unjustified co-authorship. According to Fanelli (2009) up to 72% of scientists are thought to adopt questionable research practices (not necessarily unconditional falsification). John et al. (2012) find very high rates of questionable research practices among psychologists.
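The kind of Benford test used on journal data can be illustrated with a simple first-digit check. The sketch below is a minimal illustration, not the paper's actual procedure: it compares observed first-digit frequencies against Benford's distribution with a chi-squared statistic.

```python
import math
from collections import Counter

def benford_expected(d):
    # Benford's Law: P(first digit = d) = log10(1 + 1/d)
    return math.log10(1 + 1 / d)

def first_digit(x):
    # Leading nonzero digit of a nonzero number
    s = str(abs(x)).lstrip("0.")
    return int(s[0])

def benford_chi2(values):
    """Chi-squared statistic comparing observed first-digit frequencies
    against Benford's distribution (8 degrees of freedom; the 5%
    critical value is 15.51)."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * benford_expected(d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat
```

Log-uniformly distributed data (typical of many naturally occurring economic quantities) yields a small statistic, while fabricated numbers with roughly uniform leading digits produce a large one, which is the intuition behind using Benford deviations as a falsification screen.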

Maniadis et al. (2012) argue that experimental economics is plagued by inflated initial results and false positives. Most randomized and quasi-experimental studies in econometrics tend to have modest sample sizes. For example, among all studies published in the four issues of Experimental Economics in 2012, sample sizes range from 67 to 1175 subjects with a median of 184. So publication bias and winner's curse may be issues in such underpowered settings. Even the largest observed effects may be spurious, as has been shown in the medical sciences (Pereira, Horwitz, and Ioannidis, 2012). Finally, experimental randomized studies still represent a small minority. Hamermesh (2012) finds that in 2011, 8.2% of the studies in three leading economics journals were experimental studies, compared to 0.8% in 1983. Quasi-experimental studies and those using instrumental variables may not be as protected from biases as experimental randomized studies.
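The underpowering concern can be quantified with a standard power approximation. As an illustration (the effect size and even split below are assumptions, not figures from the paper), a two-group experiment with 92 subjects per arm, i.e. the median total sample of 184 divided evenly, has only about 27% power to detect a small standardized effect of d = 0.2:

```python
import math

def normal_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_sample(d, n_per_group, z_crit=1.96):
    """Approximate power of a two-sided two-sample z-test for a
    standardized mean difference d with n_per_group subjects per arm."""
    se = math.sqrt(2 / n_per_group)  # SE of the standardized difference
    z = d / se
    return normal_cdf(z - z_crit) + normal_cdf(-z - z_crit)

# 92 per arm, small effect d = 0.2: power is only about 0.27,
# far below the conventional 0.80 target; a large effect d = 0.8
# would be well powered at the same sample size.
```

At such low power, the significant results that do get published tend to overestimate the true effect, which is exactly the winner's-curse mechanism the passage describes.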

The economics literature seems to have too many results that confirm the authors' expectations (Fanelli, 2010; 2012). Selective reporting biases cumulatively create a literature in which there are just too many nominally significant research findings, a situation that can be probed with an excess significance test (Ioannidis and Trikalinos, 2007). The proportion of studies that reported support for the tested hypothesis in economics was found to be 88%, one of the highest across all sciences (Fanelli, 2010). After controlling for differences between pure and applied disciplines and between papers testing one or several hypotheses, the odds of reporting in favour of the tested hypothesis were five times higher among papers in Economics and Business compared to Space Science (Fanelli, 2010)."
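The logic of the excess significance test is to compare the observed number of "positive" studies with the number expected given the studies' power. A minimal sketch of that comparison (the 50% mean power used in the comments is an assumed input for illustration, not a figure from the paper):

```python
import math

def binom_sf(k, n, p):
    # P(X >= k) for X ~ Binomial(n, p), exact summation
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

def excess_significance(n_studies, n_significant, mean_power):
    """Expected count of significant studies given mean power, and a
    one-sided binomial p-value for observing at least n_significant.
    A small p-value suggests too many significant results."""
    expected = n_studies * mean_power
    p_value = binom_sf(n_significant, n_studies, mean_power)
    return expected, p_value

# With 88 significant results out of 100 studies and an assumed mean
# power of 0.5, only 50 significant results are expected, and the
# binomial p-value is vanishingly small: an implausible excess.
```

The real test requires estimating per-study power from effect and sample sizes, but even this crude version shows why an 88% support rate is hard to reconcile with typical statistical power.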