A primer on how not to respond when someone fails to replicate your work,
with a discussion of why replication failures happen

In the linked post, John Bargh responds to a paper published in PLoS ONE that failed to replicate his finding that priming people with terms related to aging led them to walk more slowly to the elevator afterward. His post is a case study of what NOT to do when someone fails to replicate one of your findings.

Replication failures happen. In fact, they should happen some of the time even if the effect is real and the replication attempt was conducted exactly like the original study. For any effect, especially small ones, you would expect some failures to replicate. Failures to replicate could occur for all of the following reasons (and maybe others):

(1) chance -- the effect is real, but this particular test of it didn't find the effect. With small effects, you expect some percentage of exact replication attempts to fail to find effects as big as the original. Remember, measurements of behavior are inherently noisy, and it's rare to find exactly the same effect size every time. In fact, finding exactly the same effect every time is a sign of bias (and sometimes of fraud).

(2) seemingly arbitrary design differences that contributed to the discrepancy -- these can be informative, helping to constrain the generalizability of the conclusions. They are grounds for further studies.

(3) poor methodology on the part of those trying to replicate the study -- it's easy to produce a null result by conducting shoddy research. On this account, the failure to replicate is a false negative due to poor design, not to subtle but reasonable design differences.

(4) poor methodology in the original research -- the original finding was a false positive due to poor controls and design. False positives can also result from design or analysis decisions that lead to reporting only significant findings or study variants.

(5) chance, but for the original finding -- some published effects might be false positives even if the original studies were conducted competently. That's especially true for underpowered studies.
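Reasons (1) and (5) are just statistical power at work. As a hedged illustration (the effect size, sample size, and significance threshold below are my assumptions, not figures from either study), a quick simulation shows that with a real but small effect, most "exact" replications of an underpowered design still come up non-significant:

```python
# Illustrative simulation (assumed numbers, not from the original or replication
# study): a true effect of d = 0.3 with n = 30 participants per group.
import numpy as np

rng = np.random.default_rng(42)

def one_study(d=0.3, n=30):
    """Simulate one two-group study; return True if it reaches significance."""
    primed = rng.normal(d, 1.0, n)    # primed group: true mean shift of d
    control = rng.normal(0.0, 1.0, n) # control group: no shift
    se = np.sqrt(primed.var(ddof=1) / n + control.var(ddof=1) / n)
    t = (primed.mean() - control.mean()) / se
    return abs(t) > 2.00  # approximate two-tailed critical t for df ~ 58

replications = 10_000
power = sum(one_study() for _ in range(replications)) / replications
print(f"share of 'exact' replications reaching p < .05: {power:.2f}")
```

Under these assumptions only about a fifth of perfectly faithful replications "succeed" -- so a single failed replication, by itself, tells you little.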

Given the strong bias to publish only positive results (see work by Ioannidis, for example), an original false positive seems at least as likely as a false negative, especially when there are few if any direct replications of a published result. Given the difficulty of publishing replication failures, it's important to realize that there might be other failures to replicate that were never published (see the comment from +Alex Holcombe on the post, noting another failure to replicate this particular study).
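The Ioannidis point can be made concrete with a little arithmetic. Assuming (my numbers, purely for illustration) that 10% of tested hypotheses are true, studies run at 35% power, and the false-positive rate is the conventional 5%, then fewer than half of the positive results that make it into print reflect real effects:

```python
# Positive predictive value of a published positive result, Ioannidis-style.
# All three inputs are assumptions chosen for illustration.
alpha = 0.05   # false-positive rate per test
power = 0.35   # assumed power of a typical underpowered study
prior = 0.10   # assumed share of tested hypotheses that are actually true

true_positives = power * prior
false_positives = alpha * (1 - prior)
ppv = true_positives / (true_positives + false_positives)
print(f"chance a published positive reflects a real effect: {ppv:.2f}")
```

With publication bias filtering out the null results, the false positives and true positives sit side by side in the literature, indistinguishable until someone tries a direct replication.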

Rather than dispassionately considering all of these possibilities, including that the original research might be a false positive, Bargh chose to:
1) Dismiss the journal in which the replication failure was published in an uninformed way. Bargh claims that PLoS ONE is a for-profit journal that is effectively a vanity press with a pay-to-publish model, one that doesn't conduct thorough peer review and doesn't rely on expert editors. In reality, PLoS ONE is run by a non-profit organization, selects expert editors, reviews papers just like any other journal, and never rejects a paper because the authors can't pay (the fee is waived upon request). It is one of the fastest-growing open-access journals and has a roughly 30% rejection rate. It differs from other journals in that it publishes empirically solid work regardless of its perceived theoretical impact.

2) Accuse the authors of the replication study of incompetence in an unjustified, ad hominem attack. This group of authors has extensive expertise in consciousness research. For example, Cleeremans was an editor of the Oxford Companion to Consciousness and is well respected in the field.

3) Describe method details from the replication attempt (and the original study) inaccurately. See the first comment on the post for a detailed discussion.

4) Reveal a lack of familiarity with science blogging by blaming one of the most careful and thoughtful science writers working today (+Ed Yong) for publicizing the replication failure. It seems disingenuous to fault Ed for "swallowing their conclusions whole" after refusing to respond to the request for comment that Ed sent several days before the post went live.

An effective response to a failure to replicate would be to identify ways in which the studies differed and then to test whether those differences explain the discrepancy. Acknowledging that replication failures happen, and pushing for more direct replication attempts rather than just conceptual ones, might help too. But assuming that a failure to replicate must have been due to incompetence or to the shoddy standards of a journal is a pretty brash response.

The comments on Bargh's post (from +Ed Yong, Neuroskeptic, +Peter Binfield, +Alex Holcombe, and many others) are interesting and informative -- an example of how science bloggers work to correct the record.