In 2005, John Ioannidis, a professor of medicine at Stanford University, published a paper, “Why most published research findings are false,” mathematically showing that a huge number of published papers must be incorrect. He also looked at a number of well-regarded medical research findings, and found that, of 34 that had been retested, 41% had been contradicted or found to be significantly exaggerated. Since then, researchers in several scientific areas have consistently struggled to reproduce major results of prominent studies. By some estimates, at least 51%—and as much as 89%—of published papers are based on studies and experiments showing results that cannot be reproduced.
The reproducibility of scientific findings has been called into question. To contribute data about reproducibility in economics, we replicate 18 studies published in the American Economic Review and the Quarterly Journal of Economics in 2011-2014. All replications follow predefined analysis plans publicly posted prior to the replications, and have a statistical power of at least 90% to detect the original effect size at the 5% significance level. We find a significant effect in the same direction as the original study for 11 replications (61%); on average the replicated effect size is 66% of the original. The reproducibility rate varies between 67% and 78% for four additional reproducibility indicators, including a prediction market measure of peer beliefs.
In August 2015, a team of 270 researchers reported the largest ever single-study audit of the scientific literature. Led by Brian Nosek, executive director of the Center for Open Science in Charlottesville, Virginia, the Reproducibility Project attempted to replicate studies in 100 psychology papers. According to one of several measures of reproducibility, just 36% could be confirmed; by another statistical measure, 47% could. Not so fast, says Gilbert. Because of the way the Reproducibility Project was conducted, its results say little about the overall reliability of the psychology papers it tried to validate, he argues. "The number of studies that actually did fail to replicate is about the number you would expect to fail to replicate by chance alone — even if all the original studies had shown true effects."
For the past decade, scientists have been worried about the so-called replication crisis. Enter the Preclinical Reproducibility and Robustness channel. The website launched the first week in February with the goal of publishing the results of replication studies. The journal wants to keep scientists accountable for their work.
Finding a relevant reporting guideline for a study can be very difficult. Here we introduce a pilot experiment starting for some of the BMC-series journals which aims to overcome this issue.
ReproZip was featured in a post on the Library of Congress's digital preservation blog, the Signal. The author, Genevieve Havemeyer-King, writes "ReproZip is a tool being developed at NYU "aimed at simplifying the process of creating reproducible experiments from command-line executions", and could be something to consider as an alternative to many costly web-archiving services for preservation of internet-based projects and applications."