It’s not a new story, although "the reproducibility crisis" may seem to be one. For the life sciences, I think it started in the late 1950s, when problems in clinical research burst into the open in a very public way. But before we get to that, what is "research reproducibility"? It’s a euphemism for unreliable research or research reporting. Steve Goodman and colleagues (2016) say 3 dimensions of science that affect reliability are at play: methods reproducibility (enough detail is available to enable a study to be repeated); results reproducibility (the findings are replicated by others); and inferential reproducibility (similar conclusions are drawn about results, which brings statistics and interpretation squarely into the mix). There is a lot of history behind each of those. Here are some of the milestones in awareness and proposed solutions that stick out for me.
The US National Institutes of Health (NIH) is now assessing all research grant submissions based on the rigor and transparency of the proposed research plans. Previously, efforts to strengthen scientific practices had been undertaken by individual institutes, beginning in 2011 with the National Institute on Aging, which partnered with APS and the NIH Office of Behavioral and Social Sciences Research to begin a conversation about improving reproducibility across science. These early efforts were noted and encouraged by Congress. Now, the entire agency has committed to this important goal: NIH's 2016–2020 strategic plan announces, "NIH will take the lead in promoting new approaches toward enhancing the rigor of experimental design, analysis, and reporting."
Well, over the last two years iGEM teams around the world have been working to find out just how reproducible fluorescent protein measurements are. They distributed testing plasmids and compared results across labs, measurement instruments, genetic parts, and E. coli strains. It’s a thorough 2-year study of interlab variability, and the results are out in PLOS ONE: “Reproducibility of Fluorescent Expression from Engineered Biological Constructs in E. coli”.
Adhering faithfully to the scientific method is at the very heart of psychological inquiry. It requires scientists to be passionately dispassionate, to be intensely interested in scientific questions but not wedded to the answers. It asks that scientists not personally identify with their past work or theories, even those that bear their names, so that science as a whole can inch ever closer to illuminating elusive truths. Living up to that ideal isn’t so easy. But those who champion the so-called replication revolution in psychological science believe that it is possible, given the right structural reforms and personal incentives.
When graduate student Alyssa Ward took a science-policy internship, she expected to learn about policy — not to unearth gaps in her biomedical training. She was compiling a bibliography about the reproducibility of experiments, and one of the papers, a meta-analysis, found that scientists routinely fail to explain how they choose the number of samples to use in a study. "My surprise was not about the omission — it was because I had no clue how, or when, to calculate sample size," Ward says. Nor had she ever been taught about major categories of experimental design, or the limitations of P values. (Although they can help to judge the strength of scientific evidence, P values do not — as many think — estimate the likelihood that a hypothesis is true.)
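To make the sample-size question concrete, here is a minimal sketch of one common textbook calculation: the number of samples per group needed to compare two group means, using the normal approximation. It is not taken from the article or from Ward's training. The function name `sample_size_per_group`, the default significance level of 0.05, and the default power of 0.80 are illustrative assumptions, and a real study design may call for a different method.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate samples per group for a two-sided, two-group comparison of means.

    effect_size is the standardized difference (difference in means divided by
    the common standard deviation). The defaults for alpha and power are
    conventional choices, assumed here purely for illustration.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1) or not (0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha.
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power.
    z_power = standard_normal.inv_cdf(power)

    # Normal-approximation formula: n = 2 * ((z_alpha + z_power) / d)^2
    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"effect size {d}: {sample_size_per_group(d)} samples per group")
```

Under these assumptions, a medium standardized effect of 0.5 needs about 63 samples per group, and halving the expected effect roughly quadruples the requirement. The normal approximation slightly underestimates what an exact t-test calculation would give, so treat the result as a lower bound for planning.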
It seems like the most elementary of research principles: Make sure the cells and reagents in your experiment are what they claim to be and behave as expected. But when it comes to antibodies—the immune proteins used in all kinds of experiments to tag a molecule of interest in a sample—that validation process is not straightforward. Research antibodies from commercial vendors are often screened and optimized for narrow experimental conditions, which means they may not work as advertised for many scientists. Indeed, problems with antibodies are thought to have led many drug developers astray and generated a host of misleading or irreproducible scientific results.