One of the most valuable talks of the day for me was from Fernando Chirigati from New York University. He introduced us to a useful new tool called ReproZip. He made the point that the computational environment is as important as the data itself for the reproducibility of research. This could include information about libraries used, environment variables, and options. You cannot expect your depositors to find or document all of the dependencies (or your future users to install them). What ReproZip does is package up all the necessary dependencies along with the data itself. This package can then be archived and reused in the future, and ReproZip can also be used to unpack and re-run the work later. I can see a very real use case for this for researchers within our institution.
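ReproZip itself traces a running process to capture its files and libraries automatically, which is well beyond what a few lines can show. But the core idea of recording the computational environment alongside the data can be sketched in Python. The function name and the choice of environment variables below are illustrative, not part of ReproZip's actual API:

```python
import json
import os
import platform
import sys

def capture_environment(env_vars=("PATH", "PYTHONPATH", "LANG")):
    """Record the interpreter, platform, and selected environment variables.

    A tiny illustration of the kind of environment metadata a tool like
    ReproZip captures automatically (alongside the actual files and
    libraries an experiment touches).
    """
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "executable": sys.executable,
        # Only record the variables we ask for; missing ones become None.
        "environment": {k: os.environ.get(k) for k in env_vars},
    }

# Write the manifest next to the data so both can be archived together.
manifest = capture_environment()
print(json.dumps(manifest, indent=2))
```

Even a minimal manifest like this makes the point of the talk concrete: the environment is metadata that must be deposited with the data, not reconstructed from memory later.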
Reproducible Science: Promoting Open Science
Prof. Lorena Barba has just posted a reading list for reproducible research that includes ten key papers to understand reproducibility.
The way science journals present research must be rehabilitated or risk becoming obsolete, causing foreseeable negative consequences to research funding and productivity. Researchers are dealing with ever-increasing complexities, and as techniques and solutions become more involved, so too does the task of describing them. Unfortunately, simply explaining a technique with text does not always paint a clear enough picture. Scientific publishing has followed essentially the same model since the original scientific journal was published in the mid-seventeenth century. Thanks to advances in technology, we have seen some minor improvements such as the addition of color printing and better dissemination and search functionality through online cataloging. But what has actually changed? In truth, not all that much. Articles are still published as text-heavy tomes with the occasional photograph or chart to demonstrate a point.
Workflow is a well-established means by which to capture scientific methods in an abstract graph of interrelated processing tasks. The reproducibility of scientific workflows is therefore fundamental to reproducible e-Science. However, the ability to record all the required details so as to make a workflow fully reproducible is a long-standing problem that is very difficult to solve. In this paper, we introduce an approach that integrates system description, source control, container management and automatic deployment techniques to facilitate workflow reproducibility. We have developed a framework that leverages this integration to support workflow execution, re-execution and reproducibility in the cloud and in a personal computing environment. We demonstrate the effectiveness of our approach by examining various aspects of repeatability and reproducibility on real scientific workflows. The framework allows workflow and task images to be captured automatically, which improves not only repeatability but also runtime performance. It also gives workflows portability across different cloud environments. Finally, the framework can also track changes in the development of tasks and workflows to protect them from unintentional failures.
We know now that much health and medical research published in peer-reviewed journals is wrong, and consequently much of it cannot be replicated.[2-4] This is due in part to poor research practice, biases in publication, and simply a pressure to publish in order to ‘survive’. Cognitive biases that unreasonably wed us to our hypotheses and results are also to blame. Strongly embedded in our culture of health and medical research is the natural selection of poor science practice, driven by the dependence for survival on high rates of publication in academic life. It is a classic form of cultural evolution along Darwinian lines.[6, 7] Do not think that even publications in the most illustrious medical journals are immune from these problems: the COMPare project reveals that more than 85% of large randomised controlled trials deviate seriously from the plan registered before the trial began. On average, more than five new outcome measures were secretly added to the publication, and a similar number of nominated outcomes were silently omitted. It is hardly far-fetched to propose that this drive to publish is contributing to the growth in the number of papers retracted from the literature for dubious conduct, along with the increasing number of cases of research misconduct.
Columbia University and other New York City research institutions, including NYU, are hosting a one-day symposium on December 9, 2016 to showcase a robust discussion of reproducibility and research integrity among leading experts, high-profile journal editors, funders and researchers. This program will reveal the "inside story" of how issues are handled by institutions, journals and federal agencies and offer strategies for responding to challenges in these areas. The stimulating and provocative program is for researchers at all stages of their careers.
Psychology has a replication problem. Since 2010, scientists conducting replications of hundreds of studies have discovered that a dismally small proportion of published results can be reproduced. This realization by psychologists has come to be known as the "replication crisis". For me, this story all started with ego depletion, and the comics I had drawn about it in 2014. The idea is that your self-control is a resource that can be diminished with use. When you think about all the times you've been slowly worn down by temptation, it seems obvious. When I drew the comics, there had been new research pointing to blood sugar levels as the font of self-control from which we all draw. It also made sense—people get cranky when they're hungry. We even made up a word for it. We call it being "hangry".
Science progresses by an iterative process whereby discoveries build upon a foundation of established facts and principles. The integrity of the advancement of knowledge depends crucially on the reliability and reproducibility of our published results. Although mistakes and falsification of results have always been an unfortunate part of the process, most viewed scientific research as self-correcting; the incorrect results and conclusions would inevitably be challenged and replaced with more reliable information. But what happens if the process is corrupted by systematic errors brought about by the misapplication of statistics, the use of unreliable reagents and inappropriate cell models, and the pressure to publish in the most selective venues? We may be facing this scenario now in areas of biomedical science in which claims have been made that a majority of the most important work in, for example, cancer biology is not reproducible in the hands of drug companies that would seek to rely on the biomedical literature for opportunities in drug discovery.
The biomedical research sciences are currently facing a challenge highlighted in several recent publications: concerns about the rigor and reproducibility of studies published in the scientific literature. Research progress is strongly dependent on published work. Basic science researchers build on their own prior work and the published findings of other researchers. This work becomes the foundation for preclinical and clinical research aimed at developing innovative new diagnostic tools and disease therapies. At each of the stages of research, scientific rigor and reproducibility are critical, and the financial and ethical stakes rise as drug development research moves through these stages.
Adhering faithfully to the scientific method is at the very heart of psychological inquiry. It requires scientists to be passionately dispassionate, to be intensely interested in scientific questions but not wedded to the answers. It asks that scientists not personally identify with their past work or theories — even those that bear their names — so that science as a whole can inch ever closer to illuminating elusive truths. That compliance isn’t so easy. But those who champion the so-called replication revolution in psychological science believe that it is possible — with the right structural reforms and personal incentives.