In recent years, biomedical research has faced increased scrutiny over the reproducibility and quality of scientific findings (1-3). In response, funding institutions and journals have implemented top-down policies for grant and manuscript review. While a positive step forward, the long-term merit of these policies is questionable given their emphasis on completing a checklist of items rather than fundamentally reassessing how scientific investigation is conducted. Moreover, the top-down style of management used to institute these policies is arguably ineffective in engaging the scientific workforce to act on these issues. To meet current and future biomedical needs, new investigative methods are warranted that emphasize collective thinking, teamwork, and shared knowledge, and that cultivate change from the bottom up. Here, a perspective on a new approach to biomedical investigation within the individual laboratory that emphasizes collaboration and quality is discussed.
Results from cognitive neuroscience have been cited as evidence in courtrooms around the world, and their admissibility has been a challenge for the legal system. Unfortunately, the recent reproducibility crisis in cognitive neuroscience, which has shown that published studies in the field may not be as trustworthy as expected, has made the situation worse. Here we analyse how irreproducible results in the cognitive neuroscience literature could compromise the standards for admissibility of scientific evidence, and point out how the open science movement may help to alleviate these problems. We conclude that open science benefits not only the scientific community but also the legal system and society in a broad sense. Therefore, we suggest that both scientists and practitioners follow open science recommendations and uphold the best available standards in order to serve as good gatekeepers in their own fields. Moreover, scientists and practitioners should collaborate closely to maintain the effective functioning of the entire gatekeeping system of the law.
Reproducibility of modeling is a problem for any machine learning practitioner, whether in industry or academia. The consequences of an irreproducible model can include significant financial costs, lost time, and even loss of personal reputation (if results cannot be replicated). This paper first discusses the problems we have encountered while building a variety of machine learning models, and then describes the framework we built to tackle the problem of model reproducibility. The framework comprises four main components (data, feature, scoring, and evaluation layers), which are themselves composed of well-defined transformations. This enables us not only to replicate a model exactly, but also to reuse the transformations across different models. As a result, the platform has dramatically increased the speed of both offline and online experimentation while also ensuring model reproducibility.
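As a rough illustration of the layered design described in this abstract, the following is a minimal sketch and not the authors' framework. The class names (`Transformation`, `Layer`), the helper `run_pipeline`, the toy data, and the specific steps are all hypothetical, chosen only to show how four layers built from deterministic transformations can be replayed exactly and partly reused across models.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class Transformation:
    """A named, deterministic step (hypothetical type, for illustration)."""
    name: str
    fn: Callable[[Any], Any]

    def __call__(self, value: Any) -> Any:
        return self.fn(value)


@dataclass(frozen=True)
class Layer:
    """An ordered group of transformations, e.g. the 'feature' layer."""
    name: str
    steps: Sequence[Transformation]

    def run(self, value: Any) -> Any:
        for step in self.steps:
            value = step(value)
        return value


def run_pipeline(layers: Sequence[Layer], raw: Any) -> Any:
    """Apply the layers in order; the same layers and input give the same output."""
    value = raw
    for layer in layers:
        value = layer.run(value)
    return value


# Toy transformations. Each returns new objects instead of mutating its input.
drop_missing = Transformation(
    "drop_missing",
    lambda rows: [r for r in rows if all(v is not None for v in r.values())],
)
add_ratio = Transformation(
    "add_ratio",
    lambda rows: [{**r, "ratio": r["x"] / r["y"]} for r in rows],
)
score_scaled = Transformation(
    "score_scaled",
    lambda rows: [{**r, "score": 2.0 * r["ratio"]} for r in rows],
)
score_shifted = Transformation(
    "score_shifted",
    lambda rows: [{**r, "score": r["ratio"] + 1.0} for r in rows],
)
mean_abs_error = Transformation(
    "mean_abs_error",
    lambda rows: sum(abs(r["score"] - r["label"]) for r in rows) / len(rows),
)

# The four layers named in the abstract: data, feature, scoring, evaluation.
data_layer = Layer("data", [drop_missing])
feature_layer = Layer("feature", [add_ratio])
evaluation_layer = Layer("evaluation", [mean_abs_error])

# Two models that share every layer except scoring.
model_a = [data_layer, feature_layer, Layer("scoring", [score_scaled]), evaluation_layer]
model_b = [data_layer, feature_layer, Layer("scoring", [score_shifted]), evaluation_layer]

ROWS = [
    {"x": 1.0, "y": 2.0, "label": 1.0},
    {"x": None, "y": 1.0, "label": 0.0},
    {"x": 3.0, "y": 1.0, "label": 5.0},
]

if __name__ == "__main__":
    print(run_pipeline(model_a, ROWS))
    print(run_pipeline(model_b, ROWS))
```

Because every step is a pure function of its input, rerunning `model_a` on the same rows reproduces the same evaluation value, and `model_b` reuses the data and feature layers unchanged. A production system would presumably also need to version the data and the transformation code, which this sketch does not attempt.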
An increasing number of studies, surveys, and editorials highlight experimental and computational reproducibility and replication issues that appear to pervade most areas of modern science. This perspective examines some of the multiple and complex causes of what has been called a "reproducibility crisis," which can impact materials, interface/(bio)interphase, and vacuum sciences. Reproducibility issues are not new to science, but they are now appearing in new forms that require innovative solutions. Drivers include the increasingly multidisciplinary, multimethod nature of much advanced science, the increased complexity of the problems and systems being addressed, and the large amounts and multiple types of experimental and computational data being collected and analyzed in many studies. Sustained efforts are needed to address the causes of reproducibility problems that can hinder the rate of scientific progress and lower public and political regard for science. The initial efforts of the American Vacuum Society to raise awareness of a new generation of reproducibility challenges and provide tools to help address them serve as examples of mitigating actions that can be undertaken.
Faculty members and graduate students at the University of Minnesota have formed a workshop to hold discussions about reproducibility in research studies. The discussions come during a national movement to replicate research in social science fields, such as psychology. The movement has shown many previous studies are not reliable. After discussions last spring regarding ways the University can address these research practices, the Minnesota Center for Philosophy of Science designed workshops for faculty and students to discuss ways to develop replicable research methods.
The drive for reproducibility in the computational sciences has provoked discussion and effort across a broad range of perspectives: technological, legislative/policy, education, and publishing. Discussion on these topics is not new, but the need to adopt standards for reproducibility of claims made based on computational results is now clear to researchers, publishers and policymakers alike. Many technologies exist to support and promote reproduction of computational results: containerisation tools like Docker; literate programming approaches such as Sweave, knitr, and IPython; and cloud environments like Amazon Web Services. But these technologies are tied to specific programming languages (e.g. Sweave/knitr to R; IPython to Python) or to platforms (e.g. Docker for 64-bit Linux environments only). To date, no single approach is able to span the broad range of technologies and platforms represented in computational biology and biotechnology. To enable reproducibility across computational biology, we demonstrate an approach and provide a set of tools that is suitable for all computational work and is not tied to a particular programming language or platform. We present published examples from a series of papers in different areas of computational biology, spanning the major languages and technologies in the field (Python/R/MATLAB/Fortran/C/Java). Our approach produces a transparent and flexible process for replication and recomputation of results. Ultimately, its most valuable aspect is the decoupling of methods in computational biology from their implementation. Separating the 'how' (method) of a publication from the 'where' (implementation) promotes genuinely open science and benefits the scientific community as a whole.