Structuring supplemental materials in support of reproducibility

Supplements are increasingly important to the scientific record, particularly in genomics. However, they are often underutilized. Optimally, supplements should make results findable, accessible, interoperable, and reusable (i.e., “FAIR”). Moreover, properly off-loading to them the data and detail in a paper could make the main text more readable. We propose a hierarchical organization for supplements, with some parts paralleling and “shadowing” the main text and other elements branching off from it, and we suggest a specific formatting to make this structure explicit. Furthermore, sections of the supplement could be presented in multiple scientific “dialects”, including machine-readable and lay-friendly formats.

Reproducible research in the Python ecosystem: a reality check

In summary, my little experiment has shown that reproducibility of Python scripts requires preserving the original environment, which fortunately is not so difficult over a time span of four years, at least if everything you need is part of the Anaconda distribution. I am not sure I would have had the patience to reinstall everything from source, given an earlier bad experience. The purely computational part of my code was even surprisingly robust under updates in its dependencies. But the plotting code wasn’t, as matplotlib has introduced backwards-incompatible changes in a widely used function. Clearly the matplotlib team prepared this carefully, introducing a deprecation warning before introducing the breaking change. For properly maintained client code, this can probably be dealt with.
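One way to make "preserving the original environment" concrete is to record the interpreter and package versions alongside the results. The following is a minimal sketch, not the author's procedure: the function name `snapshot_environment` and the package names in the usage line are illustrative assumptions, and it relies only on the standard library's `importlib.metadata`.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment(packages):
    """Describe the interpreter and the installed versions of the given packages.

    `snapshot_environment` is a hypothetical helper name used for illustration.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "platform": sys.platform,
        "packages": versions,
    }


if __name__ == "__main__":
    # The package list is an assumption; list whatever the script imports.
    snapshot = snapshot_environment(["numpy", "matplotlib"])
    print(json.dumps(snapshot, indent=2, sort_keys=True))
```

Storing this snapshot next to the output files gives a later reader the version numbers needed to rebuild a matching environment, although it does not by itself reinstall anything.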

AI buzzwords explained: scientific workflows

The reproducibility of scientific experiments is crucial for corroborating, consolidating and reusing new scientific discoveries. However, the constant pressure to publish results (Fanelli, 2010) has removed reproducibility from the agenda of many researchers: in a recent survey published in Nature (with more than 1500 scientists), over 70% of the participants admitted to having failed to reproduce the work of another colleague at some point (Baker, 2016). Analyses from psychology and cancer biology show reproducibility rates below 40% and 10% respectively (Collaboration, 2015) (Begley & Lee, 2012). As a consequence, retractions of publications have occurred in recent years in several disciplines (Marcus & Oransky, 2014) (Rockoff, 2015), and the general public is now skeptical about scientific studies on topics like pesticides, depression drugs or flu pandemics (American, 2010).

The role of the IACUC in ensuring research reproducibility

There is a "village" of people impacting research reproducibility, such as funding panels, the IACUC and its support staff, institutional leaders, investigators, veterinarians, animal facilities, and professional journals. IACUCs can contribute to research reproducibility by ensuring that reviews of animal use requests, program self-assessments and post-approval monitoring programs are sufficiently thorough, the animal model is appropriate for testing the hypothesis, animal care and use is conducted in a manner that is compliant with external and institutional requirements, and extraneous variables are minimized. The persons comprising the village also must have a shared vision that guards against reproducibility problems while avoiding being viewed as a burden to research. This review analyzes and discusses aspects of the IACUC's "must do" and "can do" activities that impact the ability of a study to be reproduced. We believe that the IACUC, with support from and when working synergistically with other entities in the village, can contribute to minimizing unintended research variables and strengthening research reproducibility.

A very simple, re-executable neuroimaging publication

Reproducible research is a key element of the scientific process. Re-executability of neuroimaging workflows that lead to the conclusions arrived at in the literature has not yet been sufficiently addressed and adopted by the neuroimaging community. In this paper, we document a set of procedures, which include supplemental additions to a manuscript, that unambiguously define the data, workflow, execution environment and results of a neuroimaging analysis, in order to generate a verifiable re-executable publication. Re-executability provides a starting point for examination of the generalizability and reproducibility of a given finding.

Federating heterogeneous datasets to enhance data sharing and experiment reproducibility

Recent studies have demonstrated the difficulty of replicating scientific findings and/or experiments published in the past.1 The effects seen in the replicated experiments were smaller than previously reported. Some of the explanations for these findings include the complexity of the experimental design and the pressure on researchers to report positive findings. The International Committee of Medical Journal Editors (ICMJE) suggests that every study considered for publication must submit a plan to share the de-identified patient data no later than 6 months after publication. There is a growing demand to enhance the management of clinical data, facilitate data sharing across institutions and keep track of the data from previous experiments. The ultimate goal is to assure the reproducibility of experiments in the future. This paper describes Shiny-tooth, a web-based application created to improve clinical data acquisition during the clinical trial and to federate such data with morphological data derived from medical images. Currently, this application is being used to store clinical data from an osteoarthritis (OA) study. This work is submitted to the SPIE Biomedical Applications in Molecular, Structural, and Functional Imaging conference.

Reproducibility of computational workflows is automated using continuous analysis

Replication, validation and extension of experiments are crucial for scientific progress. Computational experiments are scriptable and should be easy to reproduce. However, computational analyses are designed and run in a specific computing environment, which may be difficult or impossible to match using written instructions. We report the development of continuous analysis, a workflow that enables reproducible computational analyses. Continuous analysis combines Docker, a container technology akin to virtual machines, with continuous integration, a software development technique, to automatically rerun a computational analysis whenever updates or improvements are made to source code or data. This enables researchers to reproduce results without contacting the study authors. Continuous analysis allows reviewers, editors or readers to verify reproducibility without manually downloading and rerunning code and can provide an audit trail for analyses of data that cannot be shared.
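The "rerun whenever source code or data change" idea can be illustrated with a small change-detection step. This is a minimal sketch and not the authors' system, which combines Docker with a continuous integration service; the names `fingerprint` and `check_and_record`, and the stamp-file convention, are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def fingerprint(paths):
    """Return a SHA-256 digest over the names and contents of the given files."""
    digest = hashlib.sha256()
    # Sort so the digest does not depend on the order the paths were passed in.
    for path in sorted(Path(p) for p in paths):
        digest.update(str(path).encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def check_and_record(paths, stamp_file):
    """Return True if the inputs changed since the last recorded fingerprint.

    The current fingerprint is written to `stamp_file` (a hypothetical
    bookkeeping file), so the next call with unchanged inputs returns False.
    """
    stamp = Path(stamp_file)
    current = fingerprint(paths)
    if stamp.exists() and stamp.read_text().strip() == current:
        return False
    stamp.write_text(current)
    return True
```

A continuous integration service typically provides this trigger itself on every commit; the sketch only shows the underlying logic, with the actual rerun (for example, inside a container) left to whatever the analysis requires.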

Reproducibility and Practical Adoption of GEOBIA with Open-Source Software in Docker Containers

Geographic Object-Based Image Analysis (GEOBIA) mostly uses proprietary software, but the interest in Free and Open-Source Software (FOSS) for GEOBIA is growing. This interest stems not only from cost savings, but also from benefits concerning reproducibility and collaboration. Technical challenges hamper practical reproducibility, especially when multiple software packages are required to conduct an analysis. In this study, we use containerization to package a GEOBIA workflow in a well-defined FOSS environment. We explore the approach using two software stacks to perform an exemplary analysis detecting destruction of buildings in bi-temporal images of a conflict area. The analysis combines feature extraction techniques with segmentation and object-based analysis to detect changes using automatically-defined local reference values and to distinguish disappeared buildings from non-target structures. The resulting workflow is published as FOSS comprising both the model and data in a ready-to-use Docker image and a user interface for interaction with the containerized workflow. The presented solution advances GEOBIA in the following aspects: higher transparency of methodology; easier reuse and adaptation of workflows; better transferability between operating systems; complete description of the software environment; and easy application of workflows by image analysis experts and non-experts. As a result, it promotes not only the reproducibility of GEOBIA, but also its practical adoption.

Reproducibility in biomarker research and clinical development: a global challenge

According to a recent survey conducted by the journal Nature, a large percentage of scientists agree that we live in times of irreproducible research results [1]. They believe that much of what is published just cannot be trusted. While the results of the survey may be biased toward respondents with an interest in the area of reproducibility, a concern is recognizable. Goodman et al. distinguish between different aspects of reproducibility and dissect the term into ‘material reproducibility’ (provision of sufficient information to enable repetition of the procedures), ‘results reproducibility’ (obtaining the same results from an independent study; formerly termed ‘replicability’) and ‘inferential reproducibility’ (drawing the same conclusions from separate studies) [2]. The validity of data is threatened by many issues, among them poor utility of public information, poor protocols and design, lack of standard analytical and clinical practices and knowledge, conflicts of interest and other biases, as well as publication strategy.