Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and all sorts of rich media. The self-documenting aspects andthe ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourage poor coding practices, and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we studied 1.4 million notebooks from GitHub. We present a detailed analysis of their characteristics that impact reproducibility. We also propose a set of best practices that can improve the rate of reproducibility and discuss open challenges that require further research and development.
Transparency in reporting benefits scientific communication on many levels. While specific needs and expectations vary across fields, the effective use of research findings relies on the availability of core information about research materials, data, and analysis. In December 2017, a working group of journal editors and experts in reproducibility convened to create the “minimum standards” working group. This working group aims to devise a set of minimum expectations that journals could ask their authors to meet, and will draw from the collective experience of journals implementing a range of different approaches designed to enhance reporting and reproducibility (e.g. STAR Methods), existing life science checklists (e.g. the Nature Research reporting summary), and the results of recent meta-research studying the efficacy of such interventions (e.g. Macleod et al. 2017; Han et al. 2017).
Serious concerns about research quality have catalyzed a number of reform initiatives intended to improve transparency and reproducibility and thus facilitate self-correction, increase efficiency, and enhance research credibility. Meta-research has evaluated the merits of individual initiatives; however, this may not capture broader trends reflecting the cumulative contribution of these efforts. In this study, we evaluated a broad range of indicators related to transparency and reproducibility in a random sample of 198 articles published in the social sciences between 2014 and 2017. Few articles indicated availability of materials (15/96, 16% [95% confidence interval, 9% to 23%]), protocols (0/103), raw data (8/103, 8% [2% to 15%]), or analysis scripts (3/103, 3% [1% to 6%]), and no studies were pre-registered (0/103). Some articles explicitly disclosed funding sources (or lack of; 72/179, 40% [33% to 48%]) and some declared no conflicts of interest (32/179, 18% [13% to 24%]). Replication studies were rare (2/103, 2% [0% to 4%]). Few studies were included in evidence synthesis via systematic review (6/96, 6% [3% to 11%]) or meta-analysis (2/96, 2% [0% to 4%]). Slightly less than half the articles were publicly available (95/198, 48% [41% to 55%]). Minimal adoption of transparency and reproducibility-related research practices could be undermining the credibility and efficiency of social science research. The present study establishes a baseline that can be revisited in the future to assess progress.
Research that links brain structure with behavior needs more data, better analyses, and more intelligent approaches.
In 2004, a landmark study showed that an inexpensive medication to treat parasitic worms could improve health and school attendance for millions of children in many developing countries. Eleven years later, a headline in the Guardian reported that this treatment, deworming, had been "debunked."The pronouncement followed an effort to replicate and re-analyze the original study, as well as an update to a systematic review of the effects of deworming. This story made waves amidst discussion of a reproducibility crisis in some of the social sciences. This paper explores what it means to"replicate"and"reanalyze"a study, both in general and in the specific case of deworming. The paper reviews the broader replication efforts in economics, then examines the key findings of the original deworming paper in light of the "replication," "reanalysis," and "systematic review."The paper also discusses the nature of the link between this single paper's findings, other papers' findings, and any policy recommendations about deworming. This example provides a perspective on the ways replication and reanalysis work, the strengths and weaknesses of systematic reviews, and whether there is, in fact, a reproducibility crisis in economics.
By implementing more transparent research practices, authors have the opportunity to stand out and showcase work that is more reproducible, easier to build upon, and more credible. Scientists gain by making work easier to share and maintain within their own laboratories, and the scientific community gains by making underlying data or research materials more available for confirmation or making new discoveries. The following protocol gives authors step‐by‐step instructions for using the free and open source Open Science Framework (OSF) to create a data management plan, preregister their study, use version control, share data and other research materials, or post a preprint for quick and easy dissemination.