Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficultly with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly-sampled articles from five high-impact journals that frequently publish ERP studies from 2011 to 2017. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting is a shortcoming of the field rather than any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failing to report key guidelines is ubiquitous and that ERP studies are only powered to detect large effects. Such low power and insufficient following of reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and all sorts of rich media. The self-documenting aspects andthe ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourage poor coding practices, and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we studied 1.4 million notebooks from GitHub. We present a detailed analysis of their characteristics that impact reproducibility. We also propose a set of best practices that can improve the rate of reproducibility and discuss open challenges that require further research and development.
Transparency in reporting benefits scientific communication on many levels. While specific needs and expectations vary across fields, the effective use of research findings relies on the availability of core information about research materials, data, and analysis. In December 2017, a working group of journal editors and experts in reproducibility convened to create the “minimum standards” working group. This working group aims to devise a set of minimum expectations that journals could ask their authors to meet, and will draw from the collective experience of journals implementing a range of different approaches designed to enhance reporting and reproducibility (e.g. STAR Methods), existing life science checklists (e.g. the Nature Research reporting summary), and the results of recent meta-research studying the efficacy of such interventions (e.g. Macleod et al. 2017; Han et al. 2017).
Serious concerns about research quality have catalyzed a number of reform initiatives intended to improve transparency and reproducibility and thus facilitate self-correction, increase efficiency, and enhance research credibility. Meta-research has evaluated the merits of individual initiatives; however, this may not capture broader trends reflecting the cumulative contribution of these efforts. In this study, we evaluated a broad range of indicators related to transparency and reproducibility in a random sample of 198 articles published in the social sciences between 2014 and 2017. Few articles indicated availability of materials (15/96, 16% [95% confidence interval, 9% to 23%]), protocols (0/103), raw data (8/103, 8% [2% to 15%]), or analysis scripts (3/103, 3% [1% to 6%]), and no studies were pre-registered (0/103). Some articles explicitly disclosed funding sources (or lack of; 72/179, 40% [33% to 48%]) and some declared no conflicts of interest (32/179, 18% [13% to 24%]). Replication studies were rare (2/103, 2% [0% to 4%]). Few studies were included in evidence synthesis via systematic review (6/96, 6% [3% to 11%]) or meta-analysis (2/96, 2% [0% to 4%]). Slightly less than half the articles were publicly available (95/198, 48% [41% to 55%]). Minimal adoption of transparency and reproducibility-related research practices could be undermining the credibility and efficiency of social science research. The present study establishes a baseline that can be revisited in the future to assess progress.
Research that links brain structure with behavior needs more data, better analyses, and more intelligent approaches.
In 2004, a landmark study showed that an inexpensive medication to treat parasitic worms could improve health and school attendance for millions of children in many developing countries. Eleven years later, a headline in the Guardian reported that this treatment, deworming, had been "debunked."The pronouncement followed an effort to replicate and re-analyze the original study, as well as an update to a systematic review of the effects of deworming. This story made waves amidst discussion of a reproducibility crisis in some of the social sciences. This paper explores what it means to"replicate"and"reanalyze"a study, both in general and in the specific case of deworming. The paper reviews the broader replication efforts in economics, then examines the key findings of the original deworming paper in light of the "replication," "reanalysis," and "systematic review."The paper also discusses the nature of the link between this single paper's findings, other papers' findings, and any policy recommendations about deworming. This example provides a perspective on the ways replication and reanalysis work, the strengths and weaknesses of systematic reviews, and whether there is, in fact, a reproducibility crisis in economics.