Automatic generation of provenance metadata during execution of scientific workflows

Data processing in data-intensive scientific fields like bioinformatics is automated to a great extent. Among other approaches, automation is achieved with workflow engines that execute an explicitly stated sequence of computations. Scientists can use these workflows through science gateways or develop them on their own. In both cases they may have to preprocess their raw data and may also want to further process the workflow output. The scientist has to take care of the provenance of the whole data processing pipeline. This is not a trivial task due to the diverse set of computational tools and environments used during the transformation of raw data into the final results. We therefore created a metadata schema to provide provenance for data processing pipelines and implemented a tool that creates this metadata during the execution of typical scientific computations.
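
The abstract does not describe the schema itself, so the following is only a minimal sketch of what automatically captured provenance metadata for a single pipeline step might look like. The field names (`command`, `inputs`, `outputs`, `started_at`, etc.) and the `record_provenance` helper are illustrative assumptions, not the authors' actual schema or tool.

```python
# Hypothetical sketch: capture provenance metadata while running one pipeline step.
# The schema fields and helper names are assumptions for illustration only.
import hashlib
import json
import platform
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def sha256(path: Path) -> str:
    """Checksum of a file, so inputs and outputs can be identified unambiguously."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_provenance(command: list[str], inputs: list[Path], outputs: list[Path]) -> dict:
    """Run a command and return a provenance record describing the step."""
    started = datetime.now(timezone.utc).isoformat()
    result = subprocess.run(command, capture_output=True, text=True)
    finished = datetime.now(timezone.utc).isoformat()
    return {
        "command": command,
        "return_code": result.returncode,
        "started_at": started,
        "finished_at": finished,
        "environment": {"platform": platform.platform(), "python": platform.python_version()},
        "inputs": [{"path": str(p), "sha256": sha256(p)} for p in inputs],
        "outputs": [{"path": str(p), "sha256": sha256(p)} for p in outputs if p.exists()],
    }


if __name__ == "__main__":
    # Hypothetical usage: wrap one preprocessing command and store its provenance.
    record = record_provenance(["sort", "reads.txt", "-o", "reads.sorted.txt"],
                               inputs=[Path("reads.txt")],
                               outputs=[Path("reads.sorted.txt")])
    Path("provenance.json").write_text(json.dumps(record, indent=2))
```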

Empirical examination of the replicability of associations between brain structure and psychological variables

Linking interindividual differences in psychological phenotype to variations in brain structure is an old dream for psychology and a crucial question for cognitive neurosciences. Yet, the replicability of previously reported ‘structural brain behavior’ (SBB) associations has recently been questioned. Here, we conducted an empirical investigation, assessing the replicability of SBB associations among healthy adults. For a wide range of psychological measures, the replicability of associations with gray matter volume was assessed. Our results revealed that, among healthy individuals, 1) finding an association between performance on standard psychological tests and brain morphology is relatively unlikely, 2) significant associations found using an exploratory approach have overestimated effect sizes, and 3) such associations can hardly be replicated in an independent sample. After considering factors such as sample size and comparing our findings with more replicable SBB associations in a clinical cohort and replicable associations between brain structure and non-psychological phenotypes, we discuss the potential causes and consequences of these findings.
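
The effect-size overestimation described in point 2 is the familiar selection effect: associations that cross a significance threshold in an exploratory sample are, on average, inflated relative to the true effect and therefore shrink in an independent replication. The toy simulation below illustrates only this general mechanism; it is not the study's analysis, and the true correlation, sample size, and number of tests are assumed parameters.

```python
# Toy simulation (not the study's analysis): associations selected for significance
# in an exploratory sample show inflated effect sizes and shrink on replication.
# All parameters below are assumptions chosen for illustration.
import numpy as np

rng = np.random.default_rng(0)
true_r, n, n_tests = 0.1, 200, 1000
crit = 1.96 / np.sqrt(n - 3)  # approximate Fisher-z threshold for p < .05

discovery, replication = [], []
for _ in range(n_tests):
    # Simulate a brain measure and a behavioral score with a weak true correlation
    cov = [[1.0, true_r], [true_r, 1.0]]
    x1, y1 = rng.multivariate_normal([0, 0], cov, size=n).T
    r1 = np.corrcoef(x1, y1)[0, 1]
    if abs(np.arctanh(r1)) > crit:          # "significant" in the exploratory sample
        x2, y2 = rng.multivariate_normal([0, 0], cov, size=n).T
        discovery.append(abs(r1))
        replication.append(abs(np.corrcoef(x2, y2)[0, 1]))

print(f"true |r| = {true_r}")
print(f"mean |r| among selected discoveries  : {np.mean(discovery):.3f}")
print(f"mean |r| in independent replications : {np.mean(replication):.3f}")
```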

Methodological Reporting Behavior, Sample Sizes, and Statistical Power in Studies of Event-Related Potentials: Barriers to Reproducibility and Replicability

Methodological reporting guidelines for studies of event-related potentials (ERPs) were updated in Psychophysiology in 2014. These guidelines facilitate the communication of key methodological parameters (e.g., preprocessing steps). Failing to report key parameters represents a barrier to replication efforts, and difficulty with replicability increases in the presence of small sample sizes and low statistical power. We assessed whether guidelines are followed and estimated the average sample size and power in recent research. Reporting behavior, sample sizes, and statistical designs were coded for 150 randomly sampled articles from five high-impact journals that frequently publish ERP studies from 2011 to 2017. An average of 63% of guidelines were reported, and reporting behavior was similar across journals, suggesting that gaps in reporting are a shortcoming of the field rather than of any specific journal. Publication of the guidelines paper had no impact on reporting behavior, suggesting that editors and peer reviewers are not enforcing these recommendations. The average sample size per group was 21. Statistical power was conservatively estimated as .72-.98 for a large effect size, .35-.73 for a medium effect, and .10-.18 for a small effect. These findings indicate that failing to report key guidelines is ubiquitous and that ERP studies are only powered to detect large effects. Such low power and insufficient following of reporting guidelines represent substantial barriers to replication efforts. The methodological transparency and replicability of studies can be improved by the open sharing of processing code and experimental tasks and by a priori sample size calculations to ensure adequately powered studies.
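
The reported power ranges depend on the specific statistical designs the authors coded. As a generic illustration only (not the paper's exact procedure), the sketch below computes power for an independent-samples t-test with the reported average group size of 21; the conventional effect sizes (d = 0.2, 0.5, 0.8) and two-sided α = .05 are assumptions, and the results fall near the lower end of the ranges quoted above.

```python
# Generic illustration (not the paper's exact power analysis): power of an
# independent-samples t-test with n = 21 per group at conventional effect sizes.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    power = analysis.power(effect_size=d, nobs1=21, alpha=0.05, alternative="two-sided")
    print(f"{label:6s} effect (d = {d}): power = {power:.2f}")

# A priori sample size needed per group for 80% power at a medium effect:
n_required = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05)
print(f"n per group for 80% power at d = 0.5: {n_required:.0f}")
```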

A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks

Jupyter Notebooks have been widely adopted by many different communities, both in science and industry. They support the creation of literate programming documents that combine code, text, and execution results with visualizations and all sorts of rich media. The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices, and makes their results hard to reproduce. To understand good and bad practices used in the development of real notebooks, we studied 1.4 million notebooks from GitHub. We present a detailed analysis of their characteristics that impact reproducibility. We also propose a set of best practices that can improve the rate of reproducibility and discuss open challenges that require further research and development.
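
Since an .ipynb file is plain JSON, simple reproducibility proxies can be checked directly from its metadata. The sketch below is a hypothetical illustration, not the study's analysis pipeline: it flags notebooks whose recorded execution counts are not a clean 1..N sequence, a common proxy for out-of-order or incomplete execution.

```python
# Hypothetical sketch (not the study's pipeline): check whether a notebook's code
# cells were executed top-to-bottom, using the execution counts stored in the JSON.
import json
import sys


def execution_counts(path: str) -> list[int]:
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    return [c["execution_count"] for c in nb.get("cells", [])
            if c.get("cell_type") == "code" and c.get("execution_count") is not None]


def executed_in_order(path: str) -> bool:
    """True if the executed code cells form an unambiguous 1..N top-to-bottom run."""
    counts = execution_counts(path)
    return counts == list(range(1, len(counts) + 1))


if __name__ == "__main__":
    for nb_path in sys.argv[1:]:
        status = "in order" if executed_in_order(nb_path) else "out of order / incomplete"
        print(nb_path, status)
```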

Towards minimum reporting standards for life scientists

Transparency in reporting benefits scientific communication on many levels. While specific needs and expectations vary across fields, the effective use of research findings relies on the availability of core information about research materials, data, and analysis. In December 2017, a working group of journal editors and experts in reproducibility convened to create the “minimum standards” working group. This working group aims to devise a set of minimum expectations that journals could ask their authors to meet, and will draw from the collective experience of journals implementing a range of different approaches designed to enhance reporting and reproducibility (e.g. STAR Methods), existing life science checklists (e.g. the Nature Research reporting summary), and the results of recent meta-research studying the efficacy of such interventions (e.g. Macleod et al. 2017; Han et al. 2017).

An empirical assessment of transparency and reproducibility-related research practices in the social sciences (2014-2017)

Serious concerns about research quality have catalyzed a number of reform initiatives intended to improve transparency and reproducibility and thus facilitate self-correction, increase efficiency, and enhance research credibility. Meta-research has evaluated the merits of individual initiatives; however, this may not capture broader trends reflecting the cumulative contribution of these efforts. In this study, we evaluated a broad range of indicators related to transparency and reproducibility in a random sample of 198 articles published in the social sciences between 2014 and 2017. Few articles indicated availability of materials (15/96, 16% [95% confidence interval, 9% to 23%]), protocols (0/103), raw data (8/103, 8% [2% to 15%]), or analysis scripts (3/103, 3% [1% to 6%]), and no studies were pre-registered (0/103). Some articles explicitly disclosed funding sources (or lack thereof; 72/179, 40% [33% to 48%]) and some declared no conflicts of interest (32/179, 18% [13% to 24%]). Replication studies were rare (2/103, 2% [0% to 4%]). Few studies were included in evidence synthesis via systematic review (6/96, 6% [3% to 11%]) or meta-analysis (2/96, 2% [0% to 4%]). Slightly less than half of the articles were publicly available (95/198, 48% [41% to 55%]). Minimal adoption of transparency and reproducibility-related research practices could be undermining the credibility and efficiency of social science research. The present study establishes a baseline that can be revisited in the future to assess progress.
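
The abstract does not state which interval method underlies the bracketed 95% confidence intervals. As an assumption for illustration, the sketch below computes Wilson score intervals for a few of the reported counts; this yields intervals of roughly the reported width, though not necessarily the authors' exact figures.

```python
# Illustrative only: the abstract does not name the CI method, so this sketch
# assumes Wilson score intervals for proportions such as 15/96 or 95/198.
from statsmodels.stats.proportion import proportion_confint

indicators = {
    "materials availability": (15, 96),
    "raw data availability": (8, 103),
    "analysis scripts": (3, 103),
    "publicly available article": (95, 198),
}

for name, (count, nobs) in indicators.items():
    low, high = proportion_confint(count, nobs, alpha=0.05, method="wilson")
    print(f"{name:28s} {count:3d}/{nobs:<3d} = {count / nobs:5.1%}  "
          f"95% CI [{low:.1%}, {high:.1%}]")
```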