Dealing with the reproducibility crisis: what can ECRs do about it?

Unless you’ve been living under a rock (no judgment, by the way), I’m sure you’ve heard about the reproducibility crisis in scientific research. In 2016, two posts on this blog covered the main causes of irreproducibility and what can be done about them, and how we can reform scientific publishing to value integrity. To briefly recap, a study published in PLOS Biology noted that half of preclinical research is not reproducible. The estimated price tag on this irreproducibility is alarming: a whopping $28 billion. In my opinion, however, the most troubling cost of this crisis is its impact on public trust in science.

Computational Reproducibility at Exascale 2017 (CRE2017)

Reproducibility is an important concern in all areas of computation. As such, computational reproducibility is receiving increasing interest from a variety of parties, each concerned with different aspects of it. It encompasses several concerns, including the sharing of code and data, as well as reproducible numerical results, which may depend on the operating system, tools, levels of parallelism, and numerical effects. In addition, the publication of reproducible computational results raises a host of further concerns that stem from the fundamental notion of reproducibility of scientific results, a notion that has normally been restricted to experimental science. This workshop combines the Numerical Reproducibility at Exascale Workshops (conducted in 2015 and 2016 at SC) and the panel on Reproducibility held at SC16 (originally a BOF at SC15) to address several different issues in reproducibility that arise when computing at exascale. The workshop will cover issues of numerical reproducibility as well as approaches and best practices for sharing and running code.
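To make the dependence of numerical results on the level of parallelism concrete, here is a minimal Python sketch. It is an illustration only and not taken from the workshop materials: the helper names `naive_sum` and `chunked_sum` and the four input values are assumptions chosen for the example. It mimics a parallel reduction by summing the data in chunks and then combining the partial sums. Because floating-point addition is not associative, changing the number of chunks can change the result.

```python
import math


def naive_sum(values):
    """Left-to-right floating-point sum (no compensation)."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Hypothetical stand-in for a parallel reduction:
    sum each chunk separately, then combine the partial sums."""
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# Illustrative data: the exact sum is 2.0, but 1.0 is below the
# rounding resolution of doubles near 1e16.
values = [1e16, 1.0, -1e16, 1.0]

serial = chunked_sum(values, 1)    # one "worker"
two_way = chunked_sum(values, 2)   # two "workers"
exact = math.fsum(values)          # correctly rounded reference

print(serial, two_way, exact)
```

On IEEE 754 double-precision arithmetic the one-chunk and two-chunk runs disagree with each other, and neither matches the correctly rounded reference from `math.fsum`. Real exascale reductions are far more complex, but the underlying effect is the same.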

Estimating the Reproducibility of Experimental Philosophy

For scientific theories grounded in empirical data, replicability is a core principle, for at least two reasons. First, unless we are willing to let scientific theories rest on the authority of a small number of researchers, empirical studies should be replicable, in the sense that their methods and procedures should be detailed enough for someone else to conduct the same study. Second, for empirical results to provide a solid foundation for scientific theorizing, they should also be replicable, in the sense that most attempts at replicating the original study that produced them would yield similar results. The XPhi Replicability Project is primarily concerned with replicability in the second sense, that is, the replicability of results. In the past year, several projects have cast doubt on the replicability of key findings in psychology, most notably in social psychology. Because the methods of experimental philosophy have often been close to the ones used in social psychology, it is only natural to wonder to what extent the results on which experimental philosophers ground their theories are replicable. The aim of the XPhi Replicability Project is precisely to reach a reliable estimate of the replicability of empirical results in experimental philosophy. To this end, several research teams across the world will replicate around 40 studies in experimental philosophy, some among the most cited, others drawn at random. The results of the project will be published in a special issue of the Review of Philosophy and Psychology dedicated to the topic of replicability in cognitive science.

The state of reproducibility in the computational geosciences

Figures are essential outputs of computational geoscientific research, e.g. maps and time series showing the results of spatiotemporal analyses. They also play a key role in open reproducible research, where public access is provided to paper, data, and source code to enable reproduction of the reported results. This scientific ideal is rarely practiced, as studies, e.g. in biology, have shown. In this article, we report on a series of studies to evaluate open reproducible research in the geosciences from the perspectives of authors and readers. First, we asked geoscientists what they understand by open reproducible research and what hinders its realisation. We found that there is disagreement amongst authors, and that a lack of openness impedes its adoption by authors and readers alike. However, reproducible research also includes the ability to achieve the same results, which requires source code that is not only accessible but also executable. Hence, to further examine the reader’s perspective, we searched for open access papers from the geosciences that have code/data attached (in R) and executed the analysis. We encountered several technical issues while executing the code and found differences between the original and reproduced figures. Based on these findings, we propose guidelines for authors to address these issues.

Developer Interaction Traces backed by IDE Screen Recordings from Think aloud Sessions

There are two well-known difficulties in testing and interpreting methodologies for mining developer interaction traces: first, the lack of sufficiently large datasets, which mining or machine learning approaches need to provide reliable results; and second, the lack of "ground truth" or empirical evidence that can be used to triangulate the results, or to verify their accuracy and correctness. Moreover, relying solely on interaction traces limits our ability to take into account contextual factors that can affect the applicability of mining techniques in other contexts, and hinders our ability to fully understand the mechanics behind observed phenomena. The data presented in this paper attempts to alleviate these challenges by providing 600+ hours of developer interaction traces, of which 26+ hours are backed with video recordings of the IDE screen and the developer’s comments. This data set is relevant to researchers interested in investigating program comprehension, and to those who are developing techniques for interaction trace analysis and mining.