Toward Reusable Science with Readable Code and Reproducibility

An essential part of research and scientific communication is researchers' ability to reproduce the results of others. While there have been increasing standards for authors to make data and code available, many of these files are hard to re-execute in practice, leading to a lack of research reproducibility. This poses a major problem for students and researchers in the same field, who cannot build on the previously published findings for study or further inquiry. To address this, we propose an open-source platform named RE3 that helps improve the reproducibility and readability of research projects involving R code. Our platform assesses code readability with a machine learning model trained on a code readability survey and provides an automatic containerization service that executes code files and warns users of reproducibility errors. This process helps ensure the reproducibility and readability of projects and thereby fast-tracks their verification and reuse.
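
As a rough illustration of the containerization step described above (not the RE3 implementation itself), the Python sketch below re-executes a hypothetical R entry point, analysis.R, inside a clean rocker/r-base container and flags a non-zero exit code as a potential reproducibility error. It assumes Docker is installed on the host.

# Minimal sketch (not the RE3 implementation): re-execute an R script inside a
# container and treat a non-zero exit code as a potential reproducibility error.
# Assumes Docker is installed; "analysis.R" is a hypothetical entry-point script.
import subprocess
from pathlib import Path

def check_reproducibility(project_dir: str, entrypoint: str = "analysis.R") -> bool:
    """Run the project's R entry point in a clean rocker/r-base container."""
    project = Path(project_dir).resolve()
    cmd = [
        "docker", "run", "--rm",
        "-v", f"{project}:/project",   # mount the project into the container
        "-w", "/project",              # run from the project root
        "rocker/r-base",               # public base image with R preinstalled
        "Rscript", entrypoint,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print("Reproducibility warning:\n", result.stderr)
        return False
    return True

if __name__ == "__main__":
    check_reproducibility("./my_r_project")   # hypothetical project directory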

The Effect of Replications on Citation Patterns: Evidence From a Large-Scale Reproducibility Project

Replication of existing research is often referred to as one of the cornerstones of modern science. In this study, I tested whether the publication of independent replication attempts affects the citation patterns of the original studies. Investigating 95 replications conducted in the context of the Reproducibility Project: Psychology, I found little evidence for an adjustment of citation patterns in response to the publication of these independent replication attempts. This finding was robust to the choice of replication criterion, various model specifications, and the composition of the contrast group. I further present some suggestive evidence that shifts in the underlying composition of supporting and disputing citations have likely been small. I conclude with a review of the evidence in favor of the remaining explanations and discuss the potential consequences of these findings for the workings of the scientific process.
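
For readers unfamiliar with this kind of analysis, the sketch below shows a generic difference-in-differences style comparison of yearly citation counts for replicated originals versus a contrast group, before and after publication of the replication. It is illustrative only: the input file and column names are hypothetical, and it is not the specification used in the study.

# Illustrative only (not the paper's model): a generic difference-in-differences
# style comparison of yearly citation counts before vs. after a replication is
# published, for replicated originals vs. a contrast group.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical columns: study_id, year, citations, replicated (0/1),
# post (0/1 after the replication's publication year)
df = pd.read_csv("citations_panel.csv")  # hypothetical input file

model = smf.ols("citations ~ replicated * post + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["study_id"]}
)
print(model.summary())  # the interaction term captures the post-replication shift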

Quantifying Reproducibility in NLP and ML

Reproducibility has become an intensely debated topic in NLP and ML over recent years, but no commonly accepted way of assessing reproducibility, let alone quantifying it, has so far emerged. The assumption has been that wider scientific reproducibility terminology and definitions are not applicable to NLP/ML, with the result that many different terms and definitions have been proposed, some diametrically opposed. In this paper, we test this assumption, by taking the standard terminology and definitions from metrology and applying them directly to NLP/ML. We find that we are able to straightforwardly derive a practical framework for assessing reproducibility which has the desirable property of yielding a quantified degree of reproducibility that is comparable across different reproduction studies.
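
One precision measure from metrology that yields such a quantified degree of reproducibility is the coefficient of variation across repeated measurements of the same evaluation score (for example, the same metric reported by an original study and its reproductions). The sketch below is a minimal illustration; the small-sample correction shown is an assumed form, not necessarily the exact formulation adopted in the paper.

# Minimal sketch of a metrology-style precision measure: the coefficient of
# variation across repeated measurements of the same score (e.g. BLEU from
# several reproduction runs). The small-sample correction is an assumed form.
import statistics

def coefficient_of_variation(scores: list[float]) -> float:
    """Return the percentage coefficient of variation with a small-sample correction."""
    n = len(scores)
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)   # sample standard deviation (ddof=1)
    cv = 100.0 * stdev / abs(mean)     # unitless, expressed as a percentage
    return (1 + 1 / (4 * n)) * cv      # correction for small n (assumed form)

# Example: the same metric reported by an original study and two reproductions
print(coefficient_of_variation([27.3, 26.8, 27.9]))  # smaller = more reproducible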

EnosLib: A Library for Experiment-Driven Research in Distributed Computing

Despite the importance of experiment-driven research in the distributed computing community, there has been little progress in helping researchers conduct their experiments. In most cases, they have to perform tedious and time-consuming development and instrumentation work to deal with the specifics of testbeds and the system under study. To relieve researchers of this burden, we have developed ENOSLIB: a Python library that takes into account best experimentation practices and leverages modern toolkits for automatic deployment and configuration. ENOSLIB helps researchers not only in developing their experimental artifacts, but also in running them over different infrastructures. To demonstrate the relevance of our library, we discuss three experimental engines built on top of ENOSLIB and used to conduct empirical studies on complex software stacks (database systems, communication buses, and OpenStack) between 2016 and 2019. By introducing ENOSLIB, our goal is to gather academic and industrial actors of our community around a library that aggregates everyday experiment-driven research operations. The library has already been adopted by open-source projects and members of the scientific community thanks to its ease of use and extensibility.
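
The self-contained Python sketch below illustrates the provision/deploy/run/collect/release pattern that a library such as ENOSLIB automates across testbeds. All helpers are hypothetical stand-ins that merely simulate the steps; this is not the ENOSLIB API.

# Generic, self-contained sketch of the experiment workflow a library like
# ENOSLIB automates: acquire resources, deploy, run the workload, collect
# results, release resources. All helpers are hypothetical placeholders.
from contextlib import contextmanager

def acquire_nodes(n):
    # Placeholder: a real provider would reserve machines on a testbed.
    return [f"node-{i}" for i in range(n)]

def release_nodes(nodes):
    # Placeholder: release the reservation.
    print(f"released {len(nodes)} nodes")

@contextmanager
def testbed(n_nodes):
    nodes = acquire_nodes(n_nodes)
    try:
        yield nodes
    finally:
        release_nodes(nodes)   # always free resources, even on failure

def run_experiment(n_nodes=3):
    with testbed(n_nodes) as nodes:
        # Placeholder deployment and workload; a real engine would configure
        # the software stack (database, message bus, OpenStack, ...) and drive it.
        results = {node: {"latency_ms": 1.0} for node in nodes}
    return results

if __name__ == "__main__":
    print(run_experiment())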

A new method for testing reproducibility in systematic reviews was developed, but needs more testing

We have developed an approach to test reproducibility retrospectively while focusing on the whole conduct of a systematic review (SR) instead of on its individual steps. We replicated the literature searches and drew a 25% random sample, followed by study selection, data extraction, and risk of bias (RoB) assessments performed by two reviewers independently. These results were compared narratively with the original review. We were not able to fully reproduce the original search, resulting in minor differences in the number of citations retrieved. The biggest disagreements were found in study selection. The most difficult section to reproduce was the RoB assessment, owing to the lack of clearly reported criteria supporting the RoB ratings, although agreement was still found to be satisfactory. Our approach, as well as other approaches, needs to undergo testing and comparison in the future, as the area of testing the reproducibility of SRs is still in its infancy.
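
As an illustration of two of these steps, the sketch below draws a 25% random sample of the retrieved citations and quantifies inter-reviewer agreement on study selection with Cohen's kappa. File and column names are hypothetical, and the original work also compared results narratively.

# Illustrative sketch: draw a 25% sample of retrieved citations and compute
# inter-reviewer agreement on screening decisions. File/column names are hypothetical.
import pandas as pd
from sklearn.metrics import cohen_kappa_score

records = pd.read_csv("retrieved_citations.csv")        # hypothetical export from the search
sample = records.sample(frac=0.25, random_state=42)     # 25% random sample, reproducible seed
sample.to_csv("screening_sample.csv", index=False)

# After independent screening by two reviewers (include=1 / exclude=0):
decisions = pd.read_csv("screening_decisions.csv")      # hypothetical: reviewer_1, reviewer_2
kappa = cohen_kappa_score(decisions["reviewer_1"], decisions["reviewer_2"])
print(f"Cohen's kappa for study selection: {kappa:.2f}")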

Advancing Reproducibility in Environmental Modeling: Integration of Open Repositories, Process Containerizations, and Seamless Workflows

There is growing acknowledgment and awareness of the reproducibility challenge facing computational environmental modeling. To overcome this challenge, data sharing using open, online repositories that meet the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles is recognized as a minimum standard for reproducing computational research. Even with these data sharing guidelines and well-documented workflows, it remains challenging to reproduce computational models due to complexities like inconsistent computational environments or difficulties in dealing with large datasets that prevent seamless, end-to-end modeling. Containerization technologies have been put forward as a means for addressing these problems by encapsulating computational environments, yet domain science researchers are often unclear about which containerization approach and technology are best for achieving a given modeling objective. Thus, to meet FAIR principles, researchers need clear guidelines for encapsulating seamless modeling workflows, especially for environmental modeling use cases that require large datasets. Toward these aims, this dissertation presents three studies to address current limitations of reproducibility in environmental modeling. The first study presents a framework for integrating three key components to improve reproducibility within modern computational environmental modeling: 1) online repositories for data and model sharing, 2) computational environments along with containerization technology and Jupyter notebooks for capturing reproducible modeling workflows, and 3) Application Programming Interfaces (APIs) for intuitive programmatic control of simulation models. The second study focuses on approaches for containerizing computational processes and suggests best practices and guidance for which approach is most appropriate to achieve specific modeling objectives when simulating environmental systems. The third study focuses on open and reproducible seamless environmental modeling workflows, especially when creating and sharing interoperable and reusable large-extent spatial datasets as model input. Key research contributions across these three studies are as follows. 1) Integration of online repositories for data and model sharing, computational environments along with containerization technology for capturing software dependencies, and workflows using model APIs and notebooks for model simulations creates a powerful system for more open and reproducible environmental modeling. 2) Considering the needs and purposes of research and educational projects, and applying the appropriate containerization approach for each use case, makes computational research more reliable and efficient. 3) Sharing interoperable and reusable large-extent spatial datasets through open data repositories for model input supports seamless environmental modeling where data and processes can be reused across multiple applications. Finally, the methods developed and insights gained in this dissertation not only advance reliable and efficient computational reproducibility in environmental modeling, but also serve as best practices and guidance for achieving reproducibility in engineering practice and other scientific fields that rely on computational modeling.
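
The short Python sketch below illustrates, with entirely hypothetical names and URLs, the kind of integration described above: fetching shared model inputs from an open repository and invoking a model through a programmatic API so the workflow can be re-executed end to end, for example from a Jupyter notebook. It sketches the pattern under these assumptions, not the dissertation's implementation.

# Minimal sketch of the repository + API workflow pattern; all names are hypothetical.
import requests

DATA_URL = "https://example.org/repository/resource/inputs.zip"   # placeholder repository URL

def fetch_inputs(url: str, dest: str = "inputs.zip") -> str:
    # Download the openly shared model inputs.
    response = requests.get(url, timeout=60)
    response.raise_for_status()
    with open(dest, "wb") as f:
        f.write(response.content)
    return dest

def run_model(input_path: str) -> dict:
    # Placeholder for a simulation model's Python API call; the actual model,
    # its parameters, and its container image are project-specific.
    return {"inputs": input_path, "status": "simulated"}

if __name__ == "__main__":
    inputs = fetch_inputs(DATA_URL)
    results = run_model(inputs)
    print(results)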