Scientific reproducibility is essential for the advancement of science. It allows the results of previous studies to be reproduced, their conclusions to be validated, and new contributions to be developed on the basis of previous research. Nowadays, more and more authors consider that the ultimate product of academic research is the scientific manuscript together with all the elements (i.e., code and data) that others need to reproduce the results. However, numerous difficulties (e.g., biased results, the pressure to publish, and proprietary data) prevent some studies from being reproduced easily. In this context, we describe our experience in attempting to improve the reproducibility of a GIScience project. According to our project needs, we evaluated a list of practices, standards and tools that may facilitate open and reproducible research in the geospatial domain, contextualising them on Peng’s reproducibility spectrum. Among these resources, we focused on containerisation technologies and performed a shallow review to reflect on the level of adoption of these technologies in combination with OSGeo software. Finally, containerisation technologies proved to enhance reproducibility, and we used UML diagrams to describe representative workflows deployed in our GIScience project.
Reproducibility Notes - a new series of articles that will highlight topics related to the production of robust, effective and reproducible science.
Funding agencies increasingly ask applicants to include data and software management plans in proposals. In addition, the author guidelines of scientific journals and conferences more often include a statement on data availability, and some reviewers reject unreproducible submissions. This trend towards open science increases the pressure on authors to provide access to the source code and data underlying the computational results in their scientific papers. Still, publishing reproducible articles is a demanding task that is not achieved simply by providing access to code scripts and data files. Consequently, several projects are developing solutions to support the publication of executable analyses alongside articles, considering the needs of the aforementioned stakeholders. The key contribution of this paper is a review of applications addressing the issue of publishing executable computational research results. We compare the approaches across properties relevant for the involved stakeholders, e.g., provided features and deployment options, and also critically discuss trends and limitations. The review can help publishers decide which system to integrate into their submission process, editors recommend tools for researchers, and authors of scientific papers adhere to reproducibility principles.
Despite the importance of reviews and syntheses in advancing our understanding of the natural world and informing conservation policy, they are frequently not conducted with the same careful methods as primary studies. This discrepancy can lead to controversy over review conclusions because the methods employed to gather evidence supporting the conclusions are not reproducible. To illustrate this problem, we assessed whether the methods of reviews involved in two recent controversies met the common scientific standard of being reported in sufficient detail to be repeated by an independent researcher. We found that none of the reviews were repeatable by this standard. Later stages of the review process, such as quantitative analyses, were generally described well, but the more fundamental, data-gathering stage was not fully described in any of the reviews. To address the irreproducibility of review conclusions, we believe that ecologists and conservation biologists should recognize that literature searches for reviews are a data-gathering exercise and apply the same rigorous study design principles and reporting standards that they would use for primary studies.
Current norms for teaching and mentoring in higher education are rooted in obsolete practices of bygone eras. Improving the transparency and rigor of science is the responsibility of all who engage in it. Ongoing attempts to improve research credibility have, however, neglected an essential aspect of the academic cycle: the training of researchers and consumers of research. Principled teaching and mentoring involve imparting to students an understanding of research findings in light of epistemic uncertainty and, moreover, an appreciation of best practices in the production of knowledge. We introduce a Framework for Open and Reproducible Research Training (FORRT). Its main goal is to provide educators with a pathway towards the incremental adoption of principled teaching and mentoring practices, including open and reproducible research. FORRT will act as an initiative to support instructors, collating existing teaching pedagogies and materials to be reused and adapted for use within new and existing courses. Moreover, FORRT can be used as a tool to benchmark the current level of training students receive across six clusters of open and reproducible research practices: 'reproducibility and replicability knowledge', 'conceptual and statistical knowledge', 'reproducible analyses', 'preregistration', 'open data and materials', and 'replication research'. FORRT will strive to be an advocate for the establishment of principled teaching and mentorship as a fourth pillar of a true scientific utopia. [working document here: https://tinyurl.com/FORRTworkingDOC]
The adoption of reproducibility remains low, despite incentives becoming increasingly common in different domains, conferences, and journals. The truth is that reproducibility is technically difficult to achieve due to the complexities of computational environments. To address these technical challenges, we created ReproZip, an open-source tool that automatically packs research along with all the necessary information to reproduce it, including data files, software, OS version, and environment variables. Everything is then bundled into an rpz file, which users can use to reproduce the work with ReproZip and a suitable unpacker (e.g., using Vagrant or Docker). The rpz file is general and contains rich metadata: more unpackers can be added as needed, better guaranteeing long-term preservation. However, installing the unpackers can still be burdensome for secondary users of ReproZip bundles. In this paper, we discuss how ReproZip and our new tool, ReproServer, can be used together to facilitate access to well-preserved, reproducible work. ReproServer is a web application that allows users to upload or provide a link to a ReproZip bundle, and then interact with or reproduce the contents from the comfort of their browser. Users are then provided with a persistent link to the unpacked work on ReproServer, which they can share with reviewers or colleagues.
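The packing idea described above (data files bundled together with a record of the environment) can be illustrated with a small Python sketch. This is a toy under stated assumptions, not ReproZip's implementation or the rpz format: the function names `pack_bundle` and `unpack_bundle`, the `metadata.json` layout, and the choice of recorded environment variables are all hypothetical, and the sketch does not trace system calls or capture installed software as ReproZip does.

```python
import json
import os
import platform
import zipfile


def pack_bundle(data_files, bundle_path, env_vars=("PATH", "LANG")):
    """Write a toy bundle: the data files plus a JSON record of the environment.

    Hypothetical helper for illustration only; this is NOT the rpz format.
    Files are stored by base name, so two inputs with the same name collide.
    """
    metadata = {
        "os": platform.platform(),             # OS name and version string
        "python": platform.python_version(),   # interpreter used for packing
        # Record only the selected variables, to avoid leaking secrets.
        "environment": {k: os.environ.get(k) for k in env_vars},
        "files": [os.path.basename(p) for p in data_files],
    }
    with zipfile.ZipFile(bundle_path, "w") as bundle:
        bundle.writestr("metadata.json", json.dumps(metadata, indent=2))
        for path in data_files:
            bundle.write(path, arcname="data/" + os.path.basename(path))
    return metadata


def unpack_bundle(bundle_path, target_dir):
    """Extract the toy bundle into target_dir and return its metadata."""
    with zipfile.ZipFile(bundle_path) as bundle:
        bundle.extractall(target_dir)
        return json.loads(bundle.read("metadata.json"))
```

A secondary user would call `unpack_bundle("study.zip", "restored")` and compare the recorded `os` and `environment` values with their own machine before rerunning the analysis. The real tool goes further by delegating that step to an unpacker such as Docker or Vagrant, which recreates the environment instead of only describing it.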