Automated Documentation of End-to-End Experiments in Data Science

Reproducibility plays a crucial role in experimentation. However, the modern research ecosystem and its underlying frameworks are constantly evolving, making it extremely difficult to reliably reproduce scientific artifacts such as data, algorithms, trained models, and visualizations. We therefore aim to design a novel system for assisting data scientists with rigorous end-to-end documentation of data-oriented experiments. Capturing data lineage, metadata, and other artifacts helps in reproducing and sharing experimental results. We summarize this challenge as automated documentation of data science experiments. We aim to reduce the manual overhead for experimenting researchers by creating a novel approach to dataflow and metadata tracking based on analysis of the experiment source code. The envisioned system will accelerate the research process in general and enable the capture of fine-grained meta-information by deriving a declarative representation of data science experiments.
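To make the idea concrete, here is a minimal sketch, assuming a decorator-based design, of how dataflow and metadata tracking might derive a declarative record of an experiment step. The names `track`, `LINEAGE`, and `drop_missing` are hypothetical illustrations invented for this sketch, not part of the envisioned system.

```python
import functools
import hashlib
import json
import time

LINEAGE = []  # in-memory log of captured experiment steps

def _fingerprint(obj):
    """Hash a (JSON-serializable) artifact so outputs can be linked to inputs."""
    payload = json.dumps(obj, sort_keys=True, default=str).encode()
    return hashlib.sha256(payload).hexdigest()[:12]

def track(step):
    """Record inputs, output, and timing of one experiment step."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.time()
            result = fn(*args, **kwargs)
            LINEAGE.append({
                "step": step,
                "function": fn.__name__,
                "inputs": [_fingerprint(a) for a in args],
                "output": _fingerprint(result),
                "seconds": round(time.time() - start, 3),
            })
            return result
        return wrapper
    return decorator

@track("clean")
def drop_missing(rows):
    return [r for r in rows if all(v is not None for v in r.values())]

rows = [{"x": 1, "y": 2}, {"x": None, "y": 3}]
clean = drop_missing(rows)
print(json.dumps(LINEAGE, indent=2))  # a declarative record of the run
```

In a real system the lineage log would be persisted and enriched with source-code analysis rather than kept in memory, but the sketch shows the kind of fine-grained metadata that can be captured with little manual overhead.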

All models are wrong, some are useful, but are they reproducible? Commentary on Lee et al. (2019)

Lee et al. (2019) make several practical recommendations for replicable, useful cognitive modeling. They also point out that the ultimate test of the usefulness of a cognitive model is its ability to solve practical problems. In this commentary, we argue that for cognitive modeling to reach applied domains, there is a pressing need to improve the standards of transparency and reproducibility in cognitive modeling research. Solution-oriented modeling requires engaging practitioners who understand the relevant domain. We discuss mechanisms by which reproducible research can foster engagement with applied practitioners. Notably, reproducible materials provide a starting point for practitioners to experiment with cognitive models and determine whether those models might be suitable for their domain of expertise. This is essential because solving complex problems requires exploring a range of modeling approaches, and there may not be time to implement each possible approach from the ground up. We also note the broader benefits of reproducibility within the field.

Modeling Provenance and Understanding Reproducibility for OpenRefine Data Cleaning Workflows

Preparation of data sets for analysis is a critical component of research in many disciplines. Recording the steps taken to clean data sets is equally crucial if such research is to be transparent and its results reproducible. OpenRefine is a tool for interactively cleaning data sets via a spreadsheet-like interface and for recording the sequence of operations carried out by the user. OpenRefine uses its operation history to provide an undo/redo capability that enables a user to revisit the state of the data set at any point in the data cleaning process. OpenRefine additionally allows the user to export sequences of recorded operations as recipes that can be applied later to different data sets. Although OpenRefine internally records details about every change made to a data set following data import, exported recipes include neither the initial data import step nor details related to parsing the original data files. Moreover, exported recipes do not include any edits made manually to individual cells. Consequently, neither a single recipe nor a set of recipes exported by OpenRefine can, in general, represent an entire, end-to-end data preparation workflow. Here we report early results from an investigation into how the operation history recorded by OpenRefine can be used to (1) facilitate reproduction of complete, real-world data cleaning workflows; and (2) support queries and visualizations of the provenance of cleaned data sets for easy review.
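For illustration, here is a short sketch of how an exported recipe might be inspected programmatically. OpenRefine exports its operation history as a JSON array of operation objects; the `op` and `description` fields shown below follow that export format, while the summary function itself is a hypothetical illustration, not part of OpenRefine.

```python
import json

def summarize_recipe(path):
    """Print a numbered summary of the operations in an exported recipe."""
    with open(path) as f:
        ops = json.load(f)  # a recipe is a JSON array of operation objects
    for i, op in enumerate(ops, 1):
        print(f"{i:3d}. {op.get('op', '?')}: {op.get('description', '')}")
    # Note: the import step and manual single-cell edits never appear in an
    # exported recipe, so the recipe alone cannot reproduce the full
    # end-to-end workflow described above.
    return ops

# summarize_recipe("history.json")
```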

The importance of standards for sharing of computational models and data

The Target Article by Lee et al. (2019) highlights the ways in which ongoing concerns about research reproducibility extend to model-based approaches in cognitive science. Whereas Lee et al. focus primarily on the importance of research practices for improving model robustness, we propose that the transparent sharing of model specifications, including their inputs and outputs, is also essential to improving the reproducibility of model-based analyses. We outline an ongoing effort (within the context of the Brain Imaging Data Structure community) to develop standards for sharing the structure of computational models and their outputs.
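As a purely hypothetical illustration of the kind of sharing being proposed, the sketch below serializes a minimal model specification, together with its inputs and outputs, to JSON. The field names are invented for this example and do not reproduce the actual BIDS schema.

```python
import json

# A hypothetical, minimal model specification that travels with its results.
# Field names are illustrative only; the point is that a machine-readable
# spec of the model, its inputs, and its outputs accompanies the analysis.
model_spec = {
    "name": "drift_diffusion_v1",
    "parameters": {"drift": 0.8, "boundary": 1.2, "nondecision": 0.3},
    "inputs": {"dataset": "sub-01_task-rt_beh.tsv"},
    "outputs": {"posterior_samples": "dd_v1_samples.csv"},
    "software": {"package": "hypothetical-fitter", "version": "0.4.1"},
}

with open("model_spec.json", "w") as f:
    json.dump(model_spec, f, indent=2)
```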

A response to O. Arandjelovic's critique of "The reproducibility of research and the misinterpretation of p-values"

The main criticism of my piece in ref (2) seems to be that my calculations rely on testing a point null hypothesis, i.e. the hypothesis that the true effect size is zero. He objects to my contention that the true effect size can be zero, "just give the same pill to both groups", on the grounds that two pills can't be exactly identical. He then says "I understand that this criticism may come across as frivolous semantic pedantry of no practical consequence: of course that the author meant to say 'pills with the same contents' as everybody would have understood". Yes, that is precisely how it comes across to me. I shall try to explain in more detail why I think that this criticism has little substance.
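To see why the point null is a perfectly usable idealization, consider a minimal simulation, with arbitrary sample sizes chosen for illustration, in which both groups receive the "same pill" in the sense of identical distributions: the true effect size is exactly zero by construction, and p < 0.05 occurs at the nominal 5% rate.

```python
import numpy as np
from scipy import stats

# Simulate the "same pill to both groups" point null: both samples are drawn
# from the identical distribution, so the true effect size is exactly zero.
rng = np.random.default_rng(0)
n_sim, n = 10_000, 16  # arbitrary illustrative choices
false_pos = 0
for _ in range(n_sim):
    a = rng.normal(0.0, 1.0, n)  # "pill" group
    b = rng.normal(0.0, 1.0, n)  # same "pill": H0 is true by construction
    _, p = stats.ttest_ind(a, b)
    false_pos += p < 0.05
print(f"fraction with p < 0.05: {false_pos / n_sim:.3f}")  # approximately 0.05
```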