A study of 100 papers from five journals that make use of bioacoustic recordings shows that only a minority (21%) deposit any of the recordings in a repository, supplementary materials section or a personal website. This lack of deposition hinders re-use of the raw data by other researchers, prevents the reproduction of a project's analyses and confirmation of its findings and impedes progress within the broader bioacoustics community. We make some recommendations for researchers interested in depositing their data.
Replicability and reproducibility of computational models has been somewhat understudied by “the replication movement.” In this paper, we draw on methodological studies into the replicability of psychological experiments and on the mechanistic account of explanation to analyze the functions of model replications and model reproductions in computational neuroscience. We contend that model replicability, or independent researchers' ability to obtain the same output using original code and data, and model reproducibility, or independent researchers' ability to recreate a model without original code, serve different functions and fail for different reasons. This means that measures designed to improve model replicability may not enhance (and, in some cases, may actually damage) model reproducibility. We claim that although both are undesirable, low model reproducibility poses more of a threat to long-term scientific progress than low model replicability. In our opinion, low model reproducibility stems mostly from authors' omitting to provide crucial information in scientific papers and we stress that sharing all computer code and data is not a solution. Reports of computational studies should remain selective and include all and only relevant bits of code.
Web document collections such as WT10G, GOV2, and ClueWeb are widely used for text retrieval experiments. Documents in these collections contain a fair amount of non-content-related markup in the form of tags, hyperlinks, and so on. Published articles that use these corpora generally do not provide specific details about how this markup information is handled during indexing. However, this question turns out to be important: Through experiments, we find that including or excluding metadata in the index can produce significantly different results with standard IR models. More importantly, the effect varies across models and collections. For example, metadata filtering is found to be generally beneficial when using BM25, or language modeling with Dirichlet smoothing, but can significantly reduce retrieval effectiveness if language modeling is used with Jelinek-Mercer smoothing. We also observe that, in general, the performance differences become more noticeable as the amount of metadata in the test collections increase. Given this variability, we believe that the details of document preprocessing are significant from the point of view of reproducibility. In a second set of experiments, we also study the effect of preprocessing on query expansion using RM3. In this case, once again, we find that it is generally better to remove markup before using documents for query expansion.
Experimental deception has not been seriously examined in terms of its impact on reproducible science. I demonstrate, using data from the Open Science Collaboration’s Reproducibility Project (2015), that experiments involving deception have a higher probability of not replicating and have smaller effect sizes compared to experiments that do not have deception procedures. This trend is possibly due to missing information about the context and performance of agents in the studies in which the original effects were generated, leading to either compromised internal validity, or an incomplete specification and control of variables in replication studies. Of special interest are the mechanisms by which deceptions are implemented and how these present challenges for the efficient transmission of critical information from experimenter to participant. I rehearse possible frameworks that might form the basis of a future research program on experimental deception and make some recommendations as to how such a program might be initiated.
To determine the reproducibility of psychological meta-analyses, we investigated whether we could reproduce 500 primary study effect sizes drawn from 33 published meta-analyses based on the information given in the meta-analyses, and whether recomputations of primary study effect sizes altered the overall results of the meta-analysis.
We describe a project-based introduction to reproducible and collaborative neuroimaging analysis. Traditional teaching on neuroimaging usually consists of a series of lectures that emphasize the big picture rather than the foundations on which the techniques are based. The lectures are often paired with practical workshops in which students run imaging analyses using the graphical interface of specific neuroimaging software packages. Our experience suggests that this combination leaves the student with a superficial understanding of the underlying ideas, and an informal, inefficient, and inaccurate approach to analysis. To address these problems, we based our course around a substantial open-ended group project. This allowed us to teach: (a) computational tools to ensure computationally reproducible work, such as the Unix command line, structured code, version control, automated testing, and code review and (b) a clear understanding of the statistical techniques used for a basic analysis of a single run in an MR scanner. The emphasis we put on the group project showed the importance of standard computational tools for accuracy, efficiency, and collaboration. The projects were broadly successful in engaging students in working reproducibly on real scientific questions. We propose that a course on this model should be the foundation for future programs in neuroimaging. We believe it will also serve as a model for teaching efficient and reproducible research in other fields of computational science.