In this paper we present a software tool for elicitation and management of process metadata. It follows our previously published design idea of an assistant for researchers that aims at minimizing the additional effort required for producing a sustainable workflow documentation. With the ever-growing number of linguistic resources available, it also becomes increasingly important to provide proper documentation to make them comparable and to allow meaningful evaluations for specific use cases. The often prevailing practice of post hoc documentation of resource generation or research processes bears the risk of information loss. Not only does detailed documentation of a process aid in achieving reproducibility, it also increases usefulness of the documented work for others as a cornerstone of good scientific practice. Time pressure together with the lack of simple documentation methods leads to workflow documentation in practice being an arduous and often neglected task. Our tool ensures a clean documentation for common workflows in natural language processing and digital humanities. Additionally, it can easily be integrated into existing institutional infrastructures.
This article provides recommendations for writing empirical journal articles that enable transparency, reproducibility, clarity, and memorability. Recommendations for transparency include preregistering methods, hypotheses, and analyses; submitting registered reports; distinguishing confirmation from exploration; and showing your warts. Recommendations for reproducibility include documenting methods and results fully and cohesively, by taking advantage of open-science tools, and citing sources responsibly. Recommendations for clarity include writing short paragraphs, composed of short sentences; writing comprehensive abstracts; and seeking feedback from a naive audience. Recommendations for memorability include writing narratively; embracing the hourglass shape of empirical articles; beginning articles with a hook; and synthesizing, rather than Mad Libbing, previous literature.
This article uses the framework of Ioannidis (2005) to organise a discussion of issues related to the ‘reproducibility crisis’. It then goes on to use that framework to evaluate various proposals to fix the problem. Of particular interest is the ‘post‐study probability’, the probability that a reported research finding represents a true relationship. This probability is inherently unknowable. However, a number of insightful results emerge if we are willing to make some conjectures about reasonable parameter values. Among other things, this analysis demonstrates the important role that replication can play in improving the signal value of empirical research.
The reproducibility crisis, that is, the fact that many scientific results are difficult to replicate, pointing to their unreliability or falsehood, is a hot topic in the recent scientific literature, and statistical methodologies, testing procedures and p‐values, in particular, are at the centre of the debate. Assessment of the extent of the problem–the reproducibility rate or the false discovery rate–and the role of contributing factors are still an open problem. Replication experiments, that is, systematic replications of existing results, may offer relevant information on these issues. We propose a statistical model to deal with such information, in particular to estimate the reproducibility rate and the effect of some study characteristics on its reliability. We analyse data from a recent replication experiment in psychology finding a reproducibility rate broadly coherent with other assessments from the same experiment. Our results also confirm the expected role of some contributing factor (unexpectedness of the result and room for bias) while they suggest that the similarity between original study and the replica is not so relevant, thus mitigating some criticism directed to replication experiments.
This presentation to LERU workshop: Nurturing a Culture of Responsible Research in the Era of Open Science considered the issue of the credibility of science being in question in a 'post-truth' world and how reproducibility is adding to the problem. Open Science offers a solution, but it is not easy to implement, particularly by research institutions. The main issues relate to language used in the open space, that solutions look different to different disciplines, that researchers are often feeling "under siege" and that we need to reward good open practice.
Most scientific papers are not reproducible: it is really hard, if not impossible, to understand how results are derived from data, and being able to regenerate them in the future (even by the same researchers). However, traceability and reproducibility of results are indispensable elements of highquality science, and an increasing requirement of many journals and funding sources. Reproducible studies include code able to regenerate results from the original data. This practice not only provides a perfect record of the whole analysis but also reduces the probability of errors and facilitates code reuse, thus accelerating scientific progress. But doing reproducible science also brings many benefits to the individual researcher, including saving time and effort, improved collaborations, and higher quality and impact of final publications. In this article we introduce reproducible science, why it is important, and how we can improve the reproducibility of our work. We introduce principles and tools for data management, analysis, version control, and software management that help us achieve reproducible workflows in the context of ecology.