The scrutiny of the scientific community has also turned to research involving computer programs, finding that reproducibility depends more strongly on implementation than commonly thought. These problems are especially relevant for property predictions of crystals and molecules, which hinge on precise computer implementations of the governing equation of quantum physics. We devised a procedure to assess the precision of density functional theory (DFT) methods and used it to demonstrate reproducibility among many of the most widely used DFT codes.
Benchmarking has proven crucial for investigating the behavior and performance of a system. However, choosing relevant benchmarks remains a challenge. To help with comparing and choosing among benchmarks, we propose a solution for automatic benchmark profiling. It computes unified benchmark profiles that reflect each benchmark's duration, distribution of time across functions, stability, CPU efficiency, parallelization and memory usage. It identifies the system information needed to compute a profile, collects it from execution traces, and produces profiles through efficient and reproducible trace analysis. The paper presents the design, implementation and evaluation of the approach. The analysis of the kernel trace follows a workflow implemented using the VisTrails tool.
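As a rough illustration of what a unified profile might contain, the sketch below derives several of the listed quantities from repeated runs of one benchmark. It is not the paper's method: the abstract does not define its metrics, so the `Run` record, the `profile` function, and every formula here (coefficient of variation for stability, CPU time over wall time for parallelization, and so on) are assumptions chosen for illustration. The per-function breakdown is omitted because it would require trace data.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Run:
    """One execution of a benchmark (hypothetical record, not from the paper)."""
    wall_s: float       # wall-clock duration in seconds
    cpu_s: float        # CPU time summed over all cores, in seconds
    peak_rss_mb: float  # peak resident memory in megabytes


def profile(runs: list[Run], n_cores: int) -> dict[str, float]:
    """Summarize repeated runs into a small profile (assumed metric definitions)."""
    if not runs:
        raise ValueError("at least one run is required")
    if n_cores < 1:
        raise ValueError("n_cores must be positive")

    walls = [r.wall_s for r in runs]
    total_wall = sum(walls)
    total_cpu = sum(r.cpu_s for r in runs)
    mean_wall = mean(walls)

    return {
        # Duration: mean wall-clock time across runs.
        "duration_s": mean_wall,
        # Stability: coefficient of variation of duration (0 means identical runs).
        "stability_cv": pstdev(walls) / mean_wall,
        # Parallelization: average number of cores kept busy.
        "parallelism": total_cpu / total_wall,
        # CPU efficiency: fraction of the available core time actually used.
        "cpu_efficiency": total_cpu / (total_wall * n_cores),
        # Memory usage: worst-case peak across runs.
        "peak_memory_mb": max(r.peak_rss_mb for r in runs),
    }


example_runs = [Run(10.0, 20.0, 100.0), Run(10.0, 20.0, 120.0)]
example_profile = profile(example_runs, n_cores=4)
```

Expressing every benchmark through the same small set of numbers is what makes profiles comparable; a real implementation would obtain these inputs from execution traces rather than from hand-entered records.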
As science grapples with what some have called a reproducibility crisis, replication studies, which aim to reproduce the results of previous studies, have been held up as a way to make science more reliable. It seems like common sense: Take a study and do it again — if you get the same result, that’s evidence that the findings are true, and if the result doesn’t turn up again, they’re false. Yet in practice, it’s nowhere near this simple.
Once again, reproducibility is in the news. Most recently we hear that irreproducibility is irreproducible and thus everything is actually fine. The most recent round was kicked off by a criticism of the Reproducibility Project, followed by claim and counterclaim on whether one analysis makes more sense than the other. I’m not going to comment on that, but I want to tease apart what the disagreement is about, because it shows that the problem with reproducibility goes much deeper than whether or not a particular experiment replicates.
When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace them and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. With a broad scientific audience in mind, we describe the strengths and limitations of the available approaches, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.
The scientific community is bustling with projects to make published results more reliable. Efforts are under way to establish checklists, to revamp training in experimental design, and even to fund disinterested scientists to replicate others' experiments. A more efficient strategy would be to rework current incentives to put less emphasis on high-impact publications, but those systems are entrenched, and public funders and universities are ill-prepared for that scale of change. To catalyse change, industry must step up to the plate. I have learned this first hand, as head of the Structural Genomics Consortium (SGC), a research charity funded by business, government and other charities. If more companies contributed funds and expertise to efforts such as ours, I believe it would create a system that rewards science that is both cutting-edge and reproducible.