There are two well-known difficulties to test and interpret methodologies for mining developer interaction traces: first, the lack of enough large datasets needed by mining or machine learning approaches to provide reliable results; and second, the lack of "ground truth" or empirical evidence that can be used to triangulate the results, or to verify their accuracy and correctness. Moreover, relying solely on interaction traces limits our ability to take into account contextual factors that can affect the applicability of mining techniques in other contexts, as well hinders our ability to fully understand the mechanics behind observed phenomena. The data presented in this paper attempts to alleviate these challenges by providing 600+ hours of developer interaction traces, from which 26+ hours are backed with video recordings of the IDE screen and developer’s comments. This data set is relevant to researchers interested in investigating program comprehension, and those who are developing techniques for interaction traces analysis and mining.
The safest solution would be to store copies of every object, ever created during the data analysis. All forks, wrong paths, everything. Along with detailed information which functions with what parameters were used to generate each result. Something like the ultimate TimeMachine or GitHub for R objects. With such detailed information, every analysis would be auditable and replicable. Right now the full tracking of all created objects is not possible without deep changes in the R interpreter. The archivist is the light-weight version of such solution.
Theoretical work on reproducibility of scientific claims has hitherto focused on hypothesis testing as the desired mode of statistical inference. Focusing on hypothesis testing, however, poses a challenge to identify salient properties of the scientific process related to reproducibility, especially for fields that progress by building, comparing, selecting, and re-building models. We build a model-centric meta-scientific framework in which scientific discovery progresses by confirming models proposed in idealized experiments. In a temporal stochastic process of scientific discovery, we define scientists with diverse research strategies who search the true model generating the data. When there is no replication in the system, the structure of scientific discovery is a particularly simple Markov chain. We analyze the effect of diversity of research strategies in the scientific community and the complexity of the true model on the time spent at each model, the mean first time to hit the true model and staying with the true model, and the rate of reproducibility given a true model. Inclusion of replication in the system breaks the Markov property and fundamentally alters the structure of scientific discovery. In this case, we analyze aforementioned properties of scientific discovery by an agent-based model. In our system, the seeming paradox of scientific progress despite irreproducibility persists even in the absence of questionable research practices and incentive structures, as the rate of reproducibility and scientific discovery of the truth are uncorrelated. We explain this seeming paradox by a combination of research strategies in the population and the state of truth. Further, we find that innovation speeds up the discovery of truth by making otherwise inaccessible, possibly true models visible to the scientific population. We also show that epistemic diversity in the scientific population optimizes across a range of desirable properties of scientific discovery.
Access to research data is a critical feature of an efficient, progressive, and ultimately self-correcting scientific ecosystem. But the extent to which in-principle benefits of data sharing are realized in practice is unclear. Crucially, it is largely unknown whether published findings can be reproduced by repeating reported analyses upon shared data ("analytic reproducibility"). To investigate, we conducted an observational evaluation of a mandatory open data policy introduced at the journal Cognition. Interrupted time-series analyses indicated a substantial post-policy increase in data available statements (104/417, 25% pre-policy to 136/174, 78% post-policy), and data that were in-principle reusable (23/104, 22% pre-policy to 85/136, 62%, post-policy). However, for 35 articles with in-principle reusable data, the analytic reproducibility of target outcomes related to key findings was poor: 11 (31%) cases were reproducible without author assistance, 11 (31%) cases were reproducible only with author assistance, and 13 (37%) cases were not fully reproducible despite author assistance. Importantly, original conclusions did not appear to be seriously impacted. Mandatory open data policies can increase the frequency and quality of data sharing. However, suboptimal data curation, unclear analysis specification, and reporting errors can impede analytic reproducibility, undermining the utility of data sharing and the credibility of scientific findings.
We surveyed 807 researchers (494 ecologists and 313 evolutionary biologists) about their use of Questionable Research Practices (QRPs), including cherry picking statistically significant results, p hacking, and hypothesising after the results are known (HARKing). We also asked them to estimate the proportion of their colleagues that use each of these QRPs. Several of the QRPs were prevalent within the ecology and evolution research community. Across the two groups, we found 64% of surveyed researchers reported they had at least once failed to report results because they were not statistically significant (cherry picking); 42% had collected more data after inspecting whether results were statistically significant (a form of p hacking) and 51% had reported an unexpected finding as though it had been hypothesised from the start (HARKing). Such practices have been directly implicated in the low rates of reproducible results uncovered by recent large scale replication studies in psychology and other disciplines. The rates of QRPs found in this study are comparable with the rates seen in psychology, indicating that the reproducibility problems discovered in psychology are also likely to be present in ecology and evolution.
Graduate and undergraduate students involved in research projects that generate or analyze extensive datasets use several software applications for data input and processing subject to guidelines for ensuring data quality and availability. Data management guidelines are based on existing practices of the associated academic or funding institutions and may be automated to minimize human error and maintenance overhead. This paper presents a framework for automating data management processes, and it details the flow of data from generation/acquisition through processing to the output of final reports. It is designed to adapt to changing requirements and limit overhead costs. The paper also presents a representative case study applying the framework to the finite element characterization of the magnetically coupled linear variable reluctance motor. It utilizes modern widely available scripting tools particularly Windows PowerShell® to automate workflows. This task requires generating motor characteristics for several thousands of operating conditions using finite element analysis.