There is growing interest in enhancing research transparency and reproducibility in economics and other scientific fields. We survey existing work on these topics within economics, and discuss the evidence suggesting that publication bias, inability to replicate, and specification searching remain widespread in the discipline. We next discuss recent progress in this area, including through improved research design, study registration and pre-analysis plans, disclosure standards, and open sharing of data and materials, drawing on experiences in both economics and other social sciences. We discuss areas where consensus is emerging on new practices, as well as approaches that remain controversial, and speculate about the most effective ways to make economics research more credible in the future.
Adolescence is a period of human brain growth and high incidence of mental health disorders. In 2016, the Neuroscience in Psychiatry Network published a study of adolescent brain development showing that the hubs of the structural connectome are late-developing and are found in association cortex (https://doi.org/10.1073/pnas.1601745113). Furthermore, these regions are enriched for genes related to schizophrenia. In this presentation, Dr Kirstie Whitaker will demonstrate how this work is supported by open data and analysis code, and that the results replicate in two independent cohorts of teenagers. She will encourage Brainhack-Global participants to take steps towards ensuring that their work meets these standards for open and reproducible science in 2017 and beyond.
Jupyter notebooks provide a useful environment for interactive exploration of data. A common question I get, though, is how you can progress from this nonlinear, interactive, trial-and-error style of exploration to a more linear and reproducible analysis based on organized, packaged, and tested code. This series of videos presents a case study in how I personally approach reproducible data analysis within the Jupyter notebook.
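The move from trial-and-error notebook cells to organized, tested code usually amounts to extracting a throwaway expression into a named function with a docstring and an assertion. A minimal sketch of that refactoring step (the function and data here are hypothetical, not taken from the video series):

```python
# Hypothetical example: a one-off notebook expression refactored into a
# named, documented function that can move into a package, be imported
# back into the notebook, and be covered by a test suite.

def count_words(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts

# The quick sanity check that lived in a notebook cell can become a test.
assert count_words("To be or not to be") == {"to": 2, "be": 2, "or": 1, "not": 1}
```

Once extracted like this, the function can be placed in a module, imported into the notebook, and exercised by an automated test runner, which is what makes the analysis linear and reproducible rather than dependent on cell-execution order.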
This work is a detailed companion reproducibility paper for the methods and experiments proposed in three previous works by Lastra-Díaz and García-Serrano. It presents a set of reproducible word-similarity experiments, based on HESML and ReproZip, with the aim of exactly reproducing the experimental surveys in those works. The work also introduces a new representation model for taxonomies called PosetHERep, and a Java software library based on it, the Half-Edge Semantic Measures Library (HESML), which implements most ontology-based semantic similarity measures and Information Content (IC) models based on WordNet reported in the literature.
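The ontology-based measures HESML implements typically score word pairs by the length of the taxonomy path connecting their concepts. A minimal sketch of one such measure, a Rada-style path-length similarity over a toy taxonomy (the taxonomy, node names, and function are hypothetical illustrations, not HESML's actual Java API or the PosetHERep representation):

```python
# A toy is-a taxonomy, stored as child -> parent (hypothetical, not WordNet).
TOY_TAXONOMY = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "animal": None,  # root
}

def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = TOY_TAXONOMY[node]
    return chain

def path_similarity(a, b):
    """Rada-style measure: 1 / (1 + shortest path through a common ancestor)."""
    chain_a = ancestors(a)
    depth_in_b = {n: i for i, n in enumerate(ancestors(b))}
    shortest = min(i + depth_in_b[n] for i, n in enumerate(chain_a) if n in depth_in_b)
    return 1.0 / (1.0 + shortest)

# "dog" and "wolf" meet at "canine" (path length 2), so similarity is 1/3;
# "dog" and "cat" meet higher up at "mammal" (path length 4), giving 1/5.
```

IC-based measures refine this idea by weighting concepts with corpus-derived information content rather than raw edge counts; the PosetHERep model exists precisely to make such graph traversals efficient over large taxonomies like WordNet.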
Unlike most other SAPA datasets available on Dataverse, these data are specifically tied to the reproducible manuscript entitled "The SAPA Personality Inventory: An empirically-derived, hierarchically-organized self-report personality assessment model." Most of these files are images that should be downloaded and organized in the same location as the source .Rnw file. A few files contain data that have already been processed (and could be independently re-created using code in the .Rnw file); these are included to shorten the processing time needed to reproduce the original document. The raw data files for most of the analyses are stored in three separate locations, one for each of the three samples: the exploratory sample (doi:10.7910/DVN/SD7SVE), the replication sample (doi:10.7910/DVN/3LFNJZ), and the confirmatory sample (doi:10.7910/DVN/I8I3D3). If you have any questions about reproducing the file, please first consult the instructions in the Preface of the PDF version. Note that the .Rnw version of the file includes many annotations that are not visible in the PDF version (https://sapa-project.org/research/SPI/SPIdevelopment.pdf) and which may also be useful. If you still have questions, feel free to email me directly. Note that it is unlikely that I will be able to help with technical issues that do not relate to R, Knitr, Sweave, and LaTeX.
Many journal editors are failing to enforce their own instructions to authors, resulting in the publication of many articles that do not meet basic standards of transparency, employ unsuitable data analysis methods, and report overly optimistic conclusions. This problem is particularly acute where quantitative measurements are made; it results in the publication of papers that lack scientific rigor and contributes to concerns about the reproducibility of biomedical research. This hampers research areas such as biomarker identification, where reproducing all but the most striking changes is challenging and translation to patient care is rare.