A blog post from Philip B. Stark, Associate Dean of the Division of Mathematical and Physical Sciences, UC Berkeley Professor of Statistics, and winner of one of BITSS’ Leamer-Rosenthal Prizes for Open Social Science. This post discuss the core elements of reproducibility; its principles and practices.
Today the Federation of American Societies for Experimental Biology (FASEB) issued Enhancing Research Reproducibility, a set of recommendations aimed to promote the reproducibility and transparency of biomedical and biological research.
The topics of scientific rigor and data reproducibility have been increasingly covered in the scientific and mainstream media, and are being addressed by publishers, professional organizations, and funding agencies, including NIH. This webinar – the first in a series titled Training Modules to Enhance Data Reproducibility (TMEDR) – will address topics of scientific rigor as they pertain to pre-clinical neuroscience research.
While experiments may be published even in a top scientific journal, other researchers who attempt to repeat the same experiments under the same conditions often find contradicting results. As a measure of this, a recent study attempted to reproduce psychology publications and successfully replicated only 39 out of 100 studies. It turns out that excluding sex in experimental design may have contributed to reproducibility issues. Furthermore, sex can also have a biological impact on our scientific understanding and influence how well early biological studies translate into advances in human medicine.
Stanford Center for Reproducible Neuroscience: A new preprint has been posted to the ArXiv that has very important implications and should be required reading for all fMRI researchers. Anders Eklund, Tom Nichols, and Hans Knutson applied task fMRI analyses to a large number of resting fMRI datasets, in order to identify the empirical corrected “familywise” Type I error rates observed under the null hypothesis for both voxel-wise and cluster-wise inference. What they found is shocking: While voxel-wise error rates were valid, nearly all cluster-based parametric methods (except for FSL’s FLAME 1) have greatly inflated familywise Type I error rates. This inflation was worst for analyses using lower cluster-forming thresholds (e.g. p=0.01) compared to higher thresholds, but even with higher thresholds there was serious inflation. This should be a sobering wake-up call for fMRI researchers, as it suggests that the methods used in a large number of previous publications suffer from exceedingly high false positive rates (sometimes greater than 50%).