Posts about reproducible paper (old posts, page 21)

Opportunities for increased reproducibility and replicability of developmental cognitive neuroscience

Recently, many workflows and tools that aim to increase the reproducibility and replicability of research findings have been suggested. In this review, we discuss the opportunities that these efforts offer for the field of developmental cognitive neuroscience. We focus on issues broadly related to statistical power and to flexibility and transparency in data analyses. Critical considerations relating to statistical power include challenges in recruitment and testing of young populations, how to increase the value of studies with small samples, and the opportunities and challenges related to working with large-scale datasets. Developmental studies also involve challenges such as choices about age groupings, modelling across the lifespan, the analysis of longitudinal changes, and neuroimaging data that can be processed and analyzed in a multitude of ways. Flexibility in data acquisition, analysis and description may thereby greatly impact results. We discuss methods for improving transparency in developmental cognitive neuroscience, and how preregistration of studies can improve methodological rigor in the field. While outlining challenges and issues that may arise before, during, and after data collection, we highlight solutions and resources that help to overcome some of them. Since the number of useful tools and techniques is ever-growing, we highlight the fact that many practices can be implemented stepwise.

Analysis validation has been neglected in the Age of Reproducibility

Increasingly complex statistical models are being used for the analysis of biological data. Recent commentary has focused on the ability to compute the same outcome for a given dataset (reproducibility). We argue that a reproducible statistical analysis is not necessarily valid because of unique patterns of nonindependence in every biological dataset. We advocate that analyses should be evaluated with known-truth simulations that capture biological reality, a process we call “analysis validation.” We review the process of validation and suggest criteria that a validation project should meet. We find that different fields of science have historically failed to meet all criteria, and we suggest ways to implement meaningful validation in training and practice.
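
To make the idea of a known-truth simulation concrete, here is a minimal sketch in Python. It is not the authors' procedure. It assumes a hypothetical two-arm design with 5 clusters of 20 observations per arm, a true treatment effect of exactly zero, and a shared random shift within each cluster as the source of nonindependence. All function names, parameter values, and the hard-coded critical values are illustrative assumptions. The sketch compares how often two analyses wrongly report an effect: a naive t-test that treats every observation as independent, and a t-test on cluster means.

```python
import math
import random
import statistics

# Two-sided 5% critical values of Student's t, hard-coded to keep the sketch
# stdlib-only. They match the default design below (assumed, not from the paper).
T_CRIT_DF8 = 2.306    # df = 8: 5 cluster means per arm, 2 arms
T_CRIT_DF198 = 1.972  # df = 198: 100 observations per arm, 2 arms


def pooled_t(a, b):
    """Pooled two-sample t statistic for two samples of equal size."""
    n = len(a)
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(2 * pooled_var / n)


def simulate_arm(rng, n_clusters=5, n_per_cluster=20, cluster_sd=1.0, noise_sd=1.0):
    """Return one arm as a list of clusters; the true treatment effect is zero."""
    arm = []
    for _ in range(n_clusters):
        shift = rng.gauss(0.0, cluster_sd)  # shared by every observation in the cluster
        arm.append([shift + rng.gauss(0.0, noise_sd) for _ in range(n_per_cluster)])
    return arm


def false_positive_rates(n_sims=2000, seed=0):
    """Estimate how often each analysis rejects a null hypothesis that is true."""
    rng = random.Random(seed)
    naive_hits = 0
    cluster_hits = 0
    for _ in range(n_sims):
        arm_a = simulate_arm(rng)
        arm_b = simulate_arm(rng)

        # Naive analysis: pool all observations and ignore the clustering.
        flat_a = [x for cluster in arm_a for x in cluster]
        flat_b = [x for cluster in arm_b for x in cluster]
        if abs(pooled_t(flat_a, flat_b)) > T_CRIT_DF198:
            naive_hits += 1

        # Cluster-aware analysis: one mean per cluster is the unit of analysis.
        means_a = [statistics.fmean(cluster) for cluster in arm_a]
        means_b = [statistics.fmean(cluster) for cluster in arm_b]
        if abs(pooled_t(means_a, means_b)) > T_CRIT_DF8:
            cluster_hits += 1
    return naive_hits / n_sims, cluster_hits / n_sims


if __name__ == "__main__":
    naive_rate, cluster_rate = false_positive_rates()
    print(f"naive false-positive rate:        {naive_rate:.3f}")
    print(f"cluster-mean false-positive rate: {cluster_rate:.3f}")
```

Because the truth is known (no effect), any rejection is a false positive. Under these assumptions the naive analysis gives the same answer every time it is rerun on the same data, so it is reproducible, yet it rejects far more often than the nominal 5%. The cluster-mean analysis stays close to 5%. A real validation project would replace the toy generator with a simulation that reflects the structure of the biological data in question.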

Semantic workflows for benchmark challenges: Enhancing comparability, reusability and reproducibility

WINGS enables researchers to submit complete semantic workflows as challenge submissions. By submitting entries as workflows, it then becomes possible to compare not just the results and performance of a challenger, but also the methodology employed. This is particularly important when dozens of challenge entries may use nearly identical tools, but with only subtle changes in parameters (and radical differences in results). WINGS uses a component-driven workflow design and offers intelligent parameter and data selection by reasoning about data characteristics.

Improving Quality, Reproducibility, and Usability of FRET‐Based Tension Sensors

Mechanobiology, the study of how mechanical forces affect cellular behavior, is an emerging field of study that has garnered broad and significant interest. Researchers are currently seeking to better understand how mechanical signals are transmitted, detected, and integrated at a subcellular level. One tool for addressing these questions is a Förster resonance energy transfer (FRET)‐based tension sensor, which enables the measurement of molecular‐scale forces across proteins based on changes in emitted light. However, the reliability and reproducibility of measurements made with these sensors have not been thoroughly examined. To address these concerns, we developed numerical methods that improve the accuracy of measurements made using sensitized emission‐based imaging. To establish that FRET‐based tension sensors are versatile tools that provide consistent measurements, we used these methods and demonstrated that a vinculin tension sensor is unperturbed by cell fixation, permeabilization, and immunolabeling. This suggests FRET‐based tension sensors could be coupled with a variety of immuno‐fluorescent labeling techniques. Additionally, as tension sensors are frequently employed in complex biological samples where large experimental repeats may be challenging, we examined how sample size affects the uncertainty of FRET measurements. In total, this work establishes guidelines to improve FRET‐based tension sensor measurements, validate novel implementations of these sensors, and ensure that results are precise and reproducible.

Data Pallets: Containerizing Storage For Reproducibility and Traceability

Trusting simulation output is crucial for Sandia's mission objectives. We rely on these simulations to perform our high-consequence mission tasks given national treaty obligations. Other science and modeling applications, while they may have high-consequence results, still require the strongest levels of trust to enable using the result as the foundation for both practical applications and future research. To this end, the computing community has developed workflow and provenance systems to aid in both automating simulation and modeling execution as well as determining exactly how some output was created so that conclusions can be drawn from the data. Current approaches for workflows and provenance systems are all at the user level and have little to no system-level support, making them fragile, difficult to use, and incomplete solutions. The introduction of container technology is a first step towards encapsulating and tracking artifacts used in creating data and resulting insights, but its current implementation is focused solely on making it easy to deploy an application in an isolated "sandbox" and maintaining a strictly read-only mode to avoid any potential changes to the application. All storage activities still use the system-level shared storage. This project explores extending the container concept to include storage as a new container type we call data pallets. Data Pallets are potentially writeable, auto-generated by the system based on IO activities, and usable as a way to link the contained data back to the application and input deck used to create it.

A Model-Centric Analysis of Openness, Replication, and Reproducibility

The literature on the reproducibility crisis presents several putative causes for the proliferation of irreproducible results, including HARKing, p-hacking and publication bias. Without a theory of reproducibility, however, it is difficult to determine whether these putative causes can explain most irreproducible results. Drawing from an historically informed conception of science that is open and collaborative, we identify the components of an idealized experiment and analyze these components as a precursor to developing such a theory. Openness, we suggest, has long been intuitively proposed as a solution to irreproducibility. However, this intuition has not been validated in a theoretical framework. Our concern is that the under-theorizing of these concepts can lead to flawed inferences about the (in)validity of experimental results or the integrity of individual scientists. We use probabilistic arguments and examine how openness of experimental components relates to reproducibility of results. We show that there are some impediments to obtaining reproducible results that precede many of the causes often cited in the literature on the reproducibility crisis. For example, even if erroneous practices such as HARKing, p-hacking, and publication bias were absent at the individual and system level, reproducibility may still not be guaranteed.