Two weeks ago (1st-4th of August 2016) we hosted a coding sprint at Stanford aimed at making neuroimaging data processing and analysis tools more portable and accessible. You might have heard about BIDS – it is a new standard for organizing and describing neuroimaging datasets that we have recently proposed. Containers (also known as “operating-system-level virtualization”) are very lightweight virtual machines that can encapsulate any piece of code along with all of the libraries necessary to run it. Docker and Singularity are two examples of container technologies. The reason we are so excited about containers for reproducible data analysis is that they provide a way to package a piece of software which can run in the same way across many different computing platforms, from a laptop to a supercomputer. Creating containerized and BIDS-aware versions of all of the major neuroimaging analysis packages is critical to our center’s mission: providing data analysis as a free and open service to incentivize researchers to share data.
Kyle Cranmer, a faculty member in NYU's physics department, distills and describes the events of NYU's first Reproducibility Symposium, held on May 3, 2016.
This Workshop aims to become a forum for discussing ideas and advancements towards the revision of current scientific communication practices in order to support Open Science, introduce novel evaluation schemes, and enable reproducibility. As such, it is intended as an event fostering collaboration between (i) library and information scientists working on the identification of new publication paradigms; (ii) ICT scientists involved in the definition of new technical solutions to these issues; (iii) scientists/researchers who actually conduct the research and demand tools and practices for Open Science. The expected results are advancements in the definition of the next-generation scientific communication ecosystem, where scientists can publish research results (including the scientific article, the data, the methods, and any “alternative” product that may be relevant to the conducted research) in order to enable reproducibility (effective reuse and a decrease in the cost of science) and rely on novel scientific reward practices.
A pre-conference event of the American Library Association's annual conference: "The credibility of scientific findings is under attack. While this crisis has several causes, none is more common or correctable than the inability to replicate experimental and computational research. This preconference will feature scholars, librarians, and technologists who are attacking this problem through tools and techniques to manage data, enable research transparency, and promote reproducible science. Attendees will learn strategies for fostering and supporting transparent research practices at their institutions."
The report introduces software sustainability, provides definitions, clearly demonstrates that software is not the same as data and illustrates aspects of sustainability in the software lifecycle. The recommendations state that improving software sustainability requires a number of changes: some technical and others societal, some small and others significant. We must start by raising awareness of researchers' reliance on software. This goal will become easier if we recognise the valuable contribution that software makes to research and reward those people who invest their time into developing reliable and reproducible software.
The workshop summarized in this report was designed not to address the social and experimental challenges but instead to focus on the latter issues of improper data management and analysis, inadequate statistical expertise, incomplete data, and difficulties applying sound statistical inference to the available data.