Reproducible Research in Biomedical Science – We’re not there yet

A new PLoS Biol paper on reproducible research practices across the biomedical literature examines whether authors provide all data, code and funding information. The results are devastating.

Following a “new spirit of openness”, many biomedical journals now require or encourage authors to provide detailed information on datasets and code, and to disclose funding and conflicts of interest.

But the reality is still bleak. A research team around Shareen A. Iqbal and John P. A. Ioannidis recently examined how authors follow such transparency guidelines in the biomedical sciences. They took a random sample of 441 articles published between 2000 and 2014 and looked at degrees of openness.

Their main finding is: “We identify an ongoing lack of access to full datasets and detailed protocols for both clinical and non-clinical biomedical investigation.”

Among the 441 articles, only one study provided a full protocol. None of the authors made all raw data directly available to other researchers. About half of the articles did not give any information on funding. Some papers provided links to supplementary data, but the URLs did not work.

[Figure: the assessment indicators used by Iqbal et al.]

This research provides additional evidence that it is not enough to encourage authors to work transparently. Other fields show similarly low rates of author compliance with transparency requirements, e.g. psychology (Wicherts et al., 2006; Wicherts, Bakker and Molenaar, 2011), economics (Krawczyk and Reuben, 2012) and medicine (Savage and Vickers, 2009).

So why do authors withhold data?

A while ago, Tenopir et al. (2011) surveyed over 1300 scientists from different disciplines. Nearly one third of the respondents declined to answer whether they publish their data. So the secrecy already begins when you dare to ask someone about their transparency workflow.

Of those who responded, nearly half said that they do not provide their data electronically to others. But why?

  1. many cited insufficient time and a lack of funding (Tenopir et al. 2011)
  2. some authors want to protect their data to use them for future publications (Savage and Vickers 2009; Campbell 2002, 2003)
  3. authors might fear a damaged reputation when a replication of their work fails – which is why they do not share their protocols and data (Lupia and Elman 2014; Carsey 2014)

What can be done?

Good solutions include initiatives such as the new APSA ethics guidelines for more transparency in political science and the Transparency and Openness Promotion (TOP) guidelines for journals. Most importantly, journals should publish a clear replication policy and check before publication that all data, code and funding information are actually available online.

We also need to make clear to authors why working reproducibly benefits them: it builds your reputation and helps to prevent disaster (when you forget where your files are).

The finding of Iqbal et al. (2016) that none of the authors in their sample made all raw data directly available should be another wake-up call for all natural and social sciences.

