Quality standards in the sciences have recently been heavily criticised in the academic community and the mass media. Scandals involving fraud, errors or misconduct have stirred a debate on reproducibility that calls for fundamental changes in the way research is done. As a new teaching course at Cambridge shows, the best way to bring about change is to start in the classroom. An earlier version of this article was published on the University of Cambridge website.
Reproducibility is regarded as the gold standard for scientific research. The legitimacy of any published work depends on the question: can we replicate the analysis and arrive at the same results? However, scientific practice is far from this ideal. A recent study of reproducibility in political science found that only 18 of 120 journals have a replication policy that requires authors to upload their datasets so that others can check the results. In economics, an analysis of nearly 500 scholars' webpages showed that the vast majority do not provide access to their data and software code on their sites. (…)
The debate has prompted new initiatives. For example, the American Political Science Association has revised its guidelines for ethical political science research and recommends that all journals require more transparency from their authors. However, a real change towards more transparency in research must start much earlier – at the student level.
This is the main goal of the Cambridge Replication Workshop at the Social Sciences Research Methods Centre. In eight weeks, students learn about reproducibility standards and then re-analyse a published paper in their field.
Making doctoral work reproducible
The first part of the course introduces students to reproducibility challenges. They discuss what reproducibility means and learn about recent cases of failed research transparency and their consequences for the scientific community. They then discuss how to make their own doctoral work reproducible. For example, they learn which software is best suited to reproducible research, and how to set up a clear structure of files and folders, with logs of analysis steps and data transformations, so that years later they can still trace how they made their research decisions. Students also discuss why it is in their own interest to publish their materials in a data repository such as the University of Cambridge’s repository DSpace @ Cambridge.
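To make the idea of a clear folder structure with logs concrete, here is a minimal Python sketch. It is an illustration only, not the layout taught in the workshop: the folder names, the log file name and the two helper functions (create_project and log_decision) are assumptions made for this example.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical folder layout; the article does not prescribe specific names.
SUBFOLDERS = ["data/raw", "data/processed", "code", "logs", "output"]


def create_project(root):
    """Create the folder skeleton and return the path of the decision log."""
    root = Path(root)
    for sub in SUBFOLDERS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root / "logs" / "decisions.log"


def log_decision(log_path, message):
    """Append one timestamped line describing an analysis or data decision."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(f"{stamp}\t{message}\n")
```

A student might call create_project("thesis_chapter_2") once at the start of a project, then call log_decision with a short note each time a variable is recoded or an observation is dropped. Keeping raw data in its own folder and writing every transformation to a log is one simple way to be able to reconstruct a decision years later.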
In the practical part of the course, students then pick a recently published article in their field and try to replicate the results. Replication involves downloading the original paper, finding the data and possibly software code, corresponding with the author, and finally publishing their replication study in the workshop’s data archive to make it available to the wider community.
This is when it hurts.
Understand what irreproducibility means
By trying to replicate existing work, students learn first-hand what irreproducibility really means. Students were confronted with the following challenges: (1) data were nowhere to be found, (2) the authors did not respond to requests for data, (3) the authors did not remember where they had stored their files, (4) methods were not clearly described, (5) it was not clear how raw data had been transformed, and (6) statistical models remained opaque.
This irreproducibility across all fields led to extreme frustration among students – and it demonstrated the consequences of a lack of transparency. Even the experienced Teaching Assistants were surprised at the challenges students had to face.
This is not to say that the Replication Workshop is an exercise in frustration.
Feedback: Cannot wait to apply my knowledge
In their feedback, many students reported that they had learned much more about statistical methods than in any standard statistics course. They also gained experience of how authors make decisions about the analysis that never make it into the polished versions of published work. One student wrote that the course “taught me so much about how to publish legitimate and correct research. I cannot wait to apply my knowledge from this course to other projects.”
One student will present his results at the International Studies Association Annual Convention this year, while others plan to embed their experience as a pilot study in their PhD. Several students are hoping to publish their replication study as the first article in their academic career.
With the Replication Workshop, Cambridge is one of the first universities to combine practical replication with learning about reproducibility standards for graduate students. Only when more universities nurture a culture of reproducibility and replication in their teaching can we ensure that the gold standard of reliable, credible and valid results is upheld.
The article was originally published (in a slightly different version) on the University of Cambridge website.