When the editors of a range of top political science journals signed a statement in 2014 committing to enforce transparency (the JETS statement), there was an immediate backlash from qualitative researchers. Hundreds of scholars signed a petition against the strict transparency rules, asking for clarification. Then the LaCour scandal happened: a political scientist fabricated a study and pretended to withhold his data for reasons of confidentiality. Another wake-up call. Where is the debate in political science now?
In 1995, Harvard professor Gary King proposed replication and research transparency as the gold standard for science. There was interest, but general uptake was slow. For example, it took eight years until four leading international relations journals pledged to adhere to transparency standards and to ask authors for data and code (2003). After further discussion and debate, it took nearly another ten years for the American Political Science Association to adopt the Data Access and Research Transparency (DA-RT) guidelines in its Ethics Guide (2012). Then everything moved quickly. In 2014, the editors of 27 top journals adopted the DA-RT guidelines in a public Journal Editors’ Transparency Statement (JETS). A backlash followed in 2015: hundreds of researchers signed a petition to delay implementation, fearing that the rules would put qualitative researchers at a disadvantage in the peer-review process. Around the same time, a graduate student published a study in Science that was based on fabricated data; when this was uncovered, the discipline had its own scandal (after Reinhart-Rogoff in economics, or the Duke scandal in cancer research). Now the debate exploded, inspiring renewed discussions about political science teaching, qualitative applications of transparency, replication failures and pre-registration.
Changing Beliefs and Routines
Elman, Kapiszewski and Lupia (2018) recently reviewed this debate, arguing that transparency remains controversial because traditional beliefs, routines and norms about research practice are difficult to change. Other challenges include the concerns of some qualitative researchers; the general costs of transparency; and the risk that journal editors misjudge the boundary between shareable and unshareable data. The authors remain positive and applaud the discipline for having been “deeply engaged” in dialogue. While there are many practical solutions to the concerns raised, a key improvement would be that “honest mistakes need to be normalised.”
The last part, in my opinion, speaks to a major challenge – the fear of researchers that sharing their data may lead to the detection of errors and public shaming. I think that if we could adopt a professional and objective approach to failed replications, and discuss exactly why replications fail (often just a minor issue), maybe it wouldn’t be such a taboo.
Laitin and Reich (2017) also look at the discipline’s culture and come to similar conclusions. While uncovering scandals is necessary, we do not want to start ringing alarm bells about each and every ‘failed’ replication. That could easily provide ammunition for scholars to use against other scholars in disciplinary debates. Rather, journals need to accept (thoughtful and reflective) replication studies for publication, to normalise the practice of replication. To prevent the publication of errors in the first place, peer reviewers should be allowed to see the raw data and code, and journals should run that code to check that it works. Finally, to avoid ‘policing’ strategies that kick in only after things have gone wrong, the authors emphasise that young researchers need to be educated about transparency. For example, political science departments should require courses in research ethics, and dissertation advisers should help students implement a transparent workflow.
The idea that young researchers should learn transparency tools in the classroom is also something I have written about (Janz 2016). I believe that universities should introduce replications as class assignments in methods training, or invest in new stand-alone replication workshops, to establish a culture of replication and reproducibility. It is also very important that students learn to express criticism of original studies carefully and objectively, in professional language. A replication attempt can fail at different stages, and there is, of course, not always misconduct or sloppiness behind it. Different results do not necessarily mean that the original article was faulty, and so it is all the more important to make sure that the replicator fully understands the methods and variables of the original study.
My piece was written a few years ago. Today, following Elman, Kapiszewski and Lupia (2018), I would stress that it is crucial for students to understand that honest mistakes are OK, while p-hacking and data fabrication are misconduct. I would also like to instil in students that doing replications is simply a way of doing science – rather than an error-hunting exercise.
Druckman, Howat and Mullinix (2018) pick this up and discuss how we can improve the quality of graduate advising. They focus on group settings, especially experimental research groups that run surveys and lab experiments. In lab-style groups (admittedly, not many exist in political science yet), collective handling of data and sharing data among students for exploration and cross-checks can improve the overall quality of the work. The main goal is to create benefits for all lab members.
In my opinion, this can also be achieved in non-lab-style supervision and advising. For example, why not encourage students to work in teams, even if they are not in the same lab? A major problem in many UK political science departments, I think, is that there is hardly any space for students to meet each other – even a hot-desking room can help.
Tools, Workflows and Practical Guidance
In light of only slowly changing norms and routines, Key (2016) asks how scholars can be motivated to share their replication materials – looking at journal policies as sticks. Key shows that articles published in journals with a data upload requirement are 24 times more likely to have shared materials. Unfortunately, the author also finds that many URLs pointing to data on personal websites later break, so that an article gives an initial impression of transparency when in fact the data is lost.
Gertler and Bullock (2017) also look at URLs that supposedly lead to data and replication code. They find that more than one-fourth of links published in the APSR in 2013 were broken by the end of 2014. They strongly recommend the use of digital repositories and persistent identifiers.
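To make the recommendation concrete, here is a minimal sketch of the distinction the link-rot findings turn on. The helper function is my own illustration (not code from Gertler and Bullock): it uses the resolver host as a crude proxy for whether a link is a persistent identifier (DOI or Handle) rather than an ordinary personal-website URL that may rot.

```python
import re

# Hosts of common persistent-identifier resolvers (DOI and Handle systems).
PERSISTENT_HOSTS = ("doi.org", "dx.doi.org", "hdl.handle.net")

def is_persistent_identifier(url: str) -> bool:
    """Return True if the URL points at a DOI/Handle resolver - a rough
    proxy for the long-term availability of replication materials."""
    match = re.match(r"https?://(?:www\.)?([^/]+)/", url)
    if not match:
        return False
    return match.group(1).lower() in PERSISTENT_HOSTS

print(is_persistent_identifier("https://doi.org/10.7910/DVN/EXAMPLE"))          # True
print(is_persistent_identifier("http://www.myuniversity.edu/~author/data.zip"))  # False
```

A journal or replicator could run a check like this over the links in published articles; repository deposits with DOIs pass, personal-website links get flagged for follow-up.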
I must admit, I had thought that we had moved on from the debate about where data should be uploaded. Data repositories such as the Harvard Dataverse or the UK Data Service are obvious choices. The two articles above show that we are not quite there yet.
But what happens once the replication materials are uploaded (and are actually where they say they are)? Stockemer, Koehler and Lenz (2018) examine the replicability of studies published in three behavioural political science journals that had no binding data-access or replication policy at the time of the analysis (2015). They find that, in 25% of cases, replication was impossible due to poor organisation of the data and/or code. For the articles whose data were available (not many), the replication confirmed the results in roughly 70% of cases; in 5% of articles, the replication results were fundamentally different. The authors call for better data policies, better upload rates by authors, and higher-quality replication materials.
Alvarez, Key and Núñez (2018) echo this, insisting that any uploaded replication material has to be of good quality. For example, a file explaining the replication materials is crucial; code and scripts are often unclear; authors often fail to show which piece of code produces which column of a table; simulations and randomisation (e.g. in multiple imputation) are run without setting a seed; software package versions are not documented; and so on.
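Two of these practices are easy to show in a few lines. This is a minimal sketch of my own (not code from Alvarez, Key and Núñez): fix the random seed before any randomised step, and record the software environment alongside the results.

```python
import random
import sys

SEED = 20180101  # document the seed in the replication README as well

def run_simulation(n_draws: int) -> list[float]:
    """Toy stand-in for a randomised step such as multiple imputation."""
    random.seed(SEED)  # without this line, every run gives different numbers
    return [random.random() for _ in range(n_draws)]

# With the seed set, two runs produce identical draws, so a replicator
# can reproduce the reported numbers exactly.
first = run_simulation(5)
second = run_simulation(5)
assert first == second

# Record the environment that produced the results.
print("Python version:", sys.version.split()[0])
```

The same idea carries over to R or Stata (`set.seed()`, `set seed`); the point is that the seed and the versions are part of the replication materials, not an afterthought.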
Another interesting approach comes from Katsanidou, Horton and Jensen (2016), who assess neither how much data is uploaded nor how good the replication materials are. Rather, they follow the idea that a good methods section (within the paper itself) should provide transparency and clarity about what was done: referring to the data in the text, describing the sample, reporting question wording, and so on. In an ideal world, we would be able to replicate from the methods section alone.
Transparency in Qualitative Research
The debate about transparency in qualitative research is ongoing and feels, at the same time, somewhat stuck. The contributions from qualitative researchers make many excellent points. But there is also a tendency to highlight differences between research traditions rather than to look for commonalities and solutions that create more credibility for all political science work.
Schwartz-Shea and Yanow (2016) question the legitimacy of the American Political Science Association, which strikes the authors as an exclusive club coercing qualitative researchers to join the transparency hype, with bad consequences. DA-RT “seems, to us, to reinvent a wheel that — from the perspective of interpretive research, at least — was not broken. It requires a huge investment of time and energy that therefore seems unwarranted.”
Yom (2018) states that analytic transparency is easier to implement in the biomedical or natural sciences than in political research, because open-ended or inductive practices cannot be made transparent in the same way. Monroe (2018) repeats well-known issues: journal space constraints that don’t allow lengthy discussions of qualitative data; ethical concerns about human-subject protection; and the costs of data collection and the right of first usage.
Luckily, much has been done to address these problems: online supplements provide more space than articles, there is practical advice on anonymising data, and embargoes allow the original author a longer period of exclusive use. The Qualitative Data Repository and the UK Data Service are good resources for making transparency and credibility possible for qualitative work.
A helpful distinction is that data transparency differs from analytic transparency (how the evidence was analysed) and process transparency (how the data were collected). Qualitative work may not publish all field notes and interview transcripts, but it could do much better at describing data selection and – even more crucially – the selection of evidence used to support claims.
I should note that at conferences and methods festivals, I encounter a very open dialogue about qualitative and quantitative transparency. There are many attempts to show how qualitative data can be shared, and audiences have often been interested in innovative ideas and tools.
Zigerell (2017) argues that research design choices can bias estimates and inferences. Political science can reduce such bias when researchers commit to a research design before completing data collection and analysis. Kern and Gleditsch (2017) try to apply the idea of pre-registration and pre-analysis plans to qualitative inference. What does this mean? When using case studies, researchers could make clear whether they plan hypothesis-generating or hypothesis-testing work; when conducting interviews or focus groups, researchers can share questionnaires; when conducting ethnographic fieldwork, researchers can communicate their target populations and locations before the work is done. Kern and Gleditsch also provide a pre-registration template that I think will be very useful not only for qualitative research but also for graduate students and for teaching research design.
So, where is the debate in political science now? I feel that transparency has become more of an accepted norm – the discussion has now moved on to implementation, monitoring, tools, and innovative ideas to make it work. This goes for both qualitative and quantitative traditions, although many open questions remain for qualitative research. It seems most scholars would agree that education and training are key, maybe even more so than sticks and carrots, to implementing the new transparency norms in the future.