Many journals and funders have policies requiring research transparency before an article is accepted or a project is supported. At the same time, much of the work in the social sciences relies on sensitive data in surveys or interviews that could endanger privacy or the well-being of human subjects. How can scholars working with sensitive data ensure a degree of transparency that still protects privacy?
Let’s do a quick test. Answer the following three questions with yes or no.
- Most qualitative research data cannot be shared due to ethical concerns. Do you agree?
- Sensitive data cannot be shared. Do you agree?
- Personal data is sensitive. Do you agree?
The answer to all three should be ‘no’.
Why are authors not sharing their data?
There is a range of reasons why researchers withhold their data. In very exceptional cases, it might be because the data are fabricated (LaCour and Green 2014). Much more commonly, it’s because preparing data for access costs time and money (Tenopir et al. 2011). Some researchers would rather keep the data for their own future publications, as some studies have shown (Savage and Vickers 2009; Campbell 2002, 2003). A smaller group also fears scrutiny and a damaged reputation if a replication of their work fails, as Lupia and Elman (2014) and Carsey (2014) have pointed out.
A final and very important reason for withholding data is ethics: the data are private and sensitive (see Savage and Vickers 2009). In the past, this was something of a killer argument: as soon as data were ‘confidential’ or ‘sensitive’, researchers often felt that data sharing guidelines did not apply to them.
Simply stating “the data is sensitive” is not enough anymore
However, withholding your sensitive data can be a problem. The National Science Foundation requires that all proposals include a data management and sharing plan, without which an application cannot be submitted. The Research Councils UK state that publicly funded research data are a public good, produced in the public interest. They must be made openly available for re-use, and this goes for all types of research data.
A top journal, the American Journal of Political Science, also makes clear that its replication policy covers all types of research. Authors are not permitted to “embargo,” or withhold, information that has been used to perform an analysis. They must provide all information that is required to reproduce and evaluate any analytic result in quantitative analyses. The rule also applies to those working with interviews, surveys, field notes and other methods. Any central inferential and interpretive claim in qualitative analyses must be supported by evidence that is made accessible.
Since the LaCour scandal, simply stating “the data is sensitive” is not enough anymore (LaCour withheld his data, claiming they were confidential; it later turned out they were fabricated). The expectation is that you either anonymise or otherwise adjust the data to protect human subjects. You should also try to obtain consent for sharing the data once anonymised. Or, at the very least, you should justify in detail which parts of the data are not published and why.
The transparency trend makes many (qualitative) researchers nervous
Many researchers see deep dilemmas in balancing principles of research transparency against legal and ethical obligations. For example, studies of political violence, inquiry in authoritarian contexts, and research involving vulnerable populations may put human subjects at risk if identities are revealed. A real concern is: will scholars conducting research on sensitive topics be able to publish their work and get funded?
I’ve encountered this anxiety at various conference panels, in online discussions, and in my workshops on transparency. I learned two points from these discussions:
- There is sometimes a distorted idea about ownership of data. Why should journal editors or funders have authority over sharing data that took years to collect? For example, a researcher said that she missed out on spending time with her kids to conduct interviews in her fieldwork. These are ‘her’ data now.
- Training and guidelines at most universities are still inadequate. Researchers shy away from sharing sensitive data because they don’t know how to do it safely.
So there is clearly much room for training and clarification to address these issues, such as that currently provided by the UK Data Service.
What are some simple guidelines on how to share sensitive data?
Part II of this post will appear on this blog soon.