Personal, sensitive & confidential data

FAIR for personal, sensitive and confidential data

The FAIR principles may not initially appear to lend themselves to personal, sensitive or confidential data, and sharing such data directly is generally not advised. There are nevertheless many ways to make such data, and/or the associated research, FAIR. This page details some of the points to consider and techniques that can be used.

For general information on making your data FAIR, see the following pages:

Planning your research

During your project

Archiving and sharing

What is personal, sensitive, or confidential data?

There are different types of data that may require extra steps when applying FAIR principles:

Personal data: names, identification numbers, location data or any data that could directly or indirectly identify a person.
Special category data: personal data that data protection legislation singles out for additional protection, such as data revealing ethnic origin, political opinions or religious beliefs, or genetic, biometric or health data.
Commercial-in-confidence data: including trade secrets, investigations, data protected by intellectual property rights.
Biological data: endangered plant or animal species, where their survival is dependent on the protection of their location.

The last three of these categories are often grouped together and referred to as 'sensitive data', although this is not a formal definition.

Policy compliance

Sensitive data can take a range of forms, and research that involves it must follow all policies and agreements that are in place at all times. It is still possible to apply FAIR principles to such data, and this is easier if they are considered early in the project and incorporated into your data management practice. Policies and agreements that you may have, and that should ideally align with each other, include:

University and external ethics

A research project's University ethics approval, and the broader University Ethics Policy, will specify some elements of what you can and cannot do with your data. However, you may need to consider any additional ethical applications that are required (e.g. from the Health Research Authority) and what they stipulate with regard to your data.

Informed Consent

A consent form that the participant has fully understood and signed can allow the processing, depositing and/or sharing of de-identified or anonymised personal data. The form should clearly distinguish between consent to participate in the research and consent for the uses to which the data will be put. If you plan to anonymise the data (see below) to remove any personal data before storage, make clear in the participant information sheet that, once this has been done, it will no longer be possible to identify and remove an individual's data.

Data Protection Impact Assessment (DPIA)

A DPIA may be required for research projects that may impact the privacy of individuals and/or involve the use of personal data - you can find more information on the governance and management page.

Upstream/Third-party Agreements

You should consider whether there are any agreements in place regarding any third-party data that you have used. You should not assume that you can share data, or derived data, simply because they are available online.

Data Sharing Agreement (DSA)

Where existing third-party data is being used, some data providers will require a signed DSA before allowing access to or transfer of the data. All DSAs should be approved by an authorised signatory in Research Services. Some DSAs place rigid requirements on the research team and on the processing and storage of the data, and these must be followed; they may also specify a destruction date. If you wish to share your data at the end of a project, it is important to discuss this with the third party and, where appropriate, to have it reflected in the DSA. Requests should be made through the MyResearch service in MUSE (under RS Agreements System). Guidance from the Contracts and Agreements team within Research Services can be sought using the contact details available here.

Privacy Notice

If you have collected personal data from people, either with or without their consent, you should publish information about this data in a privacy notice, typically before data collection begins. The Information Commissioner's Office pages describe the type of information it should include, and the notice should normally remain on display until the end of the project. A privacy notice template is also available here.

While these policies and agreements should be in place at the start of your project, and in most instances before any data is obtained, they will all have an impact on what you will be able to do with the data at the project's end, and may not permit the sharing of the raw data. Where this creates restrictions, there may be other things you can do to make your data more FAIR, as detailed below.

Can sensitive data be rendered suitable for sharing?

In some instances, sensitive data can be made suitable for sharing by transforming the raw data so that the sensitive content is removed or reduced. Approaches to consider are:

Anonymisation

The manipulation of any identifiable data items so as to irreversibly prevent any way of identifying individuals. More detailed guidance on how to anonymise your data can be found on the UK Data Service website and the Data Management Expert Guide.

This means that all personal and sensitive data are removed, and that an individual's identity cannot be retrieved or inferred from the data that remains, whether on its own or in combination with any other available data.

If you choose to anonymise your data, you should be careful to ensure that each participant or record is distinguishable. So, for example, if more than one row relates to a single participant, this should be easy to identify, perhaps using a unique identifier per person. However, any mapping file of participant to unique identifier should be destroyed to ensure that the data is fully anonymised, otherwise the data could still be linked back and identified (see below).
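The unique-identifier approach described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a complete anonymisation procedure: the function name anonymise_rows and the field names participant_name and participant_id are hypothetical, and the data is assumed to be a list of dictionaries. It only handles one direct identifier; indirect identifiers (such as location or rare attributes) would still need to be assessed separately.

```python
import secrets


def anonymise_rows(rows, id_field="participant_name"):
    """Replace a direct identifier with a random per-participant code.

    The identifier-to-code mapping exists only in memory while this
    function runs. It is never returned or written to disk, so the
    codes cannot later be linked back to the original identifiers.
    """
    mapping = {}   # hypothetical in-memory mapping, discarded on return
    used = set()   # codes already issued, to guarantee uniqueness
    anonymised = []
    for row in rows:
        person = row[id_field]
        if person not in mapping:
            code = "P-" + secrets.token_hex(4)
            while code in used:  # regenerate on a rare collision
                code = "P-" + secrets.token_hex(4)
            used.add(code)
            mapping[person] = code
        # Copy every field except the direct identifier.
        cleaned = {key: value for key, value in row.items() if key != id_field}
        cleaned["participant_id"] = mapping[person]
        anonymised.append(cleaned)
    return anonymised
```

Because the codes are random rather than derived from the names, rows belonging to the same participant remain linked to each other, but nothing in the output can be used to recompute who they belong to.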

‘Anonymisation of personal research data is the only effective solution to comply with both data protection legislation and the requirements of the Open Research Data Pilot.’ (OpenAIRE Horizon 2020 Fact Sheets)

Pseudonymisation (De-identification)

The replacement of personal identifiers with unique codes so that the data can no longer be attributed to an individual without the use of additional information, such as a mapping file that links each unique code to the identifiable data it replaced. Because re-identification remains possible with that additional information, data that has been pseudonymised is still considered personal data.

If you are pseudonymising data, do consider factors such as platform usernames and IP addresses, and possible reidentification within a participant's institution - for example, by quoting an expression frequently used by an individual in speech and/or writing.
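The role of the mapping file in pseudonymisation can be illustrated with a short Python sketch. The function names pseudonymise_rows and reidentify, and the field names participant_name and participant_code, are hypothetical, and the data is again assumed to be a list of dictionaries. In practice the mapping would need to be stored separately from the pseudonymised data and protected as personal data; how it is stored is outside the scope of this sketch.

```python
import secrets


def pseudonymise_rows(rows, id_field="participant_name"):
    """Swap a direct identifier for a random code and return the mapping."""
    mapping = {}  # identifier -> code; the content of the "mapping file"
    pseudonymised = []
    for row in rows:
        person = row[id_field]
        if person not in mapping:
            code = "P-" + secrets.token_hex(4)
            while code in mapping.values():  # regenerate on a rare collision
                code = "P-" + secrets.token_hex(4)
            mapping[person] = code
        # Copy every field except the direct identifier.
        cleaned = {key: value for key, value in row.items() if key != id_field}
        cleaned["participant_code"] = mapping[person]
        pseudonymised.append(cleaned)
    return pseudonymised, mapping


def reidentify(code, mapping):
    """Look up the original identifier for a code, given the mapping."""
    reverse = {c: person for person, c in mapping.items()}
    return reverse.get(code)  # None if the code is not in the mapping
```

The reidentify function shows why pseudonymised data is still personal data: anyone holding both the dataset and the mapping can recover the original identifiers.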

Other options

Regardless of the above measures, it may still not be possible to share your data - for example, if participants do not give consent, or if anonymising the dataset to enable sharing would significantly limit its usefulness to others. If this is the case, there are alternative things you can do to make the dataset more FAIR:

Alternatives to open data

The CARE principles

When dealing with sensitive data, you should also be aware of the CARE principles. Created to give Indigenous groups greater control over data about them, they can also be considered and applied in relation to all sensitive data. See the Ethnographic data page for more details:

Ethnographic data
