
Sensitive data is information that must be protected against unauthorised access. This could be data that personally identifies someone, information that we would not want to openly share with others, or data that might bring harm to other people or organisms. Protection of this data may therefore be required for legal or ethical reasons.

When assessing how sensitive data is, different organizations may follow (or be required to follow) a particular framework that defines levels of sensitivity for data and information. In general, sensitivity levels are broadly classified into three categories.

According to the information guidelines provided by Imperva, data can be classified by content, by context, or by user.
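As an illustration of the content-based approach, the sketch below scans a piece of text for patterns that often indicate personal data. It is a minimal example under stated assumptions, not Imperva's method or a complete classifier: the pattern names, the regular expressions, the function names, and the two labels `"high"` and `"low"` are all hypothetical choices made for this example. Real rules would come from your organization's own classification policy.

```python
import re

# Hypothetical patterns, chosen for illustration only. They will miss many
# kinds of personal data and may also produce false positives.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iso_date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def flag_sensitive_content(text):
    """Return the sorted names of the patterns that occur in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def classify_by_content(text):
    """Label the text 'high' if any pattern matches, otherwise 'low'.

    The two labels are assumptions for this sketch; an organization's
    framework defines the actual levels and their meaning.
    """
    return "high" if flag_sensitive_content(text) else "low"


if __name__ == "__main__":
    record = "Participant born 1990-01-31, contact jane@example.org"
    print(flag_sensitive_content(record))
    print(classify_by_content(record))
```

A check like this can only support a human decision. Context-based and user-based classification rely on information that is not in the text itself, such as where the data came from or who labelled it.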

In research, the most common reasons for data to fall into the high sensitivity category are:

Organizations usually share guidance and links on storing, controlling access to, encrypting, anonymising, transferring, and disposing of sensitive data. Adhering to these safety protocols is essential for ensuring data safety and security. You can learn more about this in the section on Data Privacy Strategy.

Reproducible research is often thought to require open data or open workflows. However, it is possible to work reproducibly within sensitive data projects, both to demonstrate research quality and to create a transparent research record that enables reproducibility. There are a number of ways to do this that still fulfill the legal and ethical requirements of working with sensitive data, such as using advanced version control tools within private repositories or data safe havens throughout the project, and considering carefully how the project can be published so that it protects the privacy needs of the data while still demonstrating research quality. These are versions of inner source working, in which open collaboration and open communication practices are used to form and record the research lifecycle, creating a transparent record. Each research project therefore needs to find the solution that best fits its needs.

The following sub-chapters give an overview of the types of sensitive data and metadata that need to be identified in datasets and therefore handled securely in research projects.

Here is a short video introducing sensitive data by ELIXIR-UK.