Sharing and Archiving Data

Motivations For Sharing Data

There are many reasons to share your research data publicly.

  1. To make it possible to fully reproduce a scientific study.

  2. To prevent duplicated effort and speed up scientific progress. Large amounts of research funding, and researchers' careers, can be wasted when only a small part of the research is shared in the form of publications.

  3. To facilitate collaboration and increase the impact and quality of scientific research.

  4. To make results of research openly available as a public good, since research is often publicly funded.

You can read more about why data should be available, and why some data should remain closed, in the Open Data section.

Steps To Share Your Data

Step 1: Select what data you want to share

Not all data can be made openly available, because of ethical and commercial concerns (see the Open Data section), and you may decide that some of your intermediate data is too large to share. As such, you first need to decide which data you need to share for others to be able to reproduce your research.
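When deciding what to share, it can help to see how large each file actually is. The following is a minimal sketch in Python, assuming all candidate data sits under one directory. The function name `inventory` and the default 100 MB threshold are illustrative assumptions, not limits set by any particular repository.

```python
from pathlib import Path


def inventory(data_dir, max_bytes=100 * 1024 * 1024):
    """Split files under data_dir into (shareable, too_large) lists.

    Each entry is a (relative path, size in bytes) tuple. max_bytes is an
    illustrative threshold; choose one that matches the limits of the
    repository you intend to use.
    """
    root = Path(data_dir)
    shareable, too_large = [], []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue  # skip directories
        size = path.stat().st_size
        entry = (path.relative_to(root).as_posix(), size)
        if size > max_bytes:
            too_large.append(entry)
        else:
            shareable.append(entry)
    return shareable, too_large
```

Size is only one criterion: files flagged as shareable here still need to be checked against the ethical and commercial concerns mentioned above.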

Step 2: Choose a data repository or other sharing platform

Data should be shared in a formal, open, and indexed data repository where possible so that it remains accessible in the long run. Suitable data repositories by subject, content type, or location can be found at Re3data.org and in FAIRsharing, where you can also see which standards (metadata and identifier) the repositories implement and which journals and publishers recommend them. If possible, use a repository that assigns a DOI (digital object identifier), which makes it easier for others to cite your data. Have a look at rr-credit to see how to share and cite your data and other outputs.

A few public data repositories are Zenodo, Figshare, 4TU.ResearchData, and Dryad.

Step 4: Upload your data and documentation

In line with the FAIR principles, upload the data in open formats as much as possible and include sufficient documentation and metadata so that someone else can understand your data. It is also essential to think about the file formats in which the information is provided. Data should be presented in structured and standardised formats to support interoperability, traceability, and effective reuse. In many cases, this will mean providing the data in multiple standardised formats, so that it can be both processed by computers and used by people.
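As an illustration of pairing an open format with documentation, the sketch below writes tabular data to a CSV file and a small JSON metadata file that describes each column. It is a minimal example under stated assumptions: the function name `write_dataset`, the file names `data.csv` and `metadata.json`, and the metadata fields are all hypothetical, and a real repository will usually ask for metadata in its own schema.

```python
import csv
import json
from pathlib import Path


def write_dataset(out_dir, rows, columns, title):
    """Write rows to data.csv plus a metadata.json describing each column.

    columns maps each column name to a human-readable description
    (including units). File names and metadata fields are illustrative
    assumptions, not a formal metadata standard.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # CSV is an open, plain-text format readable by people and programs.
    data_path = out / "data.csv"
    with data_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(columns))
        writer.writeheader()
        writer.writerows(rows)

    # The JSON file documents what the data file contains.
    metadata = {
        "title": title,
        "files": [{"name": data_path.name, "format": "text/csv", "encoding": "utf-8"}],
        "columns": [{"name": n, "description": d} for n, d in columns.items()],
    }
    meta_path = out / "metadata.json"
    meta_path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return data_path, meta_path
```

A call such as `write_dataset("upload", [{"site": "A", "temp_c": 12.5}], {"site": "Sampling site code", "temp_c": "Air temperature in degrees Celsius"}, "Example measurements")` produces two files that can be uploaded together, so the column descriptions travel with the data.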

Additional resources on data sharing

See also the blog post ‘How can you make research data accessible?’, which describes five steps to make your data more accessible.

Data Availability Statement

Once you have made your data available, it is important to ensure that people can find it when they read the associated article. You should cite your dataset directly in the paper where it is relevant, include a citation in your reference list, and include a Data Availability Statement at the end of the paper (similar to the acknowledgements section). See below for some examples.

  • The data that support the findings of this study are openly available in [repository name] at http://doi.org/[doi].

  • The data that support the findings will be available in [repository name] at [URL / DOI] following a [6 month] embargo from the date of publication to allow for the commercialisation of research findings.

  • Restrictions apply to the data that support the findings of this study. [Explain nature of restrictions, for example, the data contains information that could compromise the privacy of research participants] Data are available upon reasonable request by contacting [name and contact details] and with permission of [third party name].

You can find more examples on the Data Access statements page from the University of Manchester, or in Nature’s Tips for writing a dazzling Data Availability Statement.
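If you prepare statements for several papers, the first example template above can be filled in programmatically. The sketch below is one possible approach, not a required format: the function name `data_availability_statement` is hypothetical, the DOI in the usage note is a placeholder rather than a real dataset, and the sketch writes the link with the `https://doi.org/` form of the resolver.

```python
def data_availability_statement(repository, doi):
    """Fill in the 'openly available' template for a dataset DOI.

    Accepts a bare DOI or one prefixed with "doi:" or a doi.org URL.
    Only checks that the identifier starts with "10."; it does not
    verify that the DOI actually resolves.
    """
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return (
        "The data that support the findings of this study are openly "
        f"available in {repository} at https://doi.org/{doi}."
    )
```

For example, `data_availability_statement("Zenodo", "doi:10.1234/example-dataset")` returns the first template with the repository name and the placeholder DOI filled in. Statements involving embargoes or access restrictions need wording specific to your situation and are better written by hand.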