
A repository is a place where digital objects can be stored and shared with others (see also this repository definition).

Data repositories provide access to academic outputs that are reliably accessible to any web user (see the OpenDOAR inclusion criteria). Repositories must earn the trust of the communities they intend to serve and demonstrate that they are reliable and capable of appropriately managing the data they hold (Lin et al., 2020).

A tree representing a general data repository, with squirrels symbolizing researchers gathering FAIR data, which can be open or restricted. Next to the tree are examples showing how different academic disciplines and institutions have unique types of data repositories, and how FAIR data may differ when obtained from general or domain-specific repositories.

Figure 1: The Turing Way project illustration by Scriberia. Used under a CC-BY 4.0 licence. DOI: The Turing Way Community & Scriberia (2024).

Long-term archiving repositories are designed for secure and permanent storage of data, ensuring data preservation over extended periods. This differs from platforms like GitHub and GitLab, which primarily serve as collaborative development tools, facilitating version control and project management in a more dynamic and transient environment. Platforms such as GitHub and GitLab do not assign persistent identifiers to repositories, and their preservation policies are more flexible than those of data repositories.
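To illustrate what a persistent identifier offers in practice, here is a minimal sketch that turns a DOI into a resolvable link through the doi.org resolver. The function name `doi_to_url` is hypothetical, and the pattern is only a loose sanity check, not a full validation of DOI syntax. The two example DOIs are the ones listed in the references of this chapter.

```python
import re
from urllib.parse import quote

# Loose sanity check: "10.", a registrant code of 4-9 digits, "/", then a suffix.
# This is an assumption for illustration, not the full DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return a https://doi.org/ link for a DOI string (hypothetical helper)."""
    cleaned = doi.strip()
    # Accept DOIs written with a "doi:" prefix or as a full resolver URL.
    cleaned = re.sub(
        r"^(https?://(dx\.)?doi\.org/|doi:)", "", cleaned, flags=re.IGNORECASE
    )
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"Not a DOI-like string: {doi!r}")
    # DOIs are case-insensitive, so lower-casing gives one canonical form.
    return "https://doi.org/" + quote(cleaned.lower(), safe="/")


if __name__ == "__main__":
    for doi in ["10.1038/s41597-020-0486-7", "10.5281/ZENODO.13882307"]:
        print(doi_to_url(doi))
```

Because the link points at the resolver and not at a specific hosting location, it keeps working as long as the repository maintains the DOI record, even if the underlying files move.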

This chapter includes:

Repositories and FAIR

Selecting an appropriate repository for your research outputs has many benefits:

Why not the supplemental materials?

Supplemental materials do not follow the FAIR principles: no separate DOI is assigned to them, which makes them difficult to retrieve. Besides this lack of alignment with the FAIR principles, there are other reasons why a data repository is a better solution:

Selecting an appropriate repository

This chapter outlines some of the crucial functionalities that you should look out for when picking where to share your data, code, methods, hardware, slides, or any other Research Object.

Data should be submitted to a domain- or discipline-specific, community-recognised repository where possible. A general-purpose repository can be used when there are no suitable discipline-specific repositories. Discipline-specific data repositories are likely to have more functionalities for the type of data that you would like to share, as well as community standards that you can adhere to in order to make the data more FAIR (Findable, Accessible, Interoperable and Reusable). Why sharing data is a good idea is covered in Motivations for sharing and archiving data and Open Data.
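The preference order described above can be sketched as a small decision rule. This is only an illustration: the `Repository` fields, the `pick_repository` function, and the example repository names are all hypothetical, and treating "community recognised" as a simple yes/no flag is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Repository:
    # All fields are hypothetical; fill them in from your own search results.
    name: str
    disciplines: tuple[str, ...] = ()   # empty tuple means general purpose
    community_recognised: bool = False


def pick_repository(
    candidates: list[Repository], discipline: str
) -> Optional[Repository]:
    """Prefer a community-recognised, discipline-specific repository;
    otherwise fall back to a general-purpose one (or None if there is none)."""
    for repo in candidates:
        if discipline in repo.disciplines and repo.community_recognised:
            return repo
    for repo in candidates:
        if not repo.disciplines:
            return repo
    return None


if __name__ == "__main__":
    candidates = [
        Repository("Example General Repository"),
        Repository("Example Seismology Archive", ("seismology",), True),
    ]
    print(pick_repository(candidates, "seismology").name)
    print(pick_repository(candidates, "linguistics").name)
```

In practice the choice involves more factors than this rule captures, so the sketch is best read as a first filter before a closer evaluation.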

The choice of repository can depend on multiple factors:

You can search for relevant repositories on re3data and FAIRsharing. However, a search will likely result in a long list of repositories, which you will need to narrow down. The following questions may help you with that:

See the ARDC’s Guide to choosing a data repository or the DCC checklist for evaluating data repositories for more information.

Types of repositories

If your discipline does not have a discipline-specific repository, you can make use of several general repositories. Below is a (non-exhaustive) list of these different types of repositories:

General purpose repositories

Project repositories

Generic data repositories

Institutional or National repositories

Many countries and/or institutions also provide access to repositories that you could use. Check with your local Research Data Management support to see if this is available at your institute, or try to search for such a national repository using re3data and FAIRsharing.

Several lists of Recommended Repositories by publishers exist:

Example: Open Science Framework (OSF)

The OSF is a free, open-source software project that facilitates open collaboration in scientific research. OSF is much more than a data repository or an archive: it is a collaboration tool that research teams can use to work on projects privately or openly, similar to GitHub. This case study highlights OSF as one of the general repositories where you can store any research output long term and make it citable by obtaining a persistent identifier. An example of what you could share on the OSF is a research compendium. Get started with the OSF by using the introduction on their website.

OSF access management

OSF helps you control the level of access you give to different people. This is done through the OSF folder structure, which allows you to assign different privacy settings to different folders within one project. In OSF terminology, these folders with custom privacy settings are called components. OSF has servers in Europe, which allows compliance with the GDPR.

OSF and FAIR principles

The following functionality of OSF helps to make such a folder FAIR (Findable, Accessible, Interoperable and Reusable):

Additional OSF resources

References
  1. Lin, D., Crabtree, J., Dillo, I., Downs, R. R., Edmunds, R., Giaretta, D., De Giusti, M., L’Hours, H., Hugo, W., Jenkyns, R., Khodiya, V., Martone, M. E., Mokrane, M., Navale, V., Petters, J., Sierman, B., Sokolova, D. V., Stockhause, M., & Westbrook, J. (2020). The TRUST Principles for digital repositories. Scientific Data. https://doi.org/10.1038/s41597-020-0486-7
  2. The Turing Way Community, & Scriberia. (2024). Illustrations from The Turing Way: Shared under CC-BY 4.0 for reuse. Zenodo. https://doi.org/10.5281/ZENODO.13882307