Share and Preserve Research Data
Understand the benefits and constraints of sharing research data and contributing to the research community. Find information about issues relating to the long-term preservation, licensing and sharing of your research data with other researchers.
Data sharing
Research data and materials can remain a valuable resource after the conclusion of the research project for which they were collected and used. Sharing the research outputs enables future researchers to open up new lines of enquiry without the duplication of effort involved in collecting the data again. Increasingly, funding bodies and journals are requiring the sharing of the data associated with the research that they fund or publish. This unit explores the reasons for and against data sharing, and its benefits.
The decision to share your data will require consideration of a number of issues relating to:
- Discovery
- Access
- Future use
- Ongoing management beyond the life of the project and/or your involvement in the project.
Consider what you need before making your data available to other researchers. This includes some of the requirements for the dataset's retention and long-term preservation, accessibility and options for licensing. In many cases, these requirements need to be considered early in the research process and during the data management planning stage. You may not yet have substantive research data to share, but you may be a user of someone else's data. Understanding the issues involved in data sharing will make you a more knowledgeable consumer of secondary sources of data.
Benefits and barriers to data sharing
Researchers devote a large amount of physical and intellectual effort to collecting, managing, collating, and analysing their data before publishing their results. Many of these datasets have significant value beyond their original use, so sharing the data with others can be beneficial. Throughout your studies it is likely that you will benefit from the use of openly available secondary data.
Benefits
Some benefits associated with data sharing include:
- Promoting open scientific enquiry.
- Increasing exposure of work.
- Higher citation rates.
- Maximising taxpayer/public benefit from research.
- Ensuring transparency and accountability.
Barriers
There are also a number of perceived barriers to data sharing:
- Being “scooped” by a competitor.
- Sensitivity of data.
- Lack of recognition for making data openly available.
- Contractual or commercial interests associated with the data.
- The time and effort needed to make data shareable, often without the resources and funding to complete the process.
- Restrictions due to ethical concerns or lack of participant consent.
FAIR Data
The FAIR principles offer guidance on good practice steps that can be taken to make it easier for data to be cited and for research to be verified and reproduced, even if the data cannot be shared openly. The FAIR principles were introduced in an article in Scientific Data in 2016. FAIR is an acronym that stands for:
- Findable – Data are findable if they are well described, indexed, and assigned a unique identifier such as a DOI. This can be achieved by depositing data in a data repository.
- Accessible – Making your data accessible may mean sharing it openly. However, if your data is sensitive, making your data accessible may instead mean publicising the ethical protocols that must be followed to access the data. This is sometimes referred to as the principle of being “as open as possible, as closed as necessary”.
- Interoperable – Interoperable data uses commonly accepted formats, languages, codes, and vocabularies.
- Reusable – Data that is richly described and has documented provenance information can more easily be reused, either by you in the future, or by other people. If you do intend for your data to be used by others, then you should also consider applying a licence to it.
The Australian Research Data Commons (ARDC) provides further information about the FAIR data principles, including a FAIR data self-assessment tool.
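As a rough illustration of the four principles, the sketch below checks a dataset description for the kind of information each principle calls for. The field names (`identifier`, `access_conditions`, `file_format`, `licence`, `provenance`) and the example values are assumptions chosen for this example. They are not an official FAIR metadata schema, and the function is not the ARDC self-assessment tool.

```python
# Hypothetical metadata fields, grouped by the FAIR principle they support.
FAIR_FIELDS = {
    "Findable": ["identifier", "title", "description"],
    "Accessible": ["access_conditions"],
    "Interoperable": ["file_format"],
    "Reusable": ["licence", "provenance"],
}


def missing_fair_fields(record):
    """Return {principle: [field names]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with made-up values; the DOI is a placeholder, not a real identifier.
example_record = {
    "identifier": "doi:10.0000/example",
    "title": "Interview transcripts, 2023",
    "description": "De-identified transcripts from 20 interviews.",
    "access_conditions": "Mediated access; ethics approval required.",
    "file_format": "text/plain",
    "licence": "",  # no licence chosen yet
    "provenance": "Transcribed from audio recordings by the research team.",
}

print(missing_fair_fields(example_record))  # {'Reusable': ['licence']}
```

A check like this only confirms that a description is complete. Whether the data is genuinely findable or reusable still depends on where it is deposited and how well it is documented.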
Research funder guidelines for data sharing
ARC and NHMRC guidelines
Many research funders view research data as a public good, which should be openly available to the academic community and preserved for future re-use.
The Australian Research Council (ARC)
The ARC's Open Access Policy for ARC-funded research requires that:
“any publications arising from an ARC supported research project must be deposited into an open access institutional repository within a twelve (12) month period from the date of publication.”
In cases where it is not appropriate for publications to become public, the ARC Open Access policy states that:
“In cases where researchers may not be able to meet the requirements because of current legal or contractual obligations, Final Reports must provide reasons why publications derived from a Project, Award, or Fellowship have not been deposited in an open access institutional repository within the twelve month period.”
See the ARC data management policy for more information.
The National Health and Medical Research Council (NHMRC)
Similarly, the NHMRC and MRFF Open Science policy on the Dissemination of Research Findings states that:
"High-quality research that is rigorous, transparent and reproducible maximises the opportunity for improving health and wellbeing.
Open science supports and improves transparency, reproducibility and replicability of research, aiming to reduce waste, minimise unnecessary research duplication and maximise the benefits and value arising from research funding. Open science helps to increase the dissemination and improve accessibility of research, increase reuse of research data, improve research integrity and contribute to a stronger knowledge economy."
Although the issues and challenges of open access to publications are considerable, they are by no means insurmountable. In fact, it is becoming more common for research data and materials, in addition to publications, to be made available for re-use via open access repositories and archives.
Developing a data management plan at the beginning of your project will help you identify the known characteristics of your physical and digital research data and materials, and make informed decisions regarding what you will do with them during and after your research project.
When and where it is possible or permissible to share your research data, there are significant benefits to be gained.
Read about the benefits of working with open data on the University's Open Scholarship website.
Data use controls
In module three we looked at how ethics, confidentiality and intellectual property rights (IPR) impact on research data management. Before considering making your research data and materials available for re-use, it is critical that ethical, legal and data ownership issues are identified and resolved. Check with your supervisor(s) and Faculty Research Office about your options. Record the outcomes of these discussions in your DMP so that data ownership, ethical and access issues are documented. You may also find it useful to look at advice from the Research Industry and Commercial Engagement team regarding IPR matters. Access the IP and Licensing team via their website. Appropriate licensing of research data will protect the IPR and copyright of all interested parties while also delivering the benefits of access to research data where this is possible.
It is important to know what you are able to do with secondary data used in your research but also what other researchers may do with the data generated from your research. If you are re-using the research data of other researchers, then you will need to look at the agreement(s) governing the use of that data and any derivative that you create from your research. If working with sensitive data, review the Australian Research Data Commons ‘Publishing Sensitive Data Guide’. The guide will help you make clear decisions when sharing your data and making it available for reuse.
Once the IPR, confidentiality and other legal and ethical issues have been articulated you will be ready to seriously consider your options for managing access and re-use of your research data after your research project is completed.
You may feel more comfortable about releasing your research data if this release is carefully managed. It is not uncommon to place some restriction on access to and re-use of research data after publication. For instance, assigning a licence to research data helps to protect the IP in that data and increases the opportunity to benefit directly from its use through attribution and citation while also tracking its ongoing impact over time.
Managing Access
Processes can be put in place to manage access to research data and outputs:
- Embargoes are generally set for a specified period of time, which gives the data creator the right of first use. Typical periods range from six months to three years.
- Data use agreements are between the data producer and the secondary data user and may be simple or quite complex. They outline who may use the data, how the data may be used and may include rules about re-use, storage, dissemination and disposal.
- Creative Commons (CC) licences are some of the most commonly used licences, based on international legal standards but with some variations across jurisdictions. For example, a CC-BY licence requires that users attribute the work to the creator or author and give credit by citing the creator. Creative Commons Australia provides useful information on the different forms of licensing available, and ARDC provides a data supplier's flowchart to help you choose the right licence. For more information, see the page on selecting a licence for your work on the Library's Copyright website or contact the Copyright team.
- Open-source licences are designed for use with software and code. Using Creative Commons licences for software is not recommended. However, there are many open-source licences to choose from.
- Bespoke licences are developed for a specific dataset or organisation. For example, individuals or organisations wishing to gain access to the HILDA social survey data require such a licence to do so.
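To show how an embargo period translates into a release date, here is a minimal sketch using only the Python standard library. It assumes the embargo is counted in calendar months from a start date (for example the publication or deposit date). Your repository or publisher may count it differently, so treat the function names and the clamping rule as illustrative assumptions.

```python
import calendar
from datetime import date


def embargo_end(start_date, months):
    """Return the date `months` calendar months after `start_date`.

    If the target month is shorter than the start month, the day is
    clamped to the last day of the target month (an assumed convention).
    """
    if months < 0:
        raise ValueError("months must be non-negative")
    month_index = start_date.month - 1 + months
    year = start_date.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start_date.day, last_day))


def is_under_embargo(start_date, months, today):
    """True while `today` falls before the end of the embargo."""
    return today < embargo_end(start_date, months)


published = date(2024, 8, 31)
print(embargo_end(published, 6))   # 2025-02-28 (six-month embargo)
print(embargo_end(published, 36))  # 2027-08-31 (three-year embargo)
```

Recording the calculated end date alongside the dataset makes it clear to everyone involved when the data is expected to become available.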
Data retention and disposal
Under the University’s Research Data Management Policy (MPF1242), researchers must ensure that research data and records are retained and disposed of in line with the Records Management Policy (MPF1106), with minimum retention periods specified in the University Records Retention and Disposal Authority (RDA).
Researchers should refer to the ‘RESEARCH’ function in the RDA for a range of applicable classes and required retention periods.
Most research will align to temporary value classes in the RDA, meaning it can be disposed of after a specified period of time. Only a very small amount of research data is considered Permanent value, and should be retained and preserved forever.
If you are unsure of which RDA classes to apply to your research, contact Records & Information for guidance.
Digital preservation
What is digital preservation?
Digital preservation describes standards, best practices and technologies used to ensure long-term access to digital information via supporting:
- Persistence of authenticity and integrity of data over time
- Error-free persistent storage of the data
- Active means for retrieval and interpretation of data files
- Readily available data when required
- Ability to display and manipulate data when required.
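One common way to monitor the integrity of stored files over time is to record a checksum for each file and compare it again later, a practice often called fixity checking. The sketch below uses Python's standard `hashlib` module with SHA-256. The function names and the in-memory manifest are assumptions for illustration and do not describe how any University service works.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to `folder`) to its checksum."""
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are now missing or have different content."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )


# Demonstration in a temporary folder with a made-up data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "results.csv"
    data_file.write_text("id,value\n1,42\n")
    manifest = build_manifest(tmp)          # record checksums at deposit time
    data_file.write_text("id,value\n1,43\n")  # simulate later corruption
    print(changed_files(tmp, manifest))     # ['results.csv']
```

In practice the manifest would be saved alongside the data (for example as a text file) so that the comparison can be repeated at regular intervals.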
Why preserve digital research data?
Some research data has enduring value. Research data classed as Permanent in the University Records Retention and Disposal Authority (RDA) must be retained by the University forever.
Over time, digital research data and materials are subject to a number of risks which may result in data loss. These risks are intrinsic to the digital platforms and formats that researchers use for their data. As the software, hardware, and operating systems used to create digital research data change over time, loss of ability to access the data is a real possibility.
Consider research data stored on floppy disks, or in proprietary formats from software used in the 1990s: much of this data now requires time-consuming and expensive data recovery techniques, applied by data forensics specialists. These risks can be managed with effective digital preservation strategies such as saving copies of data in open formats and depositing research data into University-managed long-term data repositories and archives.
Read about Digital Preservation at the University of Melbourne
Data repositories for retention and sharing
Researchers are encouraged to deposit research data and records into suitable University facilities at the conclusion of a research activity/project, to support the meeting of retention obligations and facilitate discovery and preservation.
While you may share your data informally by emailing it to requestors or posting it to a website, informal methods of sharing make it difficult for researchers to find your data and access it now and in the long term. The most common alternative is to deposit data into a repository.
Repositories for retaining data
Attica is the University’s long-term research data retention service. Inactive research data can be securely stored on this cloud-hosted platform to meet your regulatory or ethical compliance requirements around data retention.
Records Managed SharePoint (RMSP) is a version of MS SharePoint at the University, recommended for managing and retaining high value, high risk records. This platform is a good option for small-scale research projects with non-complex data sets, e.g. MS formats and PDFs.
Repositories for publishing and sharing data
The University has a data repository where you can publish your data, Melbourne Figshare. You can sign into Melbourne Figshare using your University credentials.
Melbourne Figshare is an institutional repository that enables academic staff and graduate research students at the University of Melbourne to store, manage, publish and share digital materials. It allows data, non-traditional research outputs (NTROs), and other supplementary materials to be easily cited and discovered. Materials uploaded to Melbourne Figshare are kept locally on University managed storage infrastructure.
Materials published in Melbourne Figshare can be shared at the appropriate level of openness:
- Public, for materials that can be shared openly. Digital files are shareable, downloadable and reusable.
- Temporarily embargoed, for materials that will be shared openly in the future, but cannot be shared openly at the present time.
- Permanently embargoed, for materials that cannot be shared publicly online. Information about the asset can be published either to facilitate access or so the output can be cited.
- Private link, for materials that are under a permanent or temporary embargo, or are unpublished but need to be shared privately with a supervisor, peer reviewer or collaborator.
The advantage of depositing your research data into a discipline-specific repository or database is that it will be highly visible and accessible to researchers in your field, significantly increasing the chances it will be discovered and re-used. Check with your supervisor or take a look through the registry of research data repositories at re3data.org to find trusted repositories in your field. Before you begin your project, it may be worth checking to see if the data you need has already been collected and is available to access from within a repository.
The major advantage of depositing your research data into a general repository is that it will become visible and accessible to researchers outside your specific field, widening the value of your research. The ARDC Research Data Australia service and the Australian Data Archive are two general alternatives which support Australian data sets, while Zenodo is an international general repository which supports many different types of research data.