Selecting for Preservation
Why select and appraise
It is not possible to keep all digital data forever, yet there is an increasingly common view that “storage is cheap, so why don’t we just decide to keep everything”. While that may be technologically possible in theory, in practice there are four main objections to this view:
1. Digital content expands. Even if the unit cost of storage goes down, growth in the volume of data can cancel out the saving, or total costs could even go up.
2. Backup and mirroring increase costs. No digital preservation approach can survive without appropriate mirroring and backup systems.
3. Discovery gets harder. Keeping everything means increasing noise, requiring additional effort to ascertain which data is the intended target of a search.
4. Managing and preserving is expensive. Unnecessary preservation adds unnecessary costs.
Paraphrased from "How to Appraise & Select Research Data for Curation" DCC
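The first two objections can be illustrated with some simple arithmetic. The sketch below is only an illustration: the function name, the growth rate, the price decline, the number of copies and the starting figures are all assumptions chosen for the example, not figures from the DCC guide.

```python
def projected_annual_costs(initial_tb, growth_rate, cost_per_tb,
                           cost_decline, copies, years):
    """Return a list of projected storage costs, one per year.

    All parameters are illustrative assumptions:
      initial_tb   - data volume in the first year (terabytes)
      growth_rate  - yearly growth in data volume (0.40 = 40%)
      cost_per_tb  - first-year cost of storing one terabyte
      cost_decline - yearly fall in the cost per terabyte (0.20 = 20%)
      copies       - number of copies kept (original plus backups/mirrors)
    """
    costs = []
    volume = initial_tb
    unit_cost = cost_per_tb
    for _ in range(years):
        # Every copy (backup, mirror) has to be paid for as well.
        costs.append(volume * unit_cost * copies)
        volume *= 1 + growth_rate       # content expands
        unit_cost *= 1 - cost_decline   # storage gets cheaper
    return costs


# Hypothetical scenario: 10 TB growing 40% a year, prices falling 20% a year,
# three copies kept in total.
costs = projected_annual_costs(initial_tb=10, growth_rate=0.40,
                               cost_per_tb=100.0, cost_decline=0.20,
                               copies=3, years=5)
for year, cost in enumerate(costs, start=1):
    print(f"Year {year}: {cost:.2f}")
```

Under these assumed figures the total bill rises every year, because data volume grows faster than the unit price falls. With different assumptions the result could go the other way, which is why the rates matter more than the slogan.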
What to Select
Preservation, and what you plan to keep, therefore requires some thought. Start planning for the long-term preservation of your data from the outset of your project. Build preservation planning in early and adjust it to any research outcomes that emerge during the data collection and processing stages. DataONE.org suggests the following initial criteria for identifying datasets that should be considered for preservation:
- Only the datasets which have significant long-term value should be contributed to a repository.
- If data cannot be recreated or it is costly to reproduce, it should be saved.
- Four different categories of potential data to save are observational, experimental, simulation, and derived (or compiled).
- Your funder or institution may have requirements and policies governing contribution to repositories.
In your plan...
- List what data will be preserved (be cognisant of foreseeable research uses for the data).
- Outline how long the data will be retained and preserved.
- Specify what preservation file formats will be employed (see OpenAire "Data formats for Preservation").
- Account for additional costs associated with keeping the data.
- Explain your rationale for your chosen archive or repository (see "How to Select a Repository").
For an expanded explanation and more DCU Supports regarding Preservation and Sharing please see the DCU RDM Guide Preservation or Sharing Sections.
Sharing & Archiving Data
Reasons for Sharing Data
- Impact & longevity: Your data may be cited by others. Open publications and data receive more citations, over longer periods.
- Compliance: Funders, publishers and institutions may require that you share your data.
- Transparency & quality: Your findings can be replicated and compared with other studies.
- Collaboration: Sharing creates opportunities for follow-on research and collaboration.
- Re-use: Your data can be used in novel ways. Data sharing facilitates re-use of your data for future or follow-on research and discovery, as data can be funded and collected once, then used many times for a variety of purposes.
- Efficiency: Data sharing is good research practice!
There may be reasons for not sharing your data, for example privacy and confidentiality issues or the commercial value of the data. Horizon 2020 has coined the phrase “As open as possible, as closed as necessary.” If you are unable to share your data publicly, consider making it available internally to future researchers to facilitate follow-on research, and/or creating a metadata record in your chosen archive or repository. A metadata record describes your data and helps others find out that it exists. For either option to be possible, you will need to manage your data.
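As a rough illustration of what a metadata record for a dataset that cannot be openly shared might contain, here is a minimal sketch. The field names, the access levels and the example values are all hypothetical; a real archive or repository will have its own required schema, so check its documentation before preparing a record.

```python
import json

# Hypothetical access levels for this sketch only.
ACCESS_LEVELS = {"open", "restricted", "closed"}


def build_metadata_record(title, creators, description, access, contact):
    """Build a simple descriptive record for a dataset.

    The field names are illustrative and do not follow any particular
    repository's schema.
    """
    if access not in ACCESS_LEVELS:
        raise ValueError(f"access must be one of {sorted(ACCESS_LEVELS)}")
    return {
        "title": title,
        "creators": list(creators),
        "description": description,
        "access": access,    # the data itself may stay closed
        "contact": contact,  # who to ask about access or re-use
    }


# Example values are invented for illustration.
record = build_metadata_record(
    title="Interview transcripts, example study",
    creators=["A. Researcher"],
    description="Anonymised interview transcripts; not publicly shareable.",
    access="restricted",
    contact="research-office@example.org",
)
print(json.dumps(record, indent=2))
```

The point of the sketch is that the record can be public even when the data is not: it tells others that the dataset exists, what it covers, and who to contact.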
For an expanded explanation and more DCU Supports regarding Sharing & Archiving, please see the DCU RDM Guide or visit the "Archiving Datasets" libguide.
DCU RDM Guide
Comprehensive Guide to all Supports, Tools, & Resources available to DCU Researchers, at all stages of the data lifecycle. Provided by DCU Research Support, DCU Library, and ISS
DataONE Identify data with long-term value
OpenAire Formats for Preservation