Archive Data for Future Use

Archiving data for future use is an extremely valuable task for researchers. By putting systems in place to store de-identified data so that it can be accessed for verification or for further analysis and research, researchers can extend the reach of their data collection efforts and encourage future innovation and collaboration.

Data can be archived for future use in a number of ways. It can be uploaded onto a website, stored within an institutional repository, or deposited within a specialist data archive centre.

The advantages of depositing data with a data centre are that the data will be protected, regularly maintained and converted to different software formats as technology advances. The data centre will also manage any access restrictions on sensitive data and allow for easy public dissemination and access (Van den Eynden et al. 2011, p. 4).

Advice

Advice for USING this option

When creating a data management plan for the archiving of your data, it’s important to consider the following:

  • Data documentation (Van den Eynden et al. 2011, p. 9):
    • Original project aims and objectives
    • Data collection methodology
    • Information on hardware and software used in collection and analysis
    • Dataset structure of files
    • Data quality and data cleaning processes
    • Access restrictions and confidentiality information
    • Variable names, labels and descriptions
    • Codes and classification schemes
    • Definitions of terminology
    • Missing values
    • Weighting and grossing variables
  • Converting data into file formats suitable for long-term, widely accessible storage
    • From Van den Eynden et al. (2011, p. 12):
    • Quantitative data with extensive metadata:
      • SPSS portable (.por)
      • Delimited text and command ('setup') file, containing metadata
      • Structured text or mark-up file (e.g. DDI XML file) containing metadata information
    • Quantitative data with minimal metadata:
      • Comma-separated values file (.csv)
      • Tab-delimited file (.tab), including delimited text of a given character set with SQL data definition statements if necessary
    • Geospatial data
      • ESRI Shapefile
        • Necessary: .shp, .shx, .dbf
        • Optional: .prj, .sbx, .sbn
      • Geo-referenced TIFF (.tif, .tfw)
      • CAD data (.dwg)
      • Tabular GIS attribute data
    • Qualitative data
      • eXtensible Mark-up Language (XML) text with appropriate Document Type Definition (DTD) or schema (.xml)
      • Rich Text Format (.rtf)
      • Plain text data, ASCII (.txt)
    • Digital Image Data
      • TIFF version 6 uncompressed (.tif)
    • Digital Audio Data
      • Free Lossless Audio Codec (.flac)
    • Digital Video Data
      • MPEG-4 (.mp4)
      • motion JPEG 2000 (.jp2)
    • Documentation
      • Rich Text Format (.rtf)
      • PDF/A or PDF (.pdf)
      • OpenDocument Text (.odt)
  • Organization of files and folders (Van den Eynden et al. 2011, pp. 13-14)
    • File and folder names should be short but meaningful
    • It’s useful to include some metadata in the file names, such as dates of creation and modification and file type
    • Folders could be organized by:
      • Data type
      • Research activity
      • Material
  • Data quality
  • Secure storage of data 
    • Storage facilities should be safe from fire and flood
    • Digital media storage should be regularly checked and upgraded every two to five years
    • Controlled access should be put in place to prevent unauthorized use
  • Ethics and consent
    • Informed consent
      • Participants should be made aware of how their data will be stored, used and archived in the future, and what measures will be taken to protect their anonymity before they give their consent
    • Anonymizing data 
      • Easiest to do this during data collection and analysis, rather than at the end of the project
      • Keep an original copy of the data and a record of changes made, which should be stored separately from the edited files
      • Identifying information should be removed or aggregated
      • Generalize specific details
      • Replace named persons with pseudonyms
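The anonymizing steps above (replacing named persons with pseudonyms, generalizing specific details, and keeping a record of changes apart from the edited data) can be illustrated with a short Python sketch. It is only one possible approach, not a method taken from the Van den Eynden et al. guide: the field names `name`, `age` and `notes`, the "Participant 01" pseudonym pattern and the ten-year age bands are all assumptions made for the example. Real projects will usually need a more careful review of indirect identifiers.

```python
import csv
import io
import re


def build_pseudonyms(records, name_field):
    """Map each distinct real name to a pseudonym, in order of first appearance."""
    pseudonyms = {}
    for record in records:
        name = record[name_field]
        if name not in pseudonyms:
            pseudonyms[name] = f"Participant {len(pseudonyms) + 1:02d}"
    return pseudonyms


def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    lower = (int(age) // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymize(records, name_field="name", age_field="age", text_field="notes"):
    """Return (edited_records, change_log); the original records are not modified."""
    pseudonyms = build_pseudonyms(records, name_field)
    actions = {
        name_field: "replaced with pseudonym",
        age_field: "generalized to age band",
        text_field: "names in free text replaced with pseudonyms",
    }
    edited, change_log = [], []
    for row_number, record in enumerate(records, start=1):
        new = dict(record)  # work on a copy so the original stays intact
        new[name_field] = pseudonyms[record[name_field]]
        new[age_field] = age_band(record[age_field])
        text = record[text_field]
        for real, fake in pseudonyms.items():
            text = re.sub(rf"\b{re.escape(real)}\b", fake, text)
        new[text_field] = text
        # Log which fields changed, without copying the identifying values.
        for field, action in actions.items():
            if new[field] != record[field]:
                change_log.append({"row": row_number, "field": field, "action": action})
        edited.append(new)
    return edited, change_log


def to_csv(rows):
    """Serialize rows as comma-separated text, a format suited to long-term storage."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample data, for illustration only.
original = [
    {"name": "Jane Doe", "age": "34", "notes": "Jane Doe attended with Sam Lee."},
    {"name": "Sam Lee", "age": "41", "notes": "Prefers email contact."},
]
edited, change_log = anonymize(original)
print(to_csv(edited))
print(to_csv(change_log))
```

In practice the edited rows and the change log would be written to separate files, and the untouched original, along with any key linking pseudonyms to real names, would be kept in a separate, access-controlled location.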

Resources

Guides

Managing and Sharing Data (UK Data Archive): This guide gives researchers an overview of best practice in data management for long-term, accessible data storage, protection and archiving.

Data Archive Centres

ICPSR: An international consortium of academic institutes and research organizations. Provides access to data archives, as well as information and research on data analysis methodology and data curation. The site also links to a number of tools and services for ensuring the confidentiality of data, including a web-based program to anonymize qualitative data, tools for holding restricted-use data in a secure enclave and for conducting security reviews, and data processing and dissemination tools.

Australian Data Archive: The Australian Data Archive is a national service provided by a range of universities around Australia, with facilities for the collection and archiving of digital research data.

GESIS - Leibniz Institute for the Social Sciences - Data Archive: Focuses on German and international comparative surveys in the fields of social and political science. The data is made accessible to the public and researchers, and must meet methodological and technical standards.

UK Data Archive: The UK's largest collection of digital research data in the social sciences and humanities. The website contains information for users on creating, managing and finding data, as well as resources on best practice for archiving and storing data.

Sources

Van den Eynden, V., Corti, L., Woollard, M., Bishop, L. and Horton, L. (2011). Managing and sharing data: Best practice for researchers. UK Data Archive, University of Essex: Essex.

 

Updated: 11th August 2015 - 11:07am
A special thanks to this page's contributors
Author
BetterEvaluation Website and Engagement Coordinator, BetterEvaluation and ANZSOG.
Melbourne, Australia.
