Determine your storage needs

… to find appropriate storage services


Here are three issues to consider.

Permanence
  • How long do you need to store the data? 3 to 5 years? 10 years? Forever?
  • Will you destroy obsolete data?

Oversight
  • Which versions of the data will you store as an official long-term copy?
  • Who will manage the data after project completion, over time, and across personnel changes?

Security
  • Are there ethical requirements for secure data storage (e.g., IRB, HIPAA)?
  • Do you need to discard or destroy any private, personal, or confidential data after project completion?






Find a storage space

… that suits your needs for data control and sharing 


Data storage options  

Personal computer
  • Examples: Internal or external drive (CDs and DVDs are not recommended)
  • Pros (and cons): Personal control, but personal responsibility for theft, loss, and backups

Departmental or university servers
  • Examples: UC Berkeley's IST data centers and servers, and data services
  • Pros (and cons): Managed and may have automated backups

Institutional repository/archive
  • Examples: Merritt at UC
  • Pros (and cons): Long-term storage controlled by the hosting institution

Public database/repository
  • Examples: GenBank
  • Pros (and cons): Long-term storage with access for the general public

Cloud storage
  • Examples: Amazon S3, Google Docs, Dropbox
  • Pros (and cons): Accessible online, but sensitive data may be vulnerable with third-party services

How much storage space? 
  • Consider the growth rate of data and how frequently it changes. 
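As a rough aid for the question above, the following is a minimal sketch of a storage estimate under compound annual growth. The function name, the growth figures, and the default of three full copies (matching the backup advice later in this guide) are illustrative assumptions. It does not account for keeping multiple versions of files that change frequently, which would increase the total.

```python
def projected_storage_gb(current_gb, annual_growth_rate, years, copies=3):
    """Estimate total storage needed after `years` of compound growth.

    annual_growth_rate is a fraction (0.25 means 25% growth per year).
    copies is the number of full copies kept (master plus backups).
    """
    if current_gb < 0 or years < 0 or copies < 1:
        raise ValueError("current_gb and years must be >= 0, copies >= 1")
    projected = current_gb * (1 + annual_growth_rate) ** years
    return projected * copies


if __name__ == "__main__":
    # Hypothetical project: 200 GB today, growing 25% per year, kept 5 years.
    need = projected_storage_gb(200, 0.25, 5)
    print(f"Plan for roughly {need:.0f} GB across all copies")
```

Treat the result as a planning figure only; actual needs depend on how the data are collected and revised.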

Tips

  • Uncompressed data are best, though it's okay to compress a third copy of the data.
  • Unencrypted data are best, though encryption is appropriate for sensitive data or a third copy.





Save in a file format for long-term access

... so your data can be opened and read in the future


Type of document    Not ideal        Ideal
Text                MS Word          PDF
Spreadsheet         MS Excel         CSV
Image               GIF, JPG         TIFF
Sound               AAC (iTunes)     WAV
Video               QuickTime        MPEG-4
Databases           MS Access        XML or RDF


Here are more file format recommendations.

In general, use a file format with these features:
  • Non-proprietary 
  • Unencrypted and uncompressed 
  • Open, documented standard (e.g., PDF, XML) 
  • Common usage by your research community 
  • Standard representation (e.g., ASCII text, Unicode) 
Look for discipline-specific standards for file formats (e.g., there are standards for environmental and social science data). 
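One way to apply the format table above is to scan a list of file names and flag those saved in a "not ideal" format. The sketch below does this by file extension. The extension mapping is an illustrative assumption based on common extensions for the formats named in the table, and it is not exhaustive.

```python
from pathlib import Path

# Extensions commonly used by the "Not ideal" formats in the table;
# this mapping is an assumption for illustration, not a complete list.
SUGGESTED_FORMAT = {
    ".doc": "PDF", ".docx": "PDF",
    ".xls": "CSV", ".xlsx": "CSV",
    ".gif": "TIFF", ".jpg": "TIFF", ".jpeg": "TIFF",
    ".aac": "WAV", ".m4a": "WAV",
    ".mov": "MPEG-4",
    ".mdb": "XML or RDF", ".accdb": "XML or RDF",
}


def flag_non_ideal(filenames):
    """Return (filename, suggested format) pairs for non-ideal formats."""
    flagged = []
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        if suffix in SUGGESTED_FORMAT:
            flagged.append((name, SUGGESTED_FORMAT[suffix]))
    return flagged


if __name__ == "__main__":
    for name, better in flag_non_ideal(["notes.docx", "readings.csv", "scan.jpg"]):
        print(f"{name}: consider saving a copy as {better}")
```

An extension check only identifies candidates; the conversion itself still needs a tool suited to each format.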






Name data files and folders descriptively

… so you'll find your data quickly

Avoid ambiguous file names

data1.csv

Use descriptive file names and be consistent

75-celsius-trial_exp-group_original.csv

75-celsius-trial_control_ver002.csv


Consider these terms for file names:
  • project title
  • experimental conditions and group
  • trial numbers
  • file version number indicating data modifications
  • date or time stamps
  • author initials
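To keep names consistent, it can help to generate them from the same terms every time. The sketch below joins a few of the terms listed above into one file name. The function names, the choice of underscores between terms and hyphens within a term, and the three-digit version number are assumptions modeled on the example names in this section.

```python
import re
from datetime import date


def slug(text):
    """Lowercase text and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project, condition, group, version, when=None, ext="csv"):
    """Join name parts with underscores; hyphens separate words in a part."""
    parts = [slug(project), slug(condition), slug(group), f"ver{version:03d}"]
    if when is not None:
        # YYYYMMDD date stamps sort correctly in a file listing.
        parts.append(when.strftime("%Y%m%d"))
    return "_".join(parts) + "." + ext


if __name__ == "__main__":
    # Hypothetical project and trial names.
    print(build_filename("Heat study", "75 Celsius trial", "control", 2,
                         when=date(2011, 9, 4)))
```

Generating names from one function means a change to the convention only has to be made in one place.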


Avoid ambiguous and unorganized file directories

Data
 > 1
 > raw
    >> part A
    >> 110904
 > readings

Organize files in a descriptive folder structure


 Project-title
 > Trial 1
    >> Experimental
    >> Control
 > Trial 2
 > Trial 3
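A descriptive structure like the one above can be created in one step, so every trial gets the same layout. This is a minimal sketch; the function name is hypothetical, and it assumes each trial holds the same Experimental and Control subfolders shown for Trial 1.

```python
import tempfile
from pathlib import Path


def create_project_folders(root, project, trials, groups=("Experimental", "Control")):
    """Create <root>/<project>/Trial N/<group> folders and return their paths."""
    created = []
    for trial in range(1, trials + 1):
        for group in groups:
            folder = Path(root) / project / f"Trial {trial}" / group
            # parents=True builds missing levels; exist_ok=True allows reruns.
            folder.mkdir(parents=True, exist_ok=True)
            created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_folders(tmp, "Project-title", trials=3):
            print(folder.relative_to(tmp))
```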







Secure your data

... so only you and your team have full access


  • Secure portable storage devices - like laptops and flash drives - from loss, theft, or damage

  • Consider the security of third-party storage services and beware of inappropriate management - especially with sensitive or confidential data

  • Secure rooms with computer hardware and use laptop locks

  • Use authentication systems - like password protection for a computer 

  • Encrypt files that contain sensitive, confidential, or private data.  Here are some encryption tools.

  • Record passwords and encryption keys - but be sure to store them safely.

    • For example, record passwords on paper and lock them in a file cabinet (keep 2 copies).  Alternatively, use software like Password Safe.





Back up your data

… so you don't lose your hard work


Have 3 copies. Where to store each one?

1. Original "master" copy: on your primary computer or network
2. Local external storage: on an external hard drive in your lab or office
3. Remote external storage (i.e., a physically removed location):
  • UC Berkeley IST storage and backup services
  • Cloud storage via third-party companies

Be mindful of security threats when storing private, confidential, and sensitive data on third-party services.


Tips
  • Check file recovery at setup and on a regular schedule
     
  • Check that older files are still readable and accessible.  If necessary, migrate older files to a format that offers long-term access.
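One way to check that a backup copy is complete and readable is to compare checksums of the master files against the backup. The sketch below uses SHA-256 from Python's standard library. The function names and the assumption that the backup mirrors the master's folder layout are illustrative, and this is not a replacement for a test restore from your backup service.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(master_dir, backup_dir):
    """Return relative paths that are missing from the backup or differ."""
    master_dir, backup_dir = Path(master_dir), Path(backup_dir)
    problems = []
    for source in sorted(master_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(master_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(copy) != sha256_of(source):
            problems.append(str(relative))
    return problems


if __name__ == "__main__":
    # Demonstrate with throwaway folders standing in for master and backup.
    with tempfile.TemporaryDirectory() as master, tempfile.TemporaryDirectory() as backup:
        (Path(master) / "trial1.csv").write_text("1,2\n")
        print("Problems found:", compare_copies(master, backup))
```

Running a comparison like this on a schedule covers the first tip; it confirms the bytes match but not that the file format is still openable, so the second tip still needs a manual check.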