Data Management
Hydrological data are essential for understanding and managing activities in watersheds; however, collecting data is time-consuming and costly. It is thus extremely important to manage the data properly once they are collected. A data management plan should be drawn up and communicated to all personnel so that every step is followed. This minimizes the chance of data entry errors, errors in converting data from one format to another, misplaced data files, loss of data, and the inability of future software to read collected data, all of which can seriously compromise data quality and availability for analysis. As the volume of data increases, so does the need for management.
A data management plan should include the following items:
Enter and save the data in a format that is common, or standardized, across every basin. This allows all data from individual stations to be compiled within a basin, and also allows data from all basins to be compiled into a national-level database. Ideally, this format is specified by the central authority.
Save copies of Microsoft Excel files as CSV text files. Plain text can be opened by almost any program and has the greatest probability of being readable by future software. Text files also take up very little space.
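As an illustration of a standardized text format, the following is a minimal sketch in Python using only the standard csv module. The column names in STANDARD_COLUMNS are hypothetical; the actual layout would be the one specified by the central authority. The reader rejects any file whose header does not match, which catches format drift between stations early.

```python
import csv

# Hypothetical standard column order (an assumption for illustration only);
# the real layout would be defined by the central authority.
STANDARD_COLUMNS = ["station", "date", "rainfall_mm"]


def write_standard_csv(rows, handle):
    """Write dict rows to an open text handle using the standard column order."""
    writer = csv.DictWriter(handle, fieldnames=STANDARD_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # DictWriter raises ValueError if a row has a key outside STANDARD_COLUMNS.
        writer.writerow(row)


def read_standard_csv(handle):
    """Read rows back, rejecting files whose header differs from the standard."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != STANDARD_COLUMNS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)
```

When writing to a real file, open it with open(path, "w", newline="") as the csv module documentation recommends. Values come back as text when read, so numeric columns need to be converted before analysis.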
Filenames should be descriptive, so that one need not open a file to see what data it stores. This matters especially when there are many files. Example: tuyucu_rain_2003. Filenames should not contain spaces; use underscores instead.
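Building filenames with a small helper keeps them consistent across personnel. The sketch below is one possible approach, not a prescribed convention: the function name make_filename and the station_variable_year ordering are assumptions modeled on the tuyucu_rain_2003 example.

```python
import re


def make_filename(station, variable, year, extension="csv"):
    """Build a descriptive name such as tuyucu_rain_2003.csv, with no spaces."""
    parts = [station, variable, str(year)]
    # Lowercase each part and collapse any run of characters other than
    # a-z and 0-9 (spaces, hyphens, punctuation) into a single underscore.
    cleaned = [re.sub(r"[^a-z0-9]+", "_", p.strip().lower()).strip("_") for p in parts]
    if not all(cleaned):
        raise ValueError("station, variable and year must all be non-empty")
    return "_".join(cleaned) + "." + extension
```

For instance, make_filename("Tuyucu", "rain", 2003) gives tuyucu_rain_2003.csv. Note that this sketch also replaces accented letters with underscores, so station names with such characters may need to be transliterated first.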
Create a directory with well-organized folders in which to save the individual data files (.xls and .txt). Ideally, this should be C:/hydrodata/…/…/….
For example, C:/hydrodata/rainfall/basin_name/station_name/year2002/
Saving files in different places can lead to different versions of the same file, and errors follow if an older version is used by mistake.
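Creating folders from code, rather than by hand, helps every station follow the same layout. The following is a minimal sketch using Python's pathlib; the function names are hypothetical, and the variable/basin/station/year ordering simply mirrors the example path given above.

```python
from pathlib import Path


def station_folder(root, variable, basin, station, year):
    """Return root/variable/basin/station/yearYYYY without touching the disk."""
    return Path(root) / variable / basin / station / f"year{year}"


def ensure_station_folder(root, variable, basin, station, year):
    """Create the folder (and any missing parents) if needed, then return it."""
    folder = station_folder(root, variable, basin, station, year)
    # exist_ok=True makes the call safe to repeat on an existing folder.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

A call such as ensure_station_folder("C:/hydrodata", "rainfall", "basin_name", "station_name", 2002) would create the folder from the example if it does not already exist.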
Designate persons to back up data. Whenever a data file is obtained, it should be saved in at least three places: the workstation computer hard drive, an external hard drive (or a flash drive until an external hard drive is obtained), and offsite. Offsite can mean a computer in another building, or cloud storage on the internet (such as dropbox.com). If the volume of data generated is high, a daily backup is encouraged. If not daily, there should be a weekly or bi-weekly backup routine, and specific persons should be assigned this responsibility.
The entire directory should be backed up. Unless the volume of data is very high, it is better to back up manually. Automatic backup software is also available; however, care should be taken that such software selects the latest version of each file. If for some reason the user would like to preserve an older version of a file, it can either be given a new name or placed in a new folder.
The simplest form of backup involves connecting an external hard drive (or another computer on the network, if there is one) and saving the entire directory. Files may have been updated in various folders, so it is better to save the entire directory each time to avoid missing a file.
It is also a good idea to periodically open the backed-up files to ensure that data is not being corrupted in the file copying process.
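Opening files by hand becomes impractical as the number of files grows. As a supplement to that check, and not a replacement for it, the sketch below copies the whole directory and then compares SHA-256 checksums of each source file against its backup copy. Checksum comparison is a technique introduced here for illustration, and all function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_directory(source, destination):
    """Copy the entire source tree to destination, which must not exist yet."""
    shutil.copytree(source, destination)


def find_mismatches(source, backup):
    """List relative paths of files missing from the backup or differing from the source."""
    source, backup = Path(source), Path(backup)
    mismatches = []
    for original in sorted(source.rglob("*")):
        if original.is_file():
            relative = original.relative_to(source)
            copy = backup / relative
            if not copy.is_file() or file_digest(original) != file_digest(copy):
                mismatches.append(relative.as_posix())
    return mismatches
```

An empty list from find_mismatches means every source file has an identical copy in the backup. Writing each backup to a new, dated destination folder also preserves older versions, at the cost of more disk space.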
Documentation:
“Say what you do, do what you say”
It is important to maintain metadata in a document that describes who collects the data, how they are collected, where and how they are stored, whether any special procedures are followed, and who the designated data managers and backup personnel are. The purpose of documentation is to keep data-based activities running smoothly if personnel leave.
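One way to keep such metadata consistent is to store it as a small text record alongside the data. The following is a minimal sketch that saves the record as JSON, itself a plain-text format. The field names are assumptions derived from the items listed above, not an established metadata standard.

```python
import json

# Hypothetical field names, one per item of documentation listed above.
REQUIRED_FIELDS = [
    "collected_by",
    "collection_method",
    "storage_location",
    "storage_format",
    "special_procedures",
    "data_managers",
]


def build_metadata(**fields):
    """Return a metadata record, refusing to build one with required fields missing."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return dict(fields)


def metadata_to_json(record):
    """Serialize the record as indented JSON text, ready to save next to the data."""
    return json.dumps(record, indent=2)
```

Refusing to build an incomplete record is a deliberate choice: a gap in the documentation is noticed when the record is written, not after the person who knew the answer has left.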
The first step of any research or management effort is data collection. Proper data management is critical in assuring the collection and availability of high-quality (error-free) data for data analysis and water management decisions.
1. Standardized collection: Very often, data that form the foundation of scientific research and resource or ecosystem management are collected by different entities in different ways, and therefore cannot be compared with each other. A project or organization therefore needs to standardize both the data collection method and the format used for storage.
2. Storage - database management systems.
Topics include how to collect data in a uniform format, how to create a simple database in Access or OpenOffice Base, and how to save data as txt files that can be archived and shared. The emphasis is on setting up data collection and management programs in developing countries.