This project manages four data categories (see table below). Data are open-access (stored where freely accessible to the scientific community and public), limited-access (stored for specific project participants), or restricted-access (stored for the PI and specifically designated personnel). The project is managed by McClatchey, with specific Senior Personnel reporting for subcategories.
Data Management/Archiving: Atrium (http://www.atrium-biodiversity.org) is a biodiversity information management platform developed by BRIT staff for merging research collection data with species information via species pages, literature citations, GIS layers, ecological data, environmental data, and images. Atrium also provides tools for integrating, managing, sharing, and publishing biodiversity and allied data via the internet. The Atrium data model supports the management of collections with different kinds of specimen records (i.e., verbal-ethnographic interviews (wav)(odt,doc,docx,wpd), textual records (pdf)(odt,doc,docx,wpd), herbarium specimens (Forman & Bridson 2000), apple juice/cider/vinegar, and leaf bud samples), and duplicate specimens can be represented as individual items linked by a collection number. Quantitative data (e.g., apple variety presence/absence, orchard size) associated with each sample will be entered through customized form fields. These standard data fields will be finalized at an early meeting of Senior Personnel. Transformed data that represent relationships between groups of data within the system can also be loaded into Atrium. The Atrium data model also incorporates interfaces for managing and displaying different versions of data (i.e., current and previous versions), which allows annotators to update and add data without losing the original data as entered in the field or drawn from legacy field logs and original uploaded datasets.
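The versioning behavior described above (annotators add or update data while the original entry is preserved) can be illustrated with a minimal Python sketch. This is not Atrium's actual data model; the class names, field names, and the collection number are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of a record's field values (hypothetical)."""
    values: dict
    annotator: str
    created: datetime


@dataclass
class Record:
    """A specimen record whose edits are appended, never overwritten."""
    collection_number: str  # shared by duplicates of the same collection
    versions: list = field(default_factory=list)

    def annotate(self, annotator, **changes):
        """Append a new version that merges changes into the current values."""
        base = dict(self.versions[-1].values) if self.versions else {}
        base.update(changes)
        self.versions.append(
            Version(base, annotator, datetime.now(timezone.utc))
        )

    @property
    def original(self):
        """Values exactly as first entered."""
        return self.versions[0].values

    @property
    def current(self):
        """Values after the most recent annotation."""
        return self.versions[-1].values


# Illustrative use with made-up values.
record = Record(collection_number="HYPOTHETICAL-001")
record.annotate("field collector", variety="unknown", orchard_size_ha=1.5)
record.annotate("annotator", variety="Example Variety")
```

Because each annotation appends a new snapshot, `record.original` still returns the field entry after any number of later corrections.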
Atrium adheres to TDWG data standards, including data elements defined by Distributed Generic Information Retrieval (DiGIR), Access to Biological Collections Data (ABCD), and the Darwin Core. Globally Unique Identifiers (GUIDs) are used to track all records. Image metadata are stored following the EXIF and IPTC standards. Atrium supports bibliographic data imported and exported in ISI and EndNote (XML) formats, as well as the OpenURL standard for locating resources. Metadata for GIS datasets stored in Atrium comply with the ISO 19115 standard. With standardized data at the core of Atrium, web services such as DiGIR, TAPIR (TDWG Access Protocol for Information Retrieval), and RDF (Resource Description Framework) can facilitate the creation of distributed data networks, which will greatly expand data dissemination and increase the value of the data to other organizations and individuals.
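As an illustration of what standardizing a record on Darwin Core with a GUID might look like, the following Python sketch maps a field-entry row onto Darwin Core terms. The internal field names on the left of the mapping and the sample values are assumptions made for this example, not Atrium's schema; the terms on the right are standard Darwin Core terms.

```python
import uuid

# Hypothetical internal field names mapped to Darwin Core terms.
FIELD_TO_DWC = {
    "collector": "recordedBy",
    "collection_number": "recordNumber",
    "taxon": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "date": "eventDate",
    "country": "country",
}


def to_darwin_core(row, guid=None):
    """Return a dict keyed by Darwin Core terms, with a GUID as occurrenceID."""
    record = {
        dwc: row[name] for name, dwc in FIELD_TO_DWC.items() if name in row
    }
    # Assumption: a UUID-based URN serves as the globally unique identifier.
    record["occurrenceID"] = guid or f"urn:uuid:{uuid.uuid4()}"
    record["basisOfRecord"] = "PreservedSpecimen"
    return record


# Illustrative row with made-up values.
row = {
    "collector": "Example Collector",
    "collection_number": "HYPOTHETICAL-001",
    "taxon": "Malus domestica",
    "lat": 43.48,
    "lon": -5.43,
}
dwc_record = to_darwin_core(row)
```

Fields absent from the input row are simply omitted, so partially completed field templates can still be exported.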
Long-term archiving is being ensured, and single-system failure avoided, in two ways. First, complete copies of Atrium data are pulled weekly, labeled by date, and stored off-site in non-web-accessible digital storage (hard drives) rotated as part of information lifecycle management. Two weeks of back-ups are stored. Copies of open-access data will be offered to: Jardín Botánico de Castilla La Mancha, Albacete, España; Servicio Regional de Investigación y Desarrollo Agroalimentario, Villaviciosa, España; Cider Museum archives, Hereford, UK; USDA Plant Genetic Resources, Geneva, NY; and Istituto Agrario San Michele all’Adige Research and Innovation Centre, Trento, Italy. A printout of project data on acid-free paper will be generated once a year for permanent storage in the BRIT archives. Duplicate herbarium specimens, apple juice/cider/vinegar, and leaf-bud samples will be stored in the BRIT herbarium and in local herbaria in the host countries where each sample was generated, so that the BRIT collection will be the only complete set.
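The weekly rotation can be sketched in Python as follows. The sketch assumes that rotation means retaining the two most recent date-labeled copies and that labels use the YYYY-MM-DD form; the function name and sample labels are hypothetical.

```python
from datetime import date


def backups_to_delete(labels, keep=2):
    """Given date labels (YYYY-MM-DD), return those older than the newest `keep`."""
    ordered = sorted(labels, key=date.fromisoformat, reverse=True)
    return ordered[keep:]


# Illustrative labels for three weekly copies.
labels = ["2024-01-01", "2024-01-15", "2024-01-08"]
stale = backups_to_delete(labels)
```

Here `stale` holds only the oldest label, which is the drive that would be rotated back into use.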
Data and sample flow: As sub-categories of data are collected or made, digital copies (or descriptions) are uploaded into Atrium using pretested project templates, either off-line or on-line. All data are georeferenced and assigned unique identifier numbers when entered in Atrium so that physical and digital information collected from the same interview events can be tracked together. Two centers will coordinate the flow of data and samples to BRIT and then to the collaborating Senior Personnel. McClatchey will oversee N. America from BRIT, while Savo will oversee Europe from Rome, Italy. McClatchey and Savo will manage supplies, workshop coordination, ongoing research activities, physical sample receipt and redistribution, and data entry into Atrium. Best will oversee data storage, archiving, and redistribution speed. Bridges will focus on analyses, pushing researchers to analyze and publish basic and transformed data.
At the first Senior Personnel meeting, an authorship agreement will be completed and signed. Points will include: 1) Data ownership and use rights (referenced to the researcher or institution) will be exclusive for 6 months; after that, anyone may use the data but must cite Atrium and the data contributor with reference to open-access standards. 2) Data users must not simply acquire and manipulate the data, but must contact the data contributor and provide a fair opportunity to participate in publications as a contributing coauthor. Data contributors will not restrict use of data if they are not able to participate as coauthors.
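The 6-month exclusive-use period could be checked with a small Python helper such as the one below. The agreement text does not say when the period starts or how months are counted; this sketch assumes calendar months counted from a deposit date, and all function names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`, clamping the day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    # Clamp to the last day of the target month (e.g. 31 Aug -> 28 Feb).
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_open(deposited, today, exclusive_months=6):
    """True once the exclusive-use period has ended."""
    return today >= add_months(deposited, exclusive_months)
```

For example, under these assumptions a dataset deposited on 15 January 2024 would become available for general use, with citation, on 15 July 2024.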