This project is one of many in the Library where we need to liberate analog data, in this case from paper, into a digital medium that researchers can use. The material for this project is a series of reports, often handwritten and sometimes typed, produced over a period of 23 years as part of the CalCOFI Hydrobiological Survey of Monterey Bay.
In 1951, the Hopkins Marine Station of Stanford University became a partner in the California Cooperative Oceanic Fisheries Investigations (CalCOFI) program in order to collect oceanographic data in and near Monterey Bay. The aim of the program was to conduct joint fisheries-oceanographic cruises that would help researchers understand what contributed to observed fluctuations in the California sardine fishery. Hopkins conducted weekly sampling (more or less) continuously from March 1951 through June 1974. The raw and aggregated data for most of these cruises currently reside in analog form (handwritten data logs, annual reports, etc.) in the library at the Hopkins Marine Station.
There are very few oceanographic time-series studies from the 1950s through the 1970s, and these particular data exist only at our location. These data are an important contribution to studies in the marine sciences, climate change and coastal ecology. Our library is located in a tsunami zone, and since we have the only copy of these data, they are at significant risk of being lost.
Woods Hole Oceanographic Institution, Scripps Institution of Oceanography and Oregon State University are potential partners with similar material to be mined.
See the Collections as Data Facet for more information.
Researchers are beginning to understand the magnitude and complexity of the effects of climate change on our Earth system, and all research in this area is grounded in what we know about the past. Data collection at sea is labor-intensive and relatively rare, and technology has lowered that barrier only within the last couple of decades. In the marine sciences, the most valuable data collections are observational time-series studies, and the older the better. PDFs of legacy data are nearly worthless to a marine scientist who seeks to answer research questions.
Source Data: Image scans of original data sheets from Hopkins Marine Station (Miller Library; ca. 1951 - 1974) now in SDR.
The dataset includes variables such as temperature, salinity, oxygen, phosphate, silicate, phytoplankton and zooplankton community structure and abundance, meteorological conditions, fish and marine mammal counts, and more. The collection includes forty-four 3-ring or loose-bound notebooks, twenty-two small, bound notebooks, minutes from annual meetings, annual data reports, and other ephemera. The Hopkins CalCOFI collection is large, completely analog, and very heterogeneous.
The digitized items are not yet in the library catalog (also the discovery layer for the repository), but you can see a few examples of digitized material via direct links:
Actionable datasets: Conversion from PDF to actionable tabular data at scale
Transkribus
Outsourced hand-keying text
Crowdsourcing
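Whichever of these routes produces the transcription, the result still has to be turned into typed, tabular records before it is "actionable". The following is a minimal sketch of that last step, not the project's actual pipeline. The column names (date, station, depth_m, temperature_c, salinity) and the function names are assumptions made for illustration; the real data sheets may be laid out quite differently.

```python
import csv
import io
from datetime import date

# Hypothetical column names, chosen for illustration only.
FIELDS = ["date", "station", "depth_m", "temperature_c", "salinity"]
NUMERIC_FIELDS = ["depth_m", "temperature_c", "salinity"]


def to_number(text):
    """Return the cell as a float, or None if it is blank or illegible."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_row(raw):
    """Convert one transcribed row (a dict of strings) into typed values."""
    record = {
        # fromisoformat raises ValueError on a malformed date, so bad
        # transcriptions are caught instead of silently passed through.
        "date": date.fromisoformat(raw["date"].strip()).isoformat(),
        "station": raw["station"].strip(),
    }
    for name in NUMERIC_FIELDS:
        record[name] = to_number(raw[name])
    return record


def rows_to_csv(raw_rows):
    """Serialize transcribed rows as CSV text; missing values stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for raw in raw_rows:
        writer.writerow(parse_row(raw))
    return buffer.getvalue()
```

In this sketch, a cell that cannot be read as a number is left empty rather than guessed, so a gap in the handwritten record stays visible as a gap in the table.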