Linked Data Between Collections

What is it?

The internet has made possible what once seemed impossible: connecting information collections across the world instantly. Cultural heritage institutions such as libraries are increasingly aware of the need to adapt to a world shaped by technology, and to collaborate and partner with other institutions. One way to do this is to make resources open source and openly accessible, which encourages the sharing of metadata and makes the institutions’ collections more discoverable for users through common search methods (Oberbichler et al., 2022). Sharing benefits both the user and the cultural institution, although it runs against the human tendency to control information and keep it to ourselves.

Librarians are among those responsible for preserving society’s cultural knowledge for future generations (Oberbichler et al., 2022). A librarian’s ability to help patrons find the best resource for their needs, uncover meaning within items, and tell a story by linking data and resources between collections is one of the strongest ways information can be preserved for the future (Oberbichler et al., 2022). Machine learning can help librarians analyze very large amounts of data efficiently, tagging items and linking similar items to create strings, patterns, and more metadata connections.

Use Case: The Massachusetts Institute of Technology (MIT) Library

The Massachusetts Institute of Technology (MIT) Library has created an advanced neural net, HAMLET, which was trained in 2017 on the school’s graduate thesis collection to learn how to create recommendations, among other functions (Yelton, 2019). Neural nets do not create categories of information; instead, the software provides structures within which the machine makes its own rules, to an extent. Machines are still unable, however, to understand the context of information resources (Yelton, 2019). HAMLET currently includes three services: a recommendation engine, which is a search engine for theses that also recommends theses similar to the one chosen; the uploaded file oracle, which lets the user upload a single document that the neural net interprets on the fly, returning theses similar to it; and the literature review buddy, which returns the sources most similar to the thesis being produced and suggests more that could be added to the document (Yelton, 2019). Libraries that offer such machine learning-based engines give their users more possibilities for success in their academic endeavors than libraries that do not.
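The idea behind "recommend theses similar to the one chosen" can be illustrated with a small sketch. This is not HAMLET's implementation: HAMLET uses a trained neural net, whereas the sketch below swaps in plain word-count vectors and cosine similarity so that it runs with only the Python standard library. The thesis titles, abstracts, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a word-count vector (a simple stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, corpus, top_n=2):
    """Return the titles of the top_n corpus documents most similar to the query text."""
    query_vec = vectorize(query)
    scored = [(cosine(query_vec, vectorize(text)), title)
              for title, text in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [title for _score, title in scored[:top_n]]


# Hypothetical thesis abstracts, invented for illustration only.
THESES = {
    "Robot Navigation": "robot path planning with sensors and control",
    "Protein Folding": "protein structure prediction with molecular simulation",
    "Drone Control": "control of autonomous drone flight with sensors",
}

if __name__ == "__main__":
    print(most_similar("sensors and control for autonomous robot", THESES))
```

The same ranking step would serve an "uploaded file" style of query: the uploaded document's text is vectorized on the fly and compared against the stored collection. A production system would replace `vectorize` with a model trained on the collection itself.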

Use Case: The Massachusetts Institute of Technology (MIT), the British Library, the University of Illinois Archives, and the American Philosophical Society

The Massachusetts Institute of Technology (MIT), the British Library, the University of Illinois Archives, and the American Philosophical Society led a collaborative initiative from 2017 to 2019 to explore computational approaches to creating an open access archive of data from four founding computer scientists of the cybernetics movement (Anderson, 2021). The initiative, called The Cybernetics Thought Collective: A History of Science and Technology Portal Project, aimed to create data from the communications of these founders using advances in artificial intelligence (Anderson, 2021). Open access to these communications was planned through three methods: metadata created from digitizing archived records, data that could be reused for future scholarship, and data that would support various visualizations to continue the collaboration going forward (Anderson, 2021). The project’s main goal was to serve as an example of collaboration with other knowledge institutions and to form a model for creating a “community of data and records” that could be shared openly (Anderson, 2021). The project concluded that metadata and digitized archival collections shared academically between institutions should be viewed as a complement to the traditional archival model, not as a replacement (Anderson, 2021).

References

Anderson, B. G. (2021). On constructing a scientific archives network: Exploring computational approaches to the cybernetics thought collective. Archivaria, 91(91), 104-147. doi:10.7202/1078467ar

Oberbichler, S., Boroş, E., Doucet, A., Marjanen, J., Pfanzelter, E., Rautiainen, J., . . . Tolonen, M. (2022). Integrated interdisciplinary workflows for research on historical newspapers: Perspectives from humanities scholars, computer scientists, and librarians. Journal of the Association for Information Science and Technology, 73(2), 225-239. doi:10.1002/asi.24565

Yelton, A. (2019). Chapter 2. HAMLET: Neural-Net-Powered Prototypes for Library Discovery. Library Technology Reports, 55(1), 10–. https://uosc.primo.exlibrisgroup.com/permalink/01USC_INST/273cgt/cdi_proquest_journals_2161879773
