Summary of the March-April AI4LAM Workshop Series

16 June 2021


Five workshops took place from 29 March to 2 April 2021. They were organised by the AI4LAM Teaching and Learning Working Group and co-hosted by LIBER and the BnF. The aim was to provide training opportunities for those interested in applying and deploying Artificial Intelligence (AI) in Galleries, Libraries, Archives, and Museums (GLAM). The series brought together subject and domain experts and technologists from across GLAM institutions for a collaborative learning event, where they shared tools and experiences and reflected on the process of applying AI and its implications for GLAM institutions.

Details of the Workshops

You can find the details of each workshop by following the link below:

https://libereurope.eu/article/new-workshop-series-applying-and-deploying-artificial-intelligence-ai-in-glams/

Speakers:

  • Claudia Engel, Stanford University

  • Catherine Nicole Coleman, Stanford University Libraries


What is Machine Learning and what role does data play? How is the data used to train AI models? What kind of decisions do librarians make in the process of data curation and how might these impact machine-generated predictions? This workshop took a project-based approach to explore the importance of data curation in machine learning outcomes.

The speakers explored how data is thought of in a data science context and how that maps to library practices. They considered concerns about data bias, including data provenance (What is the history and context of the source material?), defining the boundaries of a data set (Who is in and who is out?) and feature engineering (What is relevant and what is not?).

* This workshop was not recorded.

* Zenodo link to the PowerPoint: https://zenodo.org/record/4749422#.YMckLW0zaUk

Speakers:

  • Mark Bell, The National Archives (UK)

  • Mike Trizna, The Smithsonian Institution

  • Nora McGregor, The British Library

  • Daniel van Strien, The British Library


In this workshop, Mark Bell, Mike Trizna, Nora McGregor, and Daniel van Strien beta tested a lesson that is in the early stages of development and is intended to become part of the Library Carpentry curriculum. The lesson aims to empower GLAM (Galleries, Libraries, Archives, and Museums) staff by giving them the foundation to support, participate in, and begin to undertake in their own right machine learning-based research and projects with heritage collections.

* YouTube link to the recording: https://youtu.be/oM6pz7iqses

* Zenodo link to the PowerPoint: https://zenodo.org/record/4700664#.YMckpW0zaUk

Speakers:

  • Estelle Bunout, University of Luxembourg

  • Maud Ehrmann, EPFL Digital Humanities Laboratory in Lausanne

Extracting content via text mining and making it accessible for scholarly research has often been discussed in the past decade, but noisy output has hindered its realisation. ‘Media Monitoring of the Past’ is an interdisciplinary research project in which a team of computational linguists, designers and historians seeks to integrate text mining into historical research workflows. The project builds on the datafication of a multilingual corpus of digitized historical newspapers from various transnational European collections. Its findings have been used to create the impresso app, which allows researchers to explore, use, and share the historical texts from the project’s corpus.

* YouTube link to the recording: https://youtu.be/fuGYc_svLXg

* Zenodo link to the PowerPoint: https://zenodo.org/record/4748906#.YMck1W0zaUk

Speaker:

  • Dr. Giles Bergel, University of Oxford - University College London


This workshop introduced the use of visual AI for collections research, access and management. Using the example of collaborations between Oxford’s Visual Geometry Group (VGG) and researchers and curators within the GLAM sector, the speaker provided a hands-on introduction to VGG’s open-source tools for visual search, classification, comparison and annotation. The workshop provided a benchmark for the state of the art in visual AI across the sector, and discussed critical and ethical issues around privacy, bias and accreditation.

* YouTube link to the recording: https://youtu.be/Ku7IZ1lNICk

* Zenodo link to the PowerPoint: https://zenodo.org/record/4749437#.YMck-W0zoUk

Hosted by the BnF

Speakers:

  • Audrey Altman, The Digital Public Library of America

  • Michael Della Bitta, The Digital Public Library of America


Apache Spark is a popular open-source engine for large-scale data processing, and its machine learning capabilities have many relevant applications for the GLAM community. With a local deployment and a modestly sized dataset, it can be used as a teaching and training tool for machine learning. It can also be used as part of a production system or research project with very large datasets, and it can be deployed economically to cloud clusters. This workshop demonstrated Spark’s machine learning capabilities and helped participants determine whether it would be a good fit for their projects.

* YouTube link to the recording: https://youtu.be/wCMkeEYBF34

* Zenodo link to the PowerPoint: https://zenodo.org/record/4749453#.YMclGm0zaUk