Semantic Search in Web Archives

Dr John Moore, National Archives

In this talk, we will introduce some of the challenges The National Archives faces when working with data at scale, and show how AI can help with practical tasks such as supporting OCR of documents and document selection. We will give practical examples and demonstrate the benefits of using an LLM to support searching our Web Archive. We will highlight the key challenges of working with the metadata required to support provenance and explainability. Finally, we will discuss some of the environmental and engineering challenges we are likely to face in conducting this work.
