The First Workshop on Large Language Model Memorization
May 16, 2025: The ARR commitment page is live; submit by May 20
January 13th, 2025: Call for Papers out now!
December 18th, 2024: Website is live
Large language models (LLMs) are known to memorize their training data. In recent years, this phenomenon has inspired multiple distinct research directions. Some researchers focus on understanding LLM memorization, attempting to localize memorized knowledge or identify which examples are most likely to be memorized. Other researchers aim to edit or remove information that an LLM has memorized. Still others study the downstream implications of LLM memorization, including legal concerns associated with memorizing copyrighted articles, privacy risks associated with LLMs leaking private information, and benchmarking concerns that arise when LLMs memorize test data.
The First Workshop on Large Language Model Memorization (L2M2), co-located with ACL 2025 in Vienna, seeks to provide a central venue for researchers studying LLM memorization from these different angles.
Title: On Testing Memorization in AI: From Brute-Force Methods to Robust Statistical Tests
Abstract: I will present various approaches for testing memorization and will discuss their strengths and limitations. In particular, I will cover proof-of-concept brute-force extraction attacks, high-power membership inference attacks, range membership inference attacks, and recent developments in dataset inference attacks.
Bio: Reza Shokri is a research scientist at Google (and on-leave professor of computer science at NUS). His research focuses on security and privacy in AI. He is a recipient of the Asian Young Scientist Fellowship 2023, Intel's 2023 Outstanding Researcher Award, the IEEE Security and Privacy Test-of-Time Award 2021, and the Caspar Bowden Award for Outstanding Research in Privacy Enhancing Technologies in 2018.
Title: Memorizing Distributions: Generative Models Recall More Than Single Examples
Abstract: Research on memorization in generative models often asks a narrow question: can the model reproduce a training example word-for-word (or pixel-for-pixel)? Yet verbatim copying is rare. Much more often, models internalize entire statistical distributions—the pastel color palette of a Miyazaki frame, the gender ratios hidden in image captions, or the recurrent POS-tag templates of newswire prose.
In this talk, I will present several works that study distributional memorization in generative models, illustrating how they illuminate different facets of the phenomenon, its ties to generalization, and the frequency-driven factors that determine what gets memorized. I will unpack the practical challenges involved: choosing meaningful concepts, designing sound measurement protocols, and scaling analyses to web-scale data. I will conclude with the open questions and challenges that remain, and (hopefully) convince the audience to start thinking and working on distributional memorization!
Bio: Yanai Elazar is a Postdoctoral Researcher at AI2 and the University of Washington. Before that, he did his PhD in Computer Science at Bar-Ilan University. He is interested in the science of generative models, for which he develops tools and algorithms for understanding what makes models work, how, and why, with a special interest in the data on which these models are trained.
Title: Emergent Misalignment Through the Lens of Semantic Memorization
Abstract: As AI agents increasingly operate across multiple modalities and languages, they introduce novel privacy and memorization risks that current evaluations fail to capture—with existing studies remaining largely monolingual and focused on single modalities, creating a dangerous blind spot in our understanding of data leakage. In this talk, we present a new, memorization-based lens for studying emergent misalignment. First, we examine assisted memorization—a concerning form of misalignment where models finetuned on non-overlapping datasets begin producing personally identifiable information they would not have exposed in their base form, revealing how seemingly benign adaptations can create unexpected privacy vulnerabilities. Second, we demonstrate cross-lingual leakage, showing how models leak training data when prompted in languages different from the original training content. Finally, we introduce a novel attack leveraging non-literal memorization of text to extract training data from multi-modal models, with successful demonstrations on systems like Yue (lyrics to music) and Google’s Veo 3. Looking ahead, our findings highlight the need to shift toward more dynamic benchmarks that can capture these nuanced forms of information leakage across modalities and languages, while developing protection methods that address the full spectrum of emergent memorization behaviors.
Bio: Niloofar Mireshghallah is an incoming assistant professor at CMU (EPP & LTI) and a Research Scientist at FAIR. Before that, she was a postdoctoral scholar at the Paul G. Allen Center for Computer Science & Engineering at the University of Washington. She received her Ph.D. from the CSE department of UC San Diego in 2023. Her research interests are Trustworthy Machine Learning and Natural Language Processing. She is a recipient of the National Center for Women & IT (NCWIT) Collegiate Award in 2020 for her work on privacy-preserving inference, a finalist for the Qualcomm Innovation Fellowship in 2021, and a recipient of the 2022 Rising Star in Adversarial ML Award.
Ameya Godbole
Anshuman Suri
Anvith Thudi
Aryaman Arora
Avi Schwarzschild
Bihe Zhao
Boyi Wei
Chen Bowen
Chiyuan Zhang
Cristina Lopes
Dongjun Jang
Fan Wu
Igor Shilov
James Flemings
Jing Huang
Juraj Vladika
Kanyao Han
Leonardo Ranaldi
Lucie Charlotte Magister
Marco Bombieri
Marius Mosbach
Martin Tutek
Masaru Isonuma
Matthew Finlayson
Matthew Jagielski
Matthieu Meeus
Max Ploner
Md Nishat Raihan
Mohammad Aflah Khan
Patrick Haller
Peter Carragher
Ryan Wang
Shotaro Ishihara
Skyler Hallinan
Ting-Yun Chang
Vikas Raunak
Xiangyu Qi
Yuzheng Hu