CRAC 2024, the Seventh Workshop on Computational Models of Reference, Anaphora and Coreference, was held on November 15 in Miami, Florida, co-located with EMNLP 2024.
Since 2016, the yearly CRAC workshop (and its predecessor, CORBON) has been the primary forum for researchers interested in the computational modeling of reference, anaphora, and coreference to discuss and publish their results. Over the years, the workshop series has organized five shared tasks, which have stimulated interest in new problems in this area of research, facilitated the discussion and dissemination of results on new problems and directions (e.g., multimodal reference resolution), and helped expand a coreference community once dominated by European researchers to include young researchers from the Americas.
The aim of the workshop is to provide a forum for presenting work on all aspects of computational anaphora resolution and annotation.
The workshop welcomes submissions describing theoretical and applied computational work on anaphora/coreference resolution. Topics of interest include but are not limited to:
coreference resolution for less-researched languages
annotation and interpretation of anaphoric relations, including relations other than identity coreference (e.g., bridging references)
investigation of difficult cases of anaphora and their resolution
coreference resolution in noisy data (e.g., social media)
new applications of coreference resolution
CRAC 2024 also featured the presentation of the results of the Shared Task on Multilingual Coreference Resolution. The list of accepted Shared Task papers and the program of the Shared Task session appear below.
Shared Task submission deadline: June 24, 2024
Shared Task evaluation results: after June 24, 2024
Regular paper submission deadline: September 1, 2024
Shared Task system description submission deadline: September 1, 2024
ARR commitment date: September 22, 2024
Notification of acceptance: September 24, 2024
Camera-ready deadline: October 4, 2024
Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM (Lauren Levine and Amir Zeldes) – Best Paper
DeepHCoref: A Deep Neural Coreference Resolution for Hindi Text (Kusum Lata, Kamlesh Dutta, Pardeep Singh and Abhishek Kanwar)
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case (Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher and Dietrich Klakow)
Polish Coreference Corpus as an LLM Testbed: Evaluating Coreference Resolution within Instruction-Following Language Models by Instruction-Answer Alignment (Karol Saputa, Angelika Peljak-Łapińska and Maciej Ogrodniczuk)
Enriching Conceptual Knowledge in Language Models through Metaphorical Reference Explanation (Zixuan Zhang and Heng Ji)
MSCAW-coref: Multilingual, Singleton and Conjunction-Aware Word-Level Coreference Resolution (Houjun Liu, John Bauer, Karel D'Oosterlinck, Christopher Potts and Christopher D. Manning)
Findings of the Third Shared Task on Multilingual Coreference Resolution (Michal Novák, Barbora Dohnalová, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský and Daniel Zeman)
CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text (Milan Straka)
End-to-end Multilingual Coreference Resolution with Headword Mention Representation (Ondřej Pražák and Miloslav Konopík)
Multilingual coreference resolution as text generation (Natalia Skachkova)
Natural language is traditionally framed as a mapping from form to content, with reference providing the connection between the two. Yet curiously, large language models have achieved impressive levels of performance and adoption through training on distributional signals, which concern form alone. In this talk, I argue for the importance of reference and coreference in NLP, and discuss topics in NLP that are touched by these phenomena, including model "hallucinations" and factual errors, knowledge updating, common-sense reasoning, and conversational agents. I discuss how existing evaluation practices based on large-scale benchmarking often mask the importance of reference-related phenomena, and present work from my lab that reflects on current evaluation practices and their validity. I call for more serious consideration of reference, including targeted evaluation of reference-related phenomena, as a necessary step toward achieving robust NLP systems.
Jackie Chi Kit Cheung is an associate professor at McGill University's School of Computer Science, where he co-directs the Reasoning and Learning Lab. He is a Canada CIFAR AI Chair and an Associate Scientific Co-Director at the Mila Quebec AI Institute. His research focuses on topics in natural language generation such as automatic summarization, and on integrating diverse knowledge sources into NLP systems for pragmatic and common-sense reasoning. He also works on applications of NLP to domains such as education, health, and language revitalization. He is motivated by how the structure of the world can be reflected in the structure of language processing systems. He is a consulting researcher at Microsoft Research Montreal.
9:00 – 9:15: Opening and welcome (Vincent Ng and Maciej Ogrodniczuk)
9:15 – 10:30: Reference at the Heart of Natural Language Processing (Jackie Chi Kit Cheung)
11:00 – 11:20: Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective (Ian Porada, Alexandra Olteanu, Kaheer Suleman, Adam Trischler and Jackie Chi Kit Cheung)
11:20 – 11:40: “Any Other Thoughts, Hedgehog?” Linking Deliberation Chains in Collaborative Dialogues (Abhijnan Nath, Videep Venkatesha, Mariah Bradford, Avyakta Chelle, Austin Collin Youngren, Carlos Mabrey, Nathaniel Blanchard and Nikhil Krishnaswamy)
11:40 – 11:50: MMAR: Multilingual and Multimodal Anaphora Resolution in Instructional Videos (Cennet Oguz, Pascal Denis, Simon Ostermann, Emmanuel Vincent, Natalia Skachkova and Josef van Genabith)
11:50 – 12:10: Major Entity Identification: A Generalizable Alternative to Coreference Resolution (Kawshik S. Manikantan, Shubham Toshniwal, Makarand Tapaswi and Vineet Gandhi)
13:50 – 14:00: Enriching Conceptual Knowledge in Language Models through Metaphorical Reference Explanation (Zixuan Zhang and Heng Ji)
14:00 – 14:10: Polish Coreference Corpus as an LLM Testbed: Evaluating Coreference Resolution within Instruction-Following Language Models by Instruction-Answer Alignment (Karol Saputa, Angelika Peljak-Łapińska and Maciej Ogrodniczuk)
14:10 – 14:30: MSCAW-coref: Multilingual, Singleton and Conjunction-Aware Word-Level Coreference Resolution (Houjun Liu, John Bauer, Karel D'Oosterlinck, Christopher Potts and Christopher D. Manning)
14:30 – 14:50: Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM (Lauren Levine and Amir Zeldes) – Best Paper
14:50 – 15:10: WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case (Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher and Dietrich Klakow)
15:10 – 15:30: DeepHCoref: A Deep Neural Coreference Resolution for Hindi Text (Kusum Lata, Kamlesh Dutta, Pardeep Singh and Abhishek Kanwar)
16:00 – 16:30: Findings of the Third Shared Task on Multilingual Coreference Resolution (Michal Novák, Barbora Dohnalová, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský and Daniel Zeman)
16:30 – 16:50: CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text (Milan Straka)
16:50 – 17:10: End-to-end Multilingual Coreference Resolution with Headword Mention Representation (Ondřej Pražák and Miloslav Konopík)
17:10 – 17:20: Multilingual coreference resolution as text generation (Natalia Skachkova)
17:20 – 17:50: The future of coreference resolution in the era of LLMs (Michal Novák, Ondřej Pražák and Martin Popel)
17:50 – 18:00: Closing of the workshop (Maciej Ogrodniczuk)
Arie Cattan (Bar-Ilan University)
Sobha Lalitha Devi (AU-KBC Research Center, Anna University of Chennai)
Elisa Ferracane (Abridge)
Yulia Grishina (Amazon)
Christian Hardmeier (IT University of Copenhagen)
Lars Hellan (Norwegian University of Science and Technology)
Veronique Hoste (Ghent University)
Ekaterina Lapshinova-Koltunski (Saarland University)
Sharid Loáiciga (University of Gothenburg)
Costanza Navaretta (University of Copenhagen)
Michal Novák (Charles University in Prague)
Massimo Poesio (Queen Mary University of London)
Ian Porada (McGill University, Mila – Quebec AI Institute)
Carolyn Rosé (Carnegie Mellon University)
Juntao Yu (University of Essex)
Amir Zeldes (Georgetown University)
Yilun Zhu (Georgetown University)
Maciej Ogrodniczuk (Institute of Computer Science, Polish Academy of Sciences)
Massimo Poesio (Queen Mary University of London)
Sameer Pradhan (University of Pennsylvania and cemantix)
Anna Nedoluzhko (Charles University in Prague)
Vincent Ng (University of Texas at Dallas)