CRAC 2024
7th Workshop on Computational Models of Reference, Anaphora and Coreference
CRAC 2024, the Seventh Workshop on Computational Models of Reference, Anaphora and Coreference, was held in conjunction with EMNLP 2024 in Miami, Florida, on November 15, 2024.
About the workshop series
Since 2016, the yearly CRAC workshop (and its predecessor, CORBON) has been the primary forum for researchers interested in the computational modeling of reference, anaphora, and coreference to discuss and publish their results. Over the years, the workshop series has organized five shared tasks, which have stimulated interest in new problems in this area of research, facilitated the discussion and dissemination of results on new problems and directions (e.g., multimodal reference resolution), and helped expand a coreference community once dominated by European researchers to include young researchers from the Americas.
The aim of the workshop is to provide a forum where all aspects of computational work on anaphora resolution and annotation can be presented.
Topics
The workshop welcomes submissions describing theoretical and applied computational work on anaphora/coreference resolution. Topics of interest include but are not limited to:
coreference resolution for less-researched languages
annotation and interpretation of anaphoric relations, including relations other than identity coreference (e.g., bridging references)
investigation of difficult cases of anaphora and their resolution
coreference resolution in noisy data (e.g., social media)
new applications of coreference resolution
CRAC 2024 Shared Task on Multilingual Coreference Resolution
CRAC 2024 also featured the presentation of the results of the Shared Task on Multilingual Coreference Resolution. Please find the list of accepted Shared Task papers and the program of the Shared Task session below.
Important Dates
Shared Task submission deadline: June 24, 2024
Shared Task evaluation results: after June 24, 2024
Regular paper submission deadline: September 1, 2024
Shared Task system description submission deadline: September 1, 2024
ARR commitment date: September 22, 2024
Notification of acceptance: September 24, 2024
Camera-ready deadline: October 4, 2024
Accepted Papers
Long papers
Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM (Lauren Levine and Amir Zeldes) – Best Paper
DeepHCoref: A Deep Neural Coreference Resolution for Hindi Text (Kusum Lata, Kamlesh Dutta, Pardeep Singh and Abhishek Kanwar)
WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case (Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher and Dietrich Klakow)
Short papers
Polish Coreference Corpus as an LLM Testbed: Evaluating Coreference Resolution within Instruction-Following Language Models by Instruction-Answer Alignment (Karol Saputa, Angelika Peljak-Łapińska and Maciej Ogrodniczuk)
Enriching Conceptual Knowledge in Language Models through Metaphorical Reference Explanation (Zixuan Zhang and Heng Ji)
MSCAW-coref: Multilingual, Singleton and Conjunction-Aware Word-Level Coreference Resolution (Houjun Liu, John Bauer, Karel D'Oosterlinck, Christopher Potts and Christopher D. Manning)
Shared Task papers
Findings of the Third Shared Task on Multilingual Coreference Resolution (Michal Novák, Barbora Dohnalová, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský and Daniel Zeman)
CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text (Milan Straka)
End-to-end Multilingual Coreference Resolution with Headword Mention Representation (Ondřej Pražák and Miloslav Konopík)
Multilingual coreference resolution as text generation (Natalia Skachkova)
Invited Talk
Jackie Chi Kit Cheung: Reference at the Heart of Natural Language Processing
Natural language is traditionally framed as a mapping from form to content, with reference being the connection between the two. Yet curiously, large language models have achieved impressive levels of performance and adoption through training on distributional signals, which concern form alone. In this talk, I argue for the importance of reference and coreference in NLP, and discuss topics in NLP that are touched by these phenomena, including model "hallucinations" and factual errors, knowledge updating, common sense reasoning, and conversational agents. I discuss how existing evaluation practices based on large-scale benchmarking often mask the importance of reference-related phenomena, and present work from my lab that reflects on current evaluation practices and their validity. I call for more serious consideration of reference, including targeted evaluation of reference-related phenomena, as a necessary step towards achieving robust NLP systems.
Jackie Chi Kit Cheung is an associate professor at McGill University's School of Computer Science, where he co-directs the Reasoning and Learning Lab. He is a Canada CIFAR AI Chair and an Associate Scientific Co-Director at the Mila Quebec AI Institute. His research focuses on topics in natural language generation such as automatic summarization, and on integrating diverse knowledge sources into NLP systems for pragmatic and common-sense reasoning. He also works on applications of NLP to domains such as education, health, and language revitalization. He is motivated by how the structure of the world can be reflected in the structure of language processing systems. He is a consulting researcher at Microsoft Research Montreal.
Workshop schedule (November 15)
Opening remarks
9:00 – 9:15: Opening and welcome (Vincent Ng and Maciej Ogrodniczuk)
Invited talk
9:15 – 10:30: Reference at the Heart of Natural Language Processing (Jackie Chi Kit Cheung)
Findings paper session
11:00 – 11:20: Challenges to Evaluating the Generalization of Coreference Resolution Models: A Measurement Modeling Perspective (Ian Porada, Alexandra Olteanu, Kaheer Suleman, Adam Trischler and Jackie Chi Kit Cheung)
11:20 – 11:40: “Any Other Thoughts, Hedgehog?” Linking Deliberation Chains in Collaborative Dialogues (Abhijnan Nath, Videep Venkatesha, Mariah Bradford, Avyakta Chelle, Austin Collin Youngren, Carlos Mabrey, Nathaniel Blanchard and Nikhil Krishnaswamy)
11:40 – 11:50: MMAR: Multilingual and Multimodal Anaphora Resolution in Instructional Videos (Cennet Oguz, Pascal Denis, Simon Ostermann, Emmanuel Vincent, Natalia Skachkova and Josef van Genabith)
EMNLP 2024 paper
11:50 – 12:10: Major Entity Identification: A Generalizable Alternative to Coreference Resolution (Kawshik S. Manikantan, Shubham Toshniwal, Makarand Tapaswi and Vineet Gandhi)
Research paper session
13:50 – 14:00: Enriching Conceptual Knowledge in Language Models through Metaphorical Reference Explanation (Zixuan Zhang and Heng Ji)
14:00 – 14:10: Polish Coreference Corpus as an LLM Testbed: Evaluating Coreference Resolution within Instruction-Following Language Models by Instruction-Answer Alignment (Karol Saputa, Angelika Peljak-Łapińska and Maciej Ogrodniczuk)
14:10 – 14:30: MSCAW-coref: Multilingual, Singleton and Conjunction-Aware Word-Level Coreference Resolution (Houjun Liu, John Bauer, Karel D'Oosterlinck, Christopher Potts and Christopher D. Manning)
14:30 – 14:50: Unifying the Scope of Bridging Anaphora Types in English: Bridging Annotations in ARRAU and GUM (Lauren Levine and Amir Zeldes) – Best Paper
14:50 – 15:10: WinoPron: Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case (Vagrant Gautam, Julius Steuer, Eileen Bingert, Ray Johns, Anne Lauscher and Dietrich Klakow)
15:10 – 15:30: DeepHCoref: A Deep Neural Coreference Resolution for Hindi Text (Kusum Lata, Kamlesh Dutta, Pardeep Singh and Abhishek Kanwar)
Shared task paper session
16:00 – 16:30: Findings of the Third Shared Task on Multilingual Coreference Resolution (Michal Novák, Barbora Dohnalová, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský and Daniel Zeman)
16:30 – 16:50: CorPipe at CRAC 2024: Predicting Zero Mentions from Raw Text (Milan Straka)
16:50 – 17:10: End-to-end Multilingual Coreference Resolution with Headword Mention Representation (Ondřej Pražák and Miloslav Konopík)
17:10 – 17:20: Multilingual coreference resolution as text generation (Natalia Skachkova)
Panel discussion
17:20 – 17:50: The future of coreference resolution in the era of LLMs (Michal Novák, Ondřej Pražák and Martin Popel)
Closing remarks
17:50 – 18:00: Closing of the workshop (Maciej Ogrodniczuk)
Program Committee
Arie Cattan (Bar-Ilan University)
Sobha Lalitha Devi (AU-KBC Research Center, Anna University of Chennai)
Elisa Ferracane (Abridge)
Yulia Grishina (Amazon)
Christian Hardmeier (IT University of Copenhagen)
Lars Hellan (Norwegian University of Science and Technology)
Veronique Hoste (Ghent University)
Ekaterina Lapshinova-Koltunski (Saarland University)
Sharid Loáiciga (University of Gothenburg)
Costanza Navaretta (University of Copenhagen)
Michal Novák (Charles University in Prague)
Massimo Poesio (Queen Mary University of London)
Ian Porada (McGill University, Mila – Quebec AI Institute)
Carolyn Rosé (Carnegie Mellon University)
Juntao Yu (University of Essex)
Amir Zeldes (Georgetown University)
Yilun Zhu (Georgetown University)
Organizing Committee
Maciej Ogrodniczuk (Institute of Computer Science, Polish Academy of Sciences)
Massimo Poesio (Queen Mary University of London)
Sameer Pradhan (University of Pennsylvania and cemantix)
Anna Nedoluzhko (Charles University in Prague)
Vincent Ng (University of Texas at Dallas)