CRAC 2023

Sixth Workshop on Computational Models of Reference, Anaphora and Coreference

CRAC 2023, the Sixth Workshop on Computational Models of Reference, Anaphora and Coreference, was held at EMNLP 2023 in Singapore (December 6–7).

Topics

The workshop welcomes submissions describing theoretical and applied computational work on anaphora/coreference resolution. Topics of interest include but are not limited to:

About the workshop series

Background: Since 2016, the yearly CRAC workshop (and its predecessor, CORBON) has become the primary forum for researchers interested in the computational modeling of reference, anaphora, and coreference to discuss and publish their results. Over the years, the workshop series has organized five shared tasks, which stimulated interest in new problems in this area of research, facilitated the discussion and dissemination of results on new problems and directions (e.g., multimodal reference resolution), and helped expand a coreference community once dominated by European researchers to include young researchers from the Americas.

Objectives: The aim of the workshop is to provide a forum for work on all aspects of anaphora resolution and annotation, including both coreference and other types of anaphora such as bridging reference resolution and discourse deixis.

Previous editions: The series started as CORBON 2016, co-located with NAACL; CORBON 2017 was co-located with EACL. In 2018, the workshop's focus was broadened to cover all aspects of the computational modelling of reference, anaphora, and coreference, and it was renamed CRAC. CRAC 2018 and 2019 were held at NAACL, CRAC 2020 at COLING, CRAC 2021 at EMNLP, and CRAC 2022 again at COLING.

Our workshop in ACL Anthology: Please take a look at the proceedings of CORBON and CRAC in ACL Anthology.

CRAC 2023 Shared Task on Multilingual Coreference Resolution

CRAC 2023 also featured a presentation of the results of the Shared Task on Multilingual Coreference Resolution and an invited talk by Milan Straka on Recent Computational Approaches to Coreference Resolution.

Shared Task papers

Important dates

Accepted papers

Long papers

Short papers

Demo papers and Extended abstracts

Invited Talks

Bernd Bohnet: Multilingual Coreference Resolution with Innovative seq2seq Models

In this talk, we explore advancements in coreference resolution systems, focusing on our novel approach that leverages the text-to-text (seq2seq) paradigm of modern LLMs. We use multilingual T5 (mT5) as the foundational language model. Traditional coreference systems primarily employ search algorithms over possible spans. In contrast, our method jointly predicts mentions and links, achieving superior accuracy on the CoNLL-2012 datasets. Notably, our system recorded an 83.3 F1-score for English, surpassing previous research. Further evaluations on multilingual datasets, particularly Arabic and Chinese, yielded improvements over prior work, showcasing our model's multilingual transfer abilities across many languages. Additionally, our experiments with the SemEval-2010 datasets in various settings, including zero-shot and low-resource transfer, reveal significant performance improvements for other languages. We will discuss the capabilities of LLMs to provide a more streamlined, effective, and unified approach to coreference resolution.

Bernd Bohnet is a researcher in Natural Language Processing (NLP). He earned his Ph.D. with a specialization in text generation and subsequently served as a tenured Assistant Professor at the University of Birmingham. For the past nine years, Dr. Bohnet has carried out research at Google and Google DeepMind. His expertise encompasses a broad range of topics in natural language understanding, including tagging, parsing, coreference resolution, and reading comprehension. In recent years, he has turned his attention to Large Language Models (LLMs), focusing on their capabilities in factual accuracy and question answering, and on techniques for integrating such capabilities into LLMs.

Milan Straka: Recent Computational Approaches to Coreference Resolution

As in other areas of natural language processing, the performance of coreference resolution systems has improved steadily in recent years. Because coreference resolution is a complex structured prediction problem, quite a few approaches have been put forth, encompassing autoregressive and non-autoregressive decoding, diverse mention representations, and pretrained language models of varying size and kind. In this talk, I offer a review of prominent approaches and assess and compare them as independently as possible. Furthermore, building on the CorefUD initiative, which provides datasets in many languages, I empirically quantify the impact of multilingual and crosslingual transfer on the performance of the best system of the CRAC 2023 Shared Task on Multilingual Coreference Resolution.

Milan Straka is an assistant professor at the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic. He is the (co-)author of several shared-task-winning NLP tools, including UDPipe, a morphosyntactic analyzer currently covering 72 languages; PERIN, a semantic parser; and CorPipe, the winner of the CRAC 2022 and 2023 shared tasks on multilingual coreference resolution. His further research interests include named entity recognition, named entity linking, grammar error correction, and multilingual models in general.

Workshop schedule

December 6: CRAC 2023

Opening remarks

Invited talk

Coffee break

Paper session 1

Lunch break

Paper session 2

Findings paper session

Coffee break

Panel on Universal Anaphora

December 7: CRAC 2023 Shared Task on Multilingual Coreference Resolution

Invited talk

Overview paper talk

Coffee break

Shared task system demonstration session

Surprise presentation

Closing remarks

Program Committee

Organizing Committee