NumEval: Numeral-Aware Language Understanding and Generation

SemEval-2024 Task 7

June 20–21, 2024 (co-located with NAACL 2024 in Mexico City, Mexico)

Introduction

In previous SemEval competitions, most tasks have focused on analyzing the words in a text, with scant consideration given to numerical data. However, we have observed that comprehension of numerical values can significantly enhance performance in certain tasks.

Consider, for instance, a scenario where one anticipates a 30% rise in stock prices versus a 3% rise. This nuance plays a pivotal role in fine-grained sentiment analysis (SemEval-2017 Task 5), as the former implies a stronger sentiment than the latter. Similarly, in a legal context such as SemEval-2023 Task 6, "Stealing 10 dollars" and "Stealing 100,000 dollars" could lead to different court judgments. Likewise, in clinical inference (SemEval-2023 Task 7), a patient's systolic blood pressure reading of 119 versus 121 could carry contrasting implications. These examples underscore the significance of understanding numerical data in text and point to a research direction that could improve performance in downstream tasks.

Lately, attention to numbers in textual data and to models' numeracy has been growing within the NLP community. We believe that the time is ripe, and that it is indeed necessary, to establish a testbed for evaluating the performance of current high-performing models in numeral-aware language comprehension and generation.

Important Dates

All deadlines are 23:59 UTC-12 ("anywhere on Earth").