Task 7: Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT)
at SemEval 2023
Codalab competition
The dataset is released! Please visit the Codalab competition website to participate in the task.
See the Dataset paper
Motivation
In recent years, there has been a significant increase in publications of Clinical Trial Reports (CTRs). Currently, there exist more than 10,000 CTRs for breast cancer alone. Over time, it has become infeasible for clinical practitioners to stay updated on all current literature in order to provide personalised evidence-based care (DeYoung et al., 2020). In this context, Natural Language Inference (NLI) offers an opportunity to support the large-scale interpretation and retrieval of medical evidence. Successful development could significantly enhance the way we connect the latest evidence to support personalised care (Sutton et al., 2020).
Task Overview
Multi-evidence Natural Language Inference for Clinical Trial Data (NLI4CT)
This task is based on a collection of breast cancer CTRs (extracted from https://clinicaltrials.gov/ct2/home), together with statements, explanations, and labels annotated by domain experts. It consists of two sub-tasks; participants may take part in one or both, depending on their preference.
Task 1: Textual Entailment
For the purpose of the task, we have summarised the collected CTRs into 4 sections:
Eligibility criteria - A set of conditions that patients must meet in order to take part in the clinical trial.
Intervention - Information concerning the type, dosage, frequency, and duration of the treatments being studied.
Results - The number of participants in the trial, outcome measures, units, and results.
Adverse events - Signs and symptoms observed in patients during the clinical trial.
The annotated statements are sentences, with an average length of 19.5 tokens, that make a claim about the information contained in one of the sections of the CTR premise. A statement may make a claim about a single CTR or compare two CTRs. Task 1 is to determine the inference relation (entailment vs. contradiction) between CTR–statement pairs.
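As a rough illustration of the task's input/output shape, a single-CTR entailment instance could be represented as below. This is a minimal sketch: all field names and values here are assumptions for illustration, not the actual dataset schema, which is documented on the Codalab competition page.

```python
# Hypothetical Task 1 instance (field names are assumptions, not the
# official schema published with the dataset).
instance = {
    "type": "Single",             # "Single" (one CTR) or "Comparison" (two CTRs)
    "section_id": "Eligibility",  # one of the four summarised CTR sections
    "primary_id": "NCT00000000",  # placeholder trial identifier
    "statement": "Patients with prior chemotherapy are excluded.",
    "label": "Contradiction",     # or "Entailment"
}

def is_comparison(inst: dict) -> bool:
    """A statement may reference one CTR or compare two CTRs."""
    return inst["type"] == "Comparison"

print(is_comparison(instance))  # → False (single-CTR statement)
```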
Task 2: Evidence retrieval
Given a CTR premise and a statement, output the set of supporting facts, extracted from the premise, that are necessary to justify the label predicted in Task 1.
Organisers
Mael Jullien - University of Manchester
Marco Valentino - University of Manchester, Idiap Research Institute
Hannah Frost - Digital ECMT, University of Manchester
Paul O'Regan - Digital ECMT, University of Manchester
Donal Landers - Digital ECMT, University of Manchester
Andre Freitas - University of Manchester, Idiap Research Institute
Contacts
Please join our Google group for direct communication at nli4ct@googlegroups.com
For any questions, feel free to contact us at nli4clinicaltrials@gmail.com
Follow us on Twitter @NLI4CT
References
Reed T Sutton, David Pincock, Daniel C Baumgart, Daniel C Sadowski, Richard N Fedorak, and Karen I Kroeker. 2020. An overview of clinical decision support systems: benefits, risks, and strategies for success. NPJ digital medicine, 3(1):1–10.
Abhilasha Ravichander, Aakanksha Naik, Carolyn Penstein Rosé, and Eduard H. Hovy. 2019. EQUATE: A benchmark evaluation framework for quantitative reasoning in natural language inference. CoRR, abs/1901.03735.
Alexandre Galashov, Jonathan Schwarz, Hyunjik Kim, Marta Garnelo, David Saxton, Pushmeet Kohli, S. M. Ali Eslami, and Yee Whye Teh. 2019. Meta-learning surrogate models for sequential decision making. CoRR, abs/1903.11907.
Jinhyuk Lee, Wonjin Yoon, Sungdong Kim, Donghyeon Kim, Sunkyu Kim, Chan Ho So, and Jaewoo Kang. 2019. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. CoRR, abs/1901.08746.
Jay DeYoung, Eric P. Lehman, Benjamin E. Nye, Iain James Marshall, and Byron C. Wallace. 2020. Evidence inference 2.0: More data, better models. arXiv, abs/2005.04177.