MEDIQA 2019
Textual Inference and Question Entailment in the Medical Domain
MEDIQA is a series of shared tasks on Medical NLP & AI.
Introduction
The MEDIQA 2019 challenge aims to attract further research efforts in Natural Language Inference (NLI), Recognizing Question Entailment (RQE), and their applications in medical Question Answering (QA). This ACL-BioNLP 2019 shared task is motivated by the need to develop relevant methods, techniques, and gold standards for inference and entailment in the medical domain, and to apply them to improve domain-specific IR and QA systems (Overview Paper).
Join our mailing list: https://groups.google.com/g/mediqa-nlp
News
December 2020: New edition of the MEDIQA shared task at NAACL-BioNLP'21
January 2020: A post-challenge round was released to allow submitting new runs on AICrowd https://www.aicrowd.com/organizers/mediqa-acl-bionlp
August 2019: Overview paper of the MEDIQA 2019 Shared Task https://www.aclweb.org/anthology/W19-5039
Tasks
1) NLI: The first task is to identify which of three inference relations holds between two sentences: Entailment, Neutral, or Contradiction [1].
2) RQE: This task focuses on identifying entailment between two questions in the context of QA. We use the following definition of question entailment: "a question A entails a question B if every answer to B is also a complete or partial answer to A" [2].
3) QA: The objective of this task is to filter and improve the ranking of automatically retrieved answers. The input rankings are generated by the medical QA system CHiQA. We strongly recommend reusing RQE and/or NLI systems (from the first two tasks) in the QA task [3-5].
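As an illustration of the QA task, the filter-and-rerank step can be sketched in Python as below. This is a minimal sketch, not the organizers' method or the CHiQA pipeline: the Answer class, the filter_and_rerank function, the threshold value, and the token-overlap scorer are all hypothetical names and choices introduced here. The overlap score is only a toy stand-in for the score an RQE or NLI system would produce.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Answer:
    """Hypothetical container for one retrieved answer."""
    answer_id: str
    text: str
    system_rank: int  # rank in the retrieved input list (1 = top)


def token_overlap(question: str, answer: str) -> float:
    """Toy relevance score: Jaccard overlap of lowercased whitespace tokens.

    Assumption: in a real system this would be replaced by an
    RQE- or NLI-based score.
    """
    q = set(question.lower().split())
    a = set(answer.lower().split())
    union = q | a
    return len(q & a) / len(union) if union else 0.0


def filter_and_rerank(
    question: str,
    answers: List[Answer],
    score: Callable[[str, str], float] = token_overlap,
    threshold: float = 0.1,  # hypothetical cut-off, not from the task definition
) -> List[Answer]:
    """Drop answers scoring below the threshold, then sort the rest by score."""
    scored = [(score(question, a.text), a) for a in answers]
    kept = [(s, a) for s, a in scored if s >= threshold]
    # Highest score first; ties fall back to the original system rank.
    kept.sort(key=lambda pair: (-pair[0], pair[1].system_rank))
    return [a for _, a in kept]


if __name__ == "__main__":
    retrieved = [
        Answer("a1", "Aspirin dosage for adults", 1),
        Answer("a2", "migraine headaches have several causes", 2),
        Answer("a3", "headaches are common", 3),
    ]
    reranked = filter_and_rerank("what causes migraine headaches", retrieved)
    print([a.answer_id for a in reranked])  # prints ['a2', 'a3']
```

Here the first retrieved answer shares no tokens with the question and is filtered out, while the remaining two are reordered by score. Passing a different score function is the intended extension point for plugging in an entailment model.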
Organizers
Asma Ben Abacha, NLM/NIH
Chaitanya Shivade, IBM
Dina Demner-Fushman, NLM/NIH
Important Dates
February 8, 2019: AICrowd projects go public: NLI@AICrowd, RQE@AICrowd & QA@AICrowd.
February 28, 2019: Release of the RQE validation set, run submission open.
March 19, 2019: Release of the QA validation set.
April 10, 2019: Run submission open on the QA validation set.
April 15, 2019: Release of the test sets.
April 30, 2019: Run submission deadline. Participants' results will be available on AIcrowd.
May 15, 2019: Paper submission deadline. Submission instructions
May 31, 2019: Notification of acceptance.
June 6, 2019: Camera-ready copy due (firm deadline because of the ACL schedule).
August 1, 2019: BioNLP workshop, ACL 2019, Florence, Italy.
Data & Evaluation
All datasets and evaluation scripts are available at: https://github.com/abachaa/MEDIQA2019 [6]
Training sets:
NLI: The MedNLI dataset, which includes 14,049 clinical sentence pairs [1]. Important: Participants must obtain access to MIMIC in order to access MedNLI and the test set.
RQE: The RQE collection containing 8,588 medical question pairs [2].
QA: Two sets of medical questions and the associated lists of answers retrieved by the medical QA system CHiQA and reranked manually: https://github.com/abachaa/MEDIQA2019/tree/master/MEDIQA_Task3_QA
In addition, the MedQuAD dataset of 47k question-answer pairs can be used to retrieve answered questions that are entailed by the original questions [3].
Validation and test sets:
QA Datasets + Submission on QA@AICrowd.
Evaluation measures: Accuracy for the NLI and RQE tasks. For the QA task: Mean Reciprocal Rank (MRR), Accuracy, Precision, and Spearman's Rank Correlation Coefficient.
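For orientation, the measures named above can be sketched in Python as follows. This is an illustrative sketch only: the official evaluation scripts in the GitHub repository are authoritative and may differ in input format and in details such as tie handling. The function names and input shapes are assumptions made here, and the Spearman formula used is the standard one for rankings without ties.

```python
from typing import Sequence


def accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Fraction of items whose predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def precision(gold_relevant: Sequence[bool], pred_relevant: Sequence[bool]) -> float:
    """Among the answers the system kept, the fraction that are truly relevant."""
    kept = [g for g, p in zip(gold_relevant, pred_relevant) if p]
    return sum(kept) / len(kept) if kept else 0.0


def mean_reciprocal_rank(ranked_relevance: Sequence[Sequence[bool]]) -> float:
    """Mean over questions of 1/rank of the first relevant answer (0 if none).

    Each inner sequence lists relevance flags in the system's ranked order.
    """
    if not ranked_relevance:
        raise ValueError("at least one question is required")
    total = 0.0
    for flags in ranked_relevance:
        for rank, relevant in enumerate(flags, start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_relevance)


def spearman_rho(gold_ranks: Sequence[int], system_ranks: Sequence[int]) -> float:
    """Spearman's rho for two tie-free rankings: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(gold_ranks)
    if n != len(system_ranks) or n < 2:
        raise ValueError("rankings must have the same length, at least 2")
    d_squared = sum((g - s) ** 2 for g, s in zip(gold_ranks, system_ranks))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


if __name__ == "__main__":
    print(accuracy(["entailment", "neutral"], ["entailment", "contradiction"]))
    print(precision([True, False, True, False], [True, True, False, False]))
    print(mean_reciprocal_rank([[False, True, False], [True, False]]))
    print(spearman_rho([1, 2, 3, 4], [1, 3, 2, 4]))
```

In this sketch, accuracy applies to label predictions such as the NLI and RQE outputs, while the other three functions operate on ranked answer lists for the QA task.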
Results:
72 teams submitted runs to one or more tasks: NLI-Leaderboard, RQE-Leaderboard, and QA-Leaderboard.
References
[1] A. Romanov & C. Shivade. Lessons from Natural Language Inference in the Clinical Domain. EMNLP 2018. DATA
[2] A. Ben Abacha & D. Demner-Fushman. Recognizing Question Entailment for Medical Question Answering. AMIA 2016. DATA
[3] A. Ben Abacha & D. Demner-Fushman. A Question-Entailment Approach to Question Answering. arXiv:1901.08079 [cs.CL], January 2019. DATA
[4] S. Harabagiu & A. Hickl. Methods for using textual entailment in open-domain question answering. ACL 2006.
[5] A. Ben Abacha, E. Agichtein, Y. Pinter & D. Demner-Fushman. Overview of the Medical Question Answering Task at TREC 2017 LiveQA. TREC 2017. DATA
[6] A. Ben Abacha, C. Shivade & D. Demner-Fushman. Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering. ACL-BioNLP 2019.