MEDIQA 2021

Summarization in the Medical Domain 

MEDIQA is a series of shared tasks on Medical NLP. Previous edition: MEDIQA 2019

Introduction

MEDIQA 2021 tackles three summarization tasks in the medical domain: consumer health question summarization, multi-answer summarization, and radiology report summarization. In this shared task, we will also explore the use of different evaluation metrics for summarization.    

MEDIQA 2021 will be organized at the NAACL-BioNLP 2021 workshop. 

Join our mailing list: https://groups.google.com/g/mediqa-nlp

News

Tasks 

1) Summarization of Consumer Health Questions

Consumer health questions tend to contain substantial peripheral information that hinders automatic Question Answering (QA). Empirical QA studies based on manual expert summarization of these questions showed a performance improvement of 58% [1]. Effective automatic summarization methods for consumer health questions could therefore play a key role in enhancing medical question answering.

The goal of this task is to promote the development of new summarization approaches that specifically address the challenges of long and potentially complex consumer health questions.

Relevant approaches should be able to generate a condensed question expressing the minimum information required to find correct answers to the original question [2].  

2) Summarization of Multiple Answers

Different answers can bring complementary perspectives that are likely to benefit the users of QA systems. The goal of this task is to promote the development of multi-answer summarization approaches that can simultaneously solve the aggregation and summarization problems posed by multiple relevant answers to a medical question [4].

3) Summarization of Radiology Reports 

The automatic summarization of radiology reports has several clinical applications such as accelerating the radiology workflow and improving the efficiency of clinical communications. 

This task aims to promote the development of clinical summarization models that are able to generate radiology impression statements by summarizing textual findings written by radiologists [7-8].  

Datasets 

Task 1: Question Summarization


Task 2: Multi-Answer Summarization


Task 3: Radiology Report Summarization

Registration 

The registration & data usage agreement form is available under the Resources section of the AIcrowd projects.

The form covers the three tasks. You can download it from any of the three MEDIQA projects: QS@AIcrowd, MAS@AIcrowd & RRS@AIcrowd.  

To register, you need to complete, sign, and upload the form. Once approved, you will be able to download the official test sets and submit your runs on the AIcrowd submission systems.


Submission & Evaluation

The AIcrowd platform will be used for releasing the test sets and submitting runs: https://www.aicrowd.com/challenges/mediqa-2021

Submission Format for the three tasks:  

Evaluation Metrics: 

ROUGE [9] will be used as the main metric to rank the participating teams [10], but we will also use additional evaluation metrics better suited to each task, such as HOLMS [11] and CheXbert [12].
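As a rough illustration of the main metric, ROUGE-1 F1 can be sketched in a few lines of pure Python. This is a simplified sketch (whitespace tokenization, no stemming); the official ROUGE package [9] adds preprocessing and also reports ROUGE-2 and ROUGE-L:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram ROUGE-1 F1 between a reference and a candidate summary.

    Simplified sketch: lowercased whitespace tokenization only;
    the official implementation adds stemming and other options.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped unigram overlap between candidate and reference.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical question/summary pair for illustration.
ref = "what are the treatments for diabetes"
cand = "what treatments exist for diabetes"
score = rouge1_f1(ref, cand)  # 4 overlapping unigrams out of 5 and 6
```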


Official Results

MEDIQA 2021: Official Results

Organizers


Important Dates 

References   

[1] "On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering". Asma Ben Abacha & Dina Demner-Fushman. AMIA 2019 Informatics Summit.

[2] "On the Summarization of Consumer Health Questions". Asma Ben Abacha & Dina Demner-Fushman. ACL 2019. MeQSum Dataset.

[3] "Semantic Annotation of Consumer Health Questions". Halil Kilicoglu, Asma Ben Abacha, Yassine Mrabet, Sonya E. Shooshan, Laritza Rodriguez, Kate Masterton & Dina Demner-Fushman. BMC Bioinformatics, 2018. CHQs Dataset.

[4] "Question-Driven Summarization of Answers to Consumer Health Questions". Max E. Savery, Asma Ben Abacha, Soumya Gayen & Dina Demner-Fushman. Scientific Data, Nature, 2020. MEDIQA-AnS Dataset.

[5] "Consumer health information and question answering: helping consumers find answers to their health-related information needs". Dina Demner-Fushman, Yassine Mrabet & Asma Ben Abacha. JAMIA 2020.

[6] "A Question-Entailment Approach to Question Answering". Asma Ben Abacha & Dina Demner-Fushman. BMC Bioinformatics, 2019. MedQuAD Dataset.

[7] "Learning to Summarize Radiology Findings". Yuhao Zhang, Daisy Yi Ding, Tianpei Qian, Christopher D. Manning & Curtis P. Langlotz. EMNLP-LOUHI 2020.

[8] "Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports". Yuhao Zhang, Derek Merck, Emily Bao Tsai, Christopher D. Manning & Curtis P. Langlotz. ACL 2020.

[9] "ROUGE: A Package for Automatic Evaluation of Summaries". Chin-Yew Lin. ACL 2004.

[10] "Re-evaluating Evaluation in Text Summarization". Manik Bhandari, Pranav Gour, Atabak Ashfaq, Pengfei Liu & Graham Neubig. EMNLP 2020.

[11] "HOLMS: Alternative Summary Evaluation with Large Language Models". Yassine Mrabet & Dina Demner-Fushman. COLING 2020.

[12] "CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT". Akshay Smit, Saahil Jain, Pranav Rajpurkar, Anuj Pareek, Andrew Y. Ng & Matthew P. Lungren. EMNLP 2020.

[13] "MIMIC-CXR Database (version 2.0.0)". Johnson, A., Pollard, T., Mark, R., Berkowitz, S., & Horng, S. PhysioNet. 2019.

[14] "MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports". Johnson, A.E.W., Pollard, T.J., Berkowitz, S.J. et al. Sci Data 6, 317. 2019.