Call for Participation

The fourth edition of the MEDIQA shared tasks includes three tasks on Multimodal Medical Answer Generation & Medical Error Correction, organized at CLEF & NAACL-ClinicalNLP 2024.  

1)  Multimodal & Multilingual Medical Answer Generation 

The rapid development of telecommunication technologies, the increased demand for healthcare services, and the needs created by the recent pandemic have accelerated the adoption of remote clinical diagnosis and treatment. In addition to live meetings with doctors, which may be conducted by telephone or video, asynchronous options such as e-visits, emails, and messaging chats have also proven to be cost-effective and convenient. We focus on the problem of generating responses to multimodal queries in clinical dermatology. Consumer health question answering has been the subject of past challenges and research; however, these prior works focus only on text. Previous work on visual question answering has focused mainly on radiology images and did not include additional clinical text input. Also, while there is much work on dermatology image classification, most of it addresses lesion malignancy classification for dermatoscope images. To the best of our knowledge, this is the first challenge and study of a problem that seeks to automatically generate clinical responses given a textual clinical history together with user-generated images and queries.

MEDIQA-MAGIC: Multimodal & Generative Telemedicine in Dermatology @ CLEF 2024, September 2024, Grenoble, France

MEDIQA-M3G: Multilingual & Multimodal Medical Answer Generation @ NAACL-ClinicalNLP, June 2024, Mexico City, Mexico 

2) Medical Error Detection & Correction 

Large language models (LLMs) show promise when applied to unseen tasks, often with competitive ability. However, by construction, such models have a key vulnerability: their ability is only as good as their underlying training data. Since LLMs rely on large corpora of textual data (often from the world wide web) for training, their data is almost impossible to manually curate at scale. If the data contains false information, or only one perspective or type of information, the ability of LLMs to discern factual information may be hindered. Also, as a consequence of their own success, some online content may be entirely generated by LLMs, which are prone to hallucinating information. In addition, in specialized domains, online information can be unreliable, harmful, and logically inconsistent, which may hinder the models' reasoning ability. However, most previous works on common sense detection have focused on the general domain. In this task, we seek to address the problem of identifying and correcting (common sense) medical errors in clinical notes. From a human perspective, both identifying and correcting these errors requires medical expertise and knowledge.

MEDIQA-CORR: Medical Error Detection & Correction @ NAACL-ClinicalNLP, June 2024, Mexico City, Mexico 

