MEDIQA-CORR @ NAACL-ClinicalNLP 2024

Medical Error Detection & Correction

Motivation


Large language models (LLMs) have shown promise when applied to unseen tasks, often with competitive performance.

However, by construction, such models have a key vulnerability: their ability is only as good as their underlying training data. Because LLMs are trained on large corpora of text (often from the World Wide Web), their data is almost impossible to curate manually at scale. If the data contains false information, or only one perspective or type of information, the ability of LLMs to discern factual information may be hindered. Moreover, as a consequence of their own success, some online content may be entirely generated by LLMs, which are prone to hallucination. In addition, in specialized domains, online information can be unreliable or harmful and can contain logical inconsistencies that may hinder the models' reasoning ability. Despite this, most previous work on common sense detection has focused on the general domain [1-2].

In this task, we seek to address the problem of identifying and correcting (common sense) medical errors in clinical notes. From a human perspective, identifying and correcting these errors requires medical expertise and knowledge.


[1] SemEval-2020 Task 4: Commonsense Validation and Explanation. Cunxiang Wang, Shuailong Liang, Yili Jin, Yilong Wang, Xiaodan Zhu, Yue Zhang. 

[2] CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge. Yasumasa Onoe, Michael J.Q. Zhang, Eunsol Choi, Greg Durrett. 

Tasks


Participants will be given a snippet of clinical text and asked to:

Registration, Datasets & Evaluation 



Registration:


Schedule   

All deadlines are 11:59 PM UTC-12:00 (Anywhere on Earth).


Contact    


Organizers