Evaluation of the Configuration Options in MetaMap for Processing Clinical Actionable Genomics Texts - A Pilot Study

Shreya Tellur  

Authors:   Shreya Tellur, Omika Merchant, and Dr. Xia Jing

Faculty Mentor:   Dr. Xia Jing

College: College of Behavioral, Social and Health Sciences



ABSTRACT 

Natural Language Processing (NLP) can facilitate efficient information processing. One application is in precision medicine, where NLP can automatically extract clinically actionable genomics information. The Unified Medical Language System (UMLS), a biomedical terminology hub that supports interoperability between computer systems, is developed by the National Library of Medicine (NLM). MetaMap is an NLP tool from NLM that identifies UMLS concepts in biomedical texts. Although MetaMap has been used broadly, its many configuration options make it challenging to process a specific type of information (e.g., genomics). Using classic information retrieval approaches, we manually evaluated biomedical text parsed by MetaMap by comparing its output with a given gold standard text. The effort focused on the behavior option of MetaMap, which includes 17 items. To judge the output more objectively, we developed metrics that classified results as exact, similar, or incorrect mappings. We then calculated the precision, recall, and F measure (β = 0.33) for each of the 17 items. Based on the F measure, we deemed options relevant (≥ 50%), too broad (40.0% - 49.9%), too specific (30.0% - 39.9%), or not relevant (≤ 29.9%) to create a comprehensive table of configurations. Our results showed that 12 MetaMap items under the behavior option provided the most relevant results.
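The scoring described in the abstract can be illustrated with a minimal sketch. It uses the standard F-beta formula, F = (1 + β²)·P·R / (β²·P + R), in which β = 0.33 weights precision more heavily than recall, and applies the four percentage bands stated above. The abstract does not say how exact, similar, and incorrect mappings were tallied into true positives, false positives, and false negatives, so the count arguments and all function names here are hypothetical, chosen only for illustration.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw counts (hypothetical tallies)."""
    retrieved = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / retrieved if retrieved else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


def f_beta(precision, recall, beta=0.33):
    """Standard F-beta measure; beta < 1 favors precision over recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def categorize(f_score):
    """Map an F measure in [0, 1] to the bands given in the abstract."""
    pct = f_score * 100
    if pct >= 50:
        return "relevant"
    if pct >= 40:
        return "too broad"
    if pct >= 30:
        return "too specific"
    return "not relevant"


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    p, r = precision_recall(true_pos=6, false_pos=2, false_neg=6)
    score = f_beta(p, r)
    print(f"precision={p:.2f} recall={r:.2f} F={score:.3f} -> {categorize(score)}")
```

Because β is below 1, an option that returns few but mostly correct concepts scores higher than one that returns many concepts with lower accuracy, which appears consistent with the study's focus on relevance of the mappings.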



Video Introduction 

Shreya Tellur 2020 Undergraduate Research Symposium