Kyubum Lee (Ph.D.)
Postdoctoral Fellow,
Moffitt Cancer Center
Tampa, Florida, USA

E-mail: KYUBUM.LEE [at] moffitt [dot] org
            KYUBUMLEE [at] gmail [dot] com
            

Summary
  • Expertise in machine learning, natural language processing (NLP), and biomedical informatics
  • Extensive experience and in-depth knowledge of deep learning, statistics, data/text analysis, visualization, data integration, and biomedical text data
  • Spearhead multiple projects collaborating with researchers in the computer science and biomedical fields
    - First author of 7 publications and co-author of 13 publications in peer-reviewed SCI journals
    - Built 9 web-based services, each with more than 1K users, including LitVar, ChimerDB 3.0, HiPub, BEST, and BRONCO
  • Fluent in Python, with extensive experience in various machine learning and BioNLP tools
Experience
            Cancer-related biomedical informatics projects
           • Cancer genomic data integration and analysis
           • Histopathological image analysis
    
• Orchestrated a research project using deep learning for biomedical literature triage – Collaboration with the European Bioinformatics Institute (UK) and the Swiss Institute of Bioinformatics (Switzerland); reduced the manual workload by nearly 70% using machine learning and NLP
• Develop a web-based machine learning platform that provides tools for classifying biomedical literature
• Conduct analyses of publications on precision health to identify human genes of translational value – Collaboration with the Centers for Disease Control and Prevention (CDC)
• Participated in designing the search engine LitVar which retrieves genomic variants in biomedical documents in PubMed and PMC 

• Orchestrated a research project on building a cancer mutation knowledge-base (VarDrugPub) which involved using deep learning for extracting gene-drug-disease-mutation relations from biomedical literature 
• Mentoring graduate/undergraduate students
• Built Biomedical Entity Relation Corpus (BRONCO) which contains gene-drug-disease-mutation relations found in biomedical texts
• Extracted gene-drug relations from biomedical texts using NLP tools for creating the Drug Signatures Database (DSigDB)
• Worked jointly on biomedical projects with researchers in translational bioinformatics and cancer biology 


Links for Recent Projects

Literature Triage using Machine Learning (First Author) [GitHub] [Publication in PLOS Computational Biology]

VarDrugPub: Variant-Gene-Drug relations Database (First Author) [Link] [Publication in BMC Bioinformatics] 

LitVar: Search Engine for Genomic Variants in PubMed and PMC [Link] [Publication in Nucleic Acids Research]

HiPub: An application that translates biomedical texts to networks (First Author) [Link] [Publication in Bioinformatics] [AltMetric]

BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations (First Author) [Link] [Publication in Database]

ChimerDB 3.0 (Co-first author, Built a fusion gene DB using Text-mining - ChimerPub) [Link] [Wikipedia] [Publication in Nucleic Acids Research] [PDF]

BEST: Biomedical Entity Search Tool (Actively Involved) [Link] [Publication in PLOS ONE]

BEReX: Biomedical Entity-Relationship eXplorer (Actively Involved) [Link] [Publication in Bioinformatics]

DSigDB: Drug SIGnatures DataBase (Involved in building text-mined drug-gene sets) [Link] [Publication in Bioinformatics]


Other Links:

Google Scholar
ResearchGate
PubMed
LinkedIn


Awards:
  • Recognition for Honorary Award (Recognition and Appreciation of Special Achievement): National Library of Medicine, National Institutes of Health, U.S. Department of Health and Human Services; Dec. 2018
  • Best Paper of the Year Award: Korea University; Feb. 2017

Publications:

Journals:

    • Kyubum Lee, Chih-Hsuan Wei, and Zhiyong Lu*: Recent advances of automated methods for searching and extracting genomic variant information from biomedical literature. Briefings in Bioinformatics, 2020 [Full Text]

    • Qingyu Chen, Kyubum Lee, Shankai Yan, Sun Kim, and Zhiyong Lu*: BioConceptVec: creating and evaluating literature-based biomedical concept embeddings on a large scale. PLOS Computational Biology, 2020. [Full Text]

    • Kyubum Lee, Mindy Clyne, Wei Yu, Zhiyong Lu*, and Muin Khoury*: Tracking human genes along the translational continuum. npj Genomic Medicine, 2019. [Full Text]

    • Kyubum Lee, Maria Livia Famiglietti, Aoife McMahon, Chih-Hsuan Wei, Jacqueline Ann Langdon MacArthur, Sylvain Poux, Lionel Breuza, Alan Bridge, Fiona Cunningham, Ioannis Xenarios and Zhiyong Lu*: Scaling up data curation using deep learning: An application to literature triage in genomic variation resources. PLOS Computational Biology, 2018. [Full Text] [Hot paper of the week at NIH Intramural Research News Letter]

    • Alexis Allot†, Yifan Peng†, Chih-Hsuan Wei, Kyubum Lee, Lon Phan, and Zhiyong Lu*: LitVar: a semantic search engine for linking genomic variant data in PubMed and PMC. Nucleic Acids Research, 2018. [Full Text] [Database Link]

    • Kyubum Lee, Byounggun Kim, Yonghwa Choi, Sunkyu Kim, Wonho Shin, Sunwon Lee, Sungjoon Park, Seongsoon Kim, Aik Choon Tan* and Jaewoo Kang*: Deep learning of mutation-gene-drug relations from the literature. BMC Bioinformatics, 2018. DOI: 10.1186/s12859-018-2029-1 [Full Text] [Database Link]

    • Sangrak Lim, Kyubum Lee, Jaewoo Kang: Drug-drug interaction extraction from the literature using a recursive neural network. PLoS ONE, 2018. DOI: 10.1371/journal.pone.0190926 [Full Text]

    • Seongsoon Kim, Donghyeon Park, Yonghwa Choi, Kyubum Lee, Byounggun Kim, Minji Jeon, Jihye Kim, Aik Choon Tan, Jaewoo Kang*: A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis. JMIR Medical Informatics, 2017. DOI: 10.2196/medinform.8751 [Full Text]

    • Myunggyo Lee†, Kyubum Lee†, Namhee Yu†, Insu Jang†, Ikjung Choi, Pora Kim, Ye Eun Jang, Byounggun Kim, Sunkyu Kim, Byungwook Lee, Jaewoo Kang*, and Sanghyuk Lee*: ChimerDB 3.0: an enhanced database for fusion genes from cancer transcriptome and literature data mining. Nucleic Acids Research, 2017. DOI:10.1093/nar/gkw1083 († These authors contributed equally) [Full Text] [Database Link]

    • Kyubum Lee, Wonho Shin, Byunggun Kim, Sunwon Lee, Yonghwa Choi, Sunkyu Kim, Minji Jeon, Aik Choon Tan* and Jaewoo Kang*: HiPub: Translating PubMed and PMC Texts to Networks for Knowledge Discovery. Bioinformatics 08/2016; 32(18). DOI:10.1093/bioinformatics/btw511 [Full Text] [Link]

    • Kyubum Lee, Sunwon Lee, Sungjoon Park, Sunkyu Kim, Suhkyung Kim, Kwanghun Choi, Aik Choon Tan* and Jaewoo Kang*: BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations. Database: The Journal of Biological Databases and Curation 04/2016; 2016. DOI:10.1093/database/baw043 [Full Text]

    • Jocelyn Barbosa, Kyubum Lee, Sunwon Lee, Bilal Lodhi, Jae-Gu Cho, Woo-Keun Seo, Jaewoo Kang*: Efficient quantitative assessment of facial paralysis using iris segmentation and active contour-based key points detection with hybrid classifier. BMC Medical Imaging 12/2016; 16(1). DOI:10.1186/s12880-016-0117-0 [Full Text]

    • Sunwon Lee†, Donghyeon Kim†, Kyubum Lee, Jaehoon Choi, Seongsoon Kim, Minji Jeon, Sangrak Lim, Donghee Choi, Sunkyu Kim, Aik-Choon Tan, Jaewoo Kang*: BEST: Next-Generation Biomedical Entity Search Tool for Knowledge Discovery from Biomedical Literature. PLoS ONE 10/2016; 11(10). DOI:10.1371/journal.pone.0164680 († These authors contributed equally to the work.) [Full Text] [Link]

    • Minjae Yoo, Jimin Shin, Jihye Kim, Karen A Ryall, Kyubum Lee, Sunwon Lee, Minji Jeon, Jaewoo Kang, Aik Choon Tan*: DSigDB: Drug Signatures Database for Gene Set Analysis. Bioinformatics 05/2015; 31(18). DOI:10.1093/bioinformatics/btv313 [Link] [Full Text]

    • Woo Keun Seo, Jaewoo Kang, Minji Jeon, Kyubum Lee, Sunwon Lee, Ji Hyun Kim, Kyungmi Oh, Seong Beom Koh: Feasibility of Using a Mobile Application for the Monitoring and Management of Stroke-Associated Risk Factors. Journal of Clinical Neurology 04/2015; 11(2). DOI:10.3988/jcn.2015.11.2.142 [Link]

    • Minji Jeon, Sunwon Lee, Kyubum Lee, Aik-Choon Tan, Jaewoo Kang*: BEReX: Biomedical Entity-Relationship eXplorer. Bioinformatics 01/2014; 30(1). DOI:10.1093/bioinformatics/btt598 [Link] [Full Text]

    • Junkyu Lee, Seongsoon Kim, Sunwon Lee, Kyubum Lee, Jaewoo Kang*: On the efficacy of per-relation basis performance evaluation for PPI extraction and a high-precision rule-based approach. BMC Medical Informatics and Decision Making 04/2013; 13(1). DOI:10.1186/1472-6947-13-S1-S7 [Link]

    • Jaehoon Choi, Donghyeon Kim, Seongsoon Kim, Sunwon Lee, Kyubum Lee, Jaewoo Kang*: BOSS: Context-enhanced search for biomedical objects. BMC Medical Informatics and Decision Making 04/2012; 12 Suppl 1(Suppl 1). DOI:10.1186/1472-6947-12-S1-S7 [Link]

    • Hanjun Shin, Ki Hoon Kim, Chihwan Song, Injoon Lee, Kyubum Lee, Jaewoo Kang, Yoon Kyoo Kang*: Electrodiagnosis support system for localizing neural injury in an upper limb. Journal of the American Medical Informatics Association 05/2010; 17(3). DOI:10.1136/jamia.2009.001594 [Link]

    • Sunwon Lee, Kyubum Lee, Jaewoo Kang*, Jaehoon Choi, Junho Oh: Trends in Personalized Medicine Research. Communications of the Korean Institute of Information Scientists and Engineers, Vol. 29, Issue 4, Pages 19-25, Apr 2011 [Written in Korean]

        

Conferences / Meetings:

  Proceedings:

  • Chih-Hsuan Wei, Kyubum Lee, Robert Leaman, Zhiyong Lu: Biomedical Mention Disambiguation Using a Deep Learning Approach. The 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB 2019), Niagara Falls, NY; Sept 2019 [Link]
  • Donghyeon Kim†, Sunwon Lee†, Kyubum Lee, Jaehoon Choi, Seongsoon Kim, Minji Jeon, Sangrak Lim, Donghee Choi, Aik-Choon Tan, Jaewoo Kang*: BEST: Next-Generation Biomedical Entity Search Tool for Knowledge Discovery from Biomedical Literature. The 5th Annual Translational Bioinformatics Conference (TBC 2015), Tokyo, Japan; Sept. 2015 († These authors contributed equally to the work.)
  • Kyubum Lee, Sunwon Lee, Minji Jeon, Jaehoon Choi, Jaewoo Kang*: Drug-drug interaction analysis using heterogeneous biological information network. IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2012), Philadelphia, USA; Oct. 2012 [Link]
  • Junkyu Lee, Seongsoon Kim, Sunwon Lee, Kyubum Lee, Jaewoo Kang*: High Precision Rule Based PPI Extraction and Per-Pair Basis Performance Evaluation. ACM sixth international workshop on Data and text mining in biomedical informatics (DTMBIO 2012), Maui, Hawaii, USA; Oct. 2012
  • Kyubum Lee, Sunwon Lee, Jaewoo Kang*: SNP Grouping Method Based on PPI Network Information. The 37th Conference of the Korea Information Processing Society, Apr 2012 [Written in Korean]
  • Taewon Joh, Kyubum Lee, Jaewoo Kang*: Comparative analysis of Biomedical Databases and Text mining Technologies. The 34th Conference of the Korea Information Processing Society, Nov 2010 [Written in Korean]
  • Hojun Kim, Seongyeon Won, Seungwoo Gang, Kyubum Lee, Byounggun Kim, Sunkyu Kim, Jaewoo Kang*: Research on Identifying Mutation-Drug Relationship in Biomedical Literature Using Biomedical Context based pre-trained word embedding. KIPS 2017 Spring, Jeju, Korea; April 2017 [Written in Korean]
  Posters:
  • Kyubum Lee, Chih-Hsuan Wei, Livia Famiglietti, Sylvain Poux, Lionel Breuza, Alan Bridge, Ioannis Xenarios and Zhiyong Lu*: Scaling up data curation using deep learning: An application to literature triage in genomic variation resources. ISMB 2018, Chicago, USA; July 2018 
  • Kyubum Lee, Byounggun Kim, Yonghwa Choi, Sunkyu Kim, Wonho Shin, Sunwon Lee, Sungjoon Park, Seongsoon Kim, Aik Choon Tan and Jaewoo Kang*: Deep learning of mutation-gene-drug relations from the literature for precision medicine. ISMB/ECCB 2017, Prague, Czech Republic; July 2017 (doi: 10.7490/f1000research.1114641.1) [Poster] [Link]
  Talks:
  • Scaling up data curation using machine learning: An application to literature triage in genomic variation resources. CBB Seminar, NCBI, NLM, NIH; Feb. 2019
  • Biomedical Literature Search, Mining and Applications: College of Medicine, Seoul National University; Nov. 2018 [Online talk]
  • Machine-assisted Variant Curation. Biomedical Linked Annotation Hackathon 4 (BLAH4), Kashiwa, Japan; Jan. 2018 [Link]
Education:

Korea University / Data Mining & Information Systems Lab (Advisor: Prof. Jaewoo Kang) - Seoul, Korea

    Ph.D. in Computer Science and Engineering (Data Mining and Machine Learning): September 2012 to February 2017

    M.S. in Computer Science and Bioinformatics (Bioinformatics): September 2010 to August 2012

    B.S. in Computer Science: September 2008 to August 2010

    B.S. in Life Science: March 2002 to August 2008 (On leave: 2003–2005, Military Service)