Kyubum "Kyu" Lee, PhD

Biomedical AI/ML Scientist

Principal Data Scientist at Amgen

Thousand Oaks, California, USA

E-mail: KYUBUMLEE [at] gmail [dot] com

Summary


Experience / Education

Next-generation clinical trial design and analysis

- Improving the efficiency of the clinical trial and drug development process using ML/NLP

- Clinical trial text analysis using LLMs and BERT-based methods

- Develop a chatbot for clinical trial data and literature


• Develop deep learning methods for oral and lung cancer histopathology image analysis

- Deep learning-based (Mask R-CNN) image analysis method for understanding tumor heterogeneity and tumor microenvironment

• Comprehensive ORAL cancer Explorer (CORALE) project: develop a multi-omics clinicopathological data web portal

- Processed 62 multi-omics datasets: aligned 9 RNA-seq datasets, normalized 53 microarray datasets, and normalized clinical features

- Developed bioinformatics analysis and visualization tools 

• Participate in oral and lung cancer-related biomedical informatics analysis projects

• Orchestrated a research project using deep learning for biomedical literature triage – Collaboration with the European Bioinformatics Institute (UK) and the Swiss Institute of Bioinformatics (Switzerland): reduced the manual workload by nearly 70% using machine learning and NLP

• Develop a web-based machine learning platform that provides tools for classifying biomedical literature

• Conduct analyses of publications on precision health to identify human genes of translational value – Collaboration with the Centers for Disease Control and Prevention (CDC)

• Participated in designing the search engine LitVar, which retrieves genomic variants from biomedical documents in PubMed and PMC

• Orchestrated a research project on building a cancer mutation knowledge base (VarDrugPub), using deep learning to extract gene-drug-disease-mutation relations from biomedical literature

• Mentoring graduate/undergraduate students

• Built the Biomedical Entity Relation Corpus (BRONCO), which contains gene-drug-disease-mutation relations found in biomedical texts

• Extracted gene-drug relations from biomedical texts using NLP tools to create the Drug Signatures Database (DSigDB)

• Collaborated cross-functionally with researchers in translational bioinformatics and cancer biology, providing advice on biomedical projects

• PhD Thesis: Text mining approaches for knowledge extraction from biomedical literature

• VarDrugPub: Extracting Gene-Variant-Drug information from biomedical literature – Web service / Database

• HiPub: An application that translates biomedical texts to networks – Chrome extension

• ChimerDB 3.0 (ChimerPub): A database for fusion genes from biomedical literature – Web service / Database

• M.S. Thesis: Drug-drug interaction analysis using heterogeneous biological information network


Skills / Experience


Links for Recent Projects

LitSuggest: A Web-based System for Literature Recommendation and Curation using Machine Learning (Co-First Author / Developed the ML core and data processing components) [Link] [Publication in Nucleic Acids Research]

CORALE: Comprehensive ORAL cancer Explorer (First Author / Data processing and analysis tool development) [Link] [Publication in progress]

Literature Triage using Machine Learning (First Author) [GitHub] [Publication in PLOS Computational Biology]

VarDrugPub: Variant-Gene-Drug relations Database (First Author) [Link] [Publication in BMC Bioinformatics] 

TIMEx: Tumor-immune microenvironment deconvolution web-portal for bulk transcriptomics using pan-cancer scRNA-seq signatures (Participated in data processing) [Link] [Publication in Bioinformatics]

LitVar: Search Engine for Genomic Variants in PubMed and PMC [Link] [Publication in Nucleic Acids Research]

HiPub: An application that translates biomedical texts to networks (First Author) [Link] [Publication in Bioinformatics] [AltMetric]

BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations (First Author) [Link] [Publication in Database]

ChimerDB 3.0 (Co-First Author / Built a fusion gene DB using text mining: ChimerPub) [Link] [Wikipedia] [Publication in Nucleic Acids Research] [PDF]

BEST: Biomedical Entity Search Tool (Actively Involved) [Link] [Publication in PLOS ONE]

BEReX: Biomedical Entity-Relationship eXplorer (Actively Involved) [Link] [Publication in Bioinformatics]

DSigDB: Drug SIGnatures DataBase (Involved in building text-mined drug-gene sets) [Link] [Publication in Bioinformatics]


Other Links:

Google Scholar

ResearchGate

PubMed

LinkedIn


Awards:


Publications:

Journals:

Conferences / Meetings:

Proceedings:

Posters:

Talks:


Education:

Korea University / Data Mining & Information Systems Lab (Advisor: Prof. Jaewoo Kang) - Seoul, Korea

    Ph.D. in Computer Science and Engineering (Data Mining and Machine Learning): September 2012 to February 2017

    M.S. in Computer Science and Bioinformatics (Bioinformatics): September 2010 to August 2012

    B.S. in Computer Science: September 2008 to August 2010

    B.S. in Life Science: March 2002 to August 2008 (On leave: 2003–2005, Military Service)