Assistant professor
Department of Biomedical Informatics & Data Science
School of Medicine
Yale University
Contact: qingyu.chen@yale.edu
We always have open positions and are looking for talented people. Please read this before emailing me. If you have not read it, your email will be redirected to my best friend, Reviewer #2, without a reply :)
Welcome! I'm a tenure-track Assistant Professor in the Department of Biomedical Informatics & Data Science, School of Medicine, a position I started in 2024. Prior to this, I completed my postdoctoral training at the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health. I hold a PhD in Computer Science (Biomedical Informatics) from the University of Melbourne.
My research focuses on data science and artificial intelligence (AI) in biomedicine and healthcare. I have led work spanning data generation, method development, and practical applications. My research interests can be broadly categorized into three areas:
Biomedical Natural Language Processing and Large Language Models
Medical Image and Multimodal Analysis in Healthcare
Downstream Accountability and Trustworthy AI for Medical Applications
I am the Principal Investigator of an R01 grant on improving the factuality of LLMs in medicine (see news) and a K99/R00 grant on multimodal AI-assisted disease diagnosis (see news). I have published over 40 first/last-author papers out of a total of 80+ publications within these research areas. My work has appeared in venues such as Nature, Nature Medicine, Nature Machine Intelligence, Nature Aging, NPJ Digital Medicine, and Nucleic Acids Research, among others.
My research has been recognized with several awards, including the NIH Fellows Award for Research Excellence (twice), AI Talent Scholar (Top 50 in AI across disciplines, selected by Baidu Scholar), NLM Honor Award (twice), and top-ranked performances in biomedical and clinical NLP challenges (four times, three as first author).
I have also taught over 20 courses and mentored more than 10 trainees. My teaching and mentoring have been recognized with the NIH Summer Research Mentor Award (four times) and Excellence in Teaching Awards (twice).
Selected recent work
Outpatient Reception via Collaboration Between Nurses and a Large Language Model: A Randomized Controlled Trial. Nature Medicine, 2024.
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information. Bioinformatics, 2024.
Advancing Entity Recognition in Biomedicine via Instruction Tuning of Large Language Models. Bioinformatics, 2024.
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering. Journal of the American Medical Informatics Association, 2024.
Large Language Models in Biomedical Natural Language Processing: Benchmarks, Baselines, and Recommendations. 2023.
LitCovid in 2022: An Information Resource for the COVID-19 Literature. Nucleic Acids Research, 2023. (LitCovid has been accessed millions of times per month.)
DeepLensNet: Deep Learning Automated Diagnosis and Quantitative Classification of Cataract Type and Severity. Ophthalmology, 2022.
Detecting Visually Significant Cataract Using Retinal Photograph-Based Deep Learning. Nature Aging, 2022.
Predicting Myocardial Infarction Through Retinal Scans and Minimal Personal Information. Nature Machine Intelligence, 2022.
Multimodal, Multitask, Multiattention (M3) Deep Learning Detection of Reticular Pseudodrusen: Toward Automated and Accessible Classification of Age-Related Macular Degeneration. Journal of the American Medical Informatics Association, 2021. (NLM Honor Award)
LitCovid: An Open Database of COVID-19 Literature. Nucleic Acids Research, 2021.
Keep Up with the Latest Coronavirus Research. Nature, 2021.
Large Language Models in Biomedicine: Selected Paper Series of Our Research
Outpatient Reception via Collaboration Between Nurses and a Large Language Model: A Randomized Controlled Trial. Nature Medicine, 2024.
Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health. Briefings in Bioinformatics, 2024.
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information. Bioinformatics, 2024.
Advancing Entity Recognition in Biomedicine via Instruction Tuning of Large Language Models. Bioinformatics, 2024.
Improving Large Language Models for Clinical Named Entity Recognition via Prompt Engineering. Journal of the American Medical Informatics Association, 2024.
Large Language Models in Biomedical Natural Language Processing: Benchmarks, Baselines, and Recommendations. 2023.
Large Language Models and the Retina: A Review of Current Applications and Future Directions. 2023.
Biomedical Foundation Models: Selected Paper Series of Our Research
Me LLaMA: Foundation large language models for medical applications. 2024.
Bioformer: an efficient transformer language model for biomedical text mining. 2023.
MedCPT: Contrastive Pre-trained Transformers with large-scale PubMed search logs for zero-shot biomedical information retrieval. Bioinformatics. 2023.
BioConceptVec: Creating and evaluating literature-based biomedical concept embeddings on a large scale. PLoS Computational Biology. 2020.
BioSentVec: creating sentence embeddings for biomedical texts. IEEE International Conference on Healthcare Informatics. 2019.
BioWordVec, improving biomedical word embeddings with subword information and MeSH. Scientific Data. 2019.
AI in Healthcare: Selected Paper Series of Our Research
A deep network DeepOpacityNet for detection of cataracts from color fundus photographs. Nature Communications Medicine. 2023.
DeepLensNet: Deep Learning Automated Diagnosis and Quantitative Classification of Cataract Type and Severity. Ophthalmology, 2022.
Detecting Visually Significant Cataract Using Retinal Photograph-Based Deep Learning. Nature Aging, 2022.
Predicting Myocardial Infarction Through Retinal Scans and Minimal Personal Information. Nature Machine Intelligence, 2022.
Learning Structure from Visual Semantic Features and Radiology Ontology for Lymph Node Classification on MRI, MLMI, 2021.
Predicting risk of late age-related macular degeneration using deep learning. NPJ digital medicine, 2020.
DeepSeeNet: a deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs, Ophthalmology, 2019.
Experience
Tenure-track assistant professor, Biomedical Informatics & Data Science, Yale School of Medicine, Yale University, 2024-
Incoming tenure-track assistant professor, Biomedical Informatics & Data Science, Yale School of Medicine, Yale University, 2023-2024
Completing my K99 phase and setting up my lab
Research fellow, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 2018-2023
Mentor: Dr Zhiyong Lu
Main awards:
AI Talent Scholar (Top 50 in AI across disciplines), selected by Baidu Scholar, 2022
NIH Fellows Award for Research Excellence (two times), 2021 and 2023
NIH Summer Research Mentor Award (four times), 2020-2023
National Library of Medicine Honor Award, 2019
Top-ranked performance in the National Natural Language Processing Clinical Challenges (N2C2) on clinical semantic textual similarity as the first author, 2019
Top-ranked performance in the BioCreative Challenges on clinical semantic textual similarity as the first author, 2018
Grants:
Principal investigator, K99 LM014024-01, multimodal computer-assisted disease diagnosis
PhD in Computer Science (Biomedical Informatics), School of Computing and Information Systems, the University of Melbourne, 2014-2018
Mentors: Redmond Barry Distinguished Prof Justin Zobel and Prof Karin Verspoor
Main awards:
Recipient of the Microsoft Master Innovator Award
Top-ranked performance in the 2017 BioCreative challenge on protein interaction extraction as the first author
Excellence in Teaching Awards, 2016 and 2018
Honors degree in Computer Science, School of Science, RMIT University, 2013
Mentors: Prof Lin Padgham and Dr Dhirendra Singh
First-class honors; GPA ranked #1 in the degree
Bachelor of Computer Science, School of Science, RMIT University, 2010-2012
GPA ranked #1 in the degree