I am an assistant professor in the Department of Biomedical Informatics at Stony Brook University. Prior to that, I was a staff research scientist at the IBM T. J. Watson Research Center in New York from 2016 to 2023, and a researcher at IBM Research-Tokyo from 2015 to 2016. I obtained my Ph.D. from The University of Tokyo, my M.S. from Peking University, and my B.E. from Tsinghua University.
My research interests include machine learning, natural language processing (NLP), and biomedical informatics. In particular, my current research focuses mainly on deep graph learning, and I have also worked on a variety of applications in healthcare and NLP. My profile at Stony Brook University is here for BMI or here for the AI Institute.
(1) Deep Graph Learning: Deep Graph Learning (DGL) is a field of deep learning that analyzes data with graph structures, such as social networks, drug-target interaction networks, and molecules. Over the past few years, I have been dedicated to improving graph neural networks to make them more powerful, efficient, and adaptable to various scenarios. My research on DGL covers (but is not limited to) the following aspects: scalability (FastGCN [ICLR18], IGB [KDD23]), dynamic graph learning (EvolveGCN [AAAI20]), graph generation (Constrained GraphVAE [NeurIPS18], Federated Feature Fusion [UAI23]), graph learning with geometry and topology ([ICLR20, ICML21, ICML22, NeurIPS22]), graph coarsening ([AAAI20, EMNLP21]), and graph pretraining ([TMLR25]).
(2) Healthcare and Biomedical Research: Many applications of graph neural networks have a biomedical context (e.g. drug discovery), so it is natural for a graph learning researcher to do biomedicine-related research. Beyond that, I also work on computational healthcare problems, i.e. EHR (electronic health record) analysis. My collaborators and I have developed many new deep learning models, especially graph neural networks ([AAAI19, IJCAI19]), neuro-symbolic models ([ICDM22, ICLR23]), and time series foundation models ([NeurIPS 24, ICLR 25]), to solve various healthcare prediction problems, e.g. patient phenotyping and disease risk analysis, readmission prediction, and medication recommendation.
(3) Natural Language Processing: I have been working on natural language processing since early in my career. Enabling machines to understand natural languages is critical to achieving real intelligence. My early work in NLP focused on how to generate good document representations and summaries (e.g. [EMNLP 20, EMNLP 21, EMNLP 21 findings, EMNLP 23]). Recently, my focus has shifted to the trustworthiness and transparency of LLMs (e.g. [NAACL 22, TMLR26, ICML 26]) and medical NLP (e.g. [ACL 26 findings]).
For more information, please check my publications or Google Scholar profile.
Information for Collaboration
I am open to collaborations with highly motivated students and researchers with strong machine learning and mathematical backgrounds.
At SBU, I take students from CS, BMI, and AMS. Students from any of these departments are welcome to contact me regarding collaboration opportunities or RA positions. Unfortunately, I do not have RA positions for master's students, but you are welcome to do an independent study (e.g. CSE593) in my lab for credit.