Jason Xiaotian Dou






Hi, היי, Bonjour, Hola, 你好!  

Email: jasondpku@gmail.com; xdou@mgh.harvard.edu

Wechat: JasonDouProfessional

I am the Founder of Marbella, which provides customized (Generative) AI solutions to companies in traditional industries. Marbella is currently incubated at the Harvard Innovation Labs, and here is a recent pitch!

I have been a Postdoc Research Fellow in AI at Harvard University. Most recently I completed a Ph.D. in Computer Engineering at the University of Pittsburgh, working on Machine Learning with applications in Healthcare, Biology, Mobile, Finance, Social Science, and Operations.

Previously I received my B.S. in Computer Science from Peking University, with a thesis at Carnegie Mellon University. I also did a Master's study at Cornell University.

I am looking for full-time opportunities in both academia and industry (Tech, Finance, Healthcare, etc.). Let me know if you have any exciting opportunities!

My current research interests lie in Representation Learning, Computational Oncology, Knowledge Graphs, Causal Machine Learning, Clinical Decision Support, Reinforcement Learning, Artificial Intelligence of Things (AIoT), and Multimodal Learning. I also have a keen interest in technology commercialization and entrepreneurship.

I love to collaborate with people from diverse backgrounds, so feel free to get in touch!


Super excited to join the new AI for Mental Health Venture at Harvard!

I defended my dissertation "Learning Effective Representation Efficiently: Models, Applications, and Metrics" successfully in July 2023!

The paper "Retrieving Knowledge of Molecular Regulatory Mechanisms from PubMed Titles via an Event Extraction Approach" was accepted by the IEEE International Conference on Biomedical and Health Informatics (BHI'23)!

The paper "Recent Advances, Applications and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2022 Symposium" was posted online!

Served as a reviewer for KDD 2023 and ICCV 2023.

I passed my Ph.D. thesis proposal titled "Learning Effective Representation Efficiently: Models, Applications, and Metrics"!

The paper "The Measurement of Knowledge in Knowledge Graphs" was accepted by R2HCAI: The AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI!

The paper "Learning More Effective Cell Representations Efficiently" was accepted as a poster by the NeurIPS 2022 Workshop on Learning Meaningful Representations of Life (LMRL)!

Served as the junior chair for the roundtable discussion "How to effectively integrate multiple data sources (e.g., EHR, images, genomics) for ML applications in healthcare?" at ML4H 2022.

The paper "Demystify the Gravity Well in the Optimization Landscape" was accepted by the AAAI 2023 SA track!

Won the "Going Face-to-Facetime" Track at Pitt Challenge as Team Jet. We designed and built CloverBot, an AI emotional support chatbot, in 6 hours!

The paper "Towards Cross-Modal Causal Structure and Representation Learning" (poster) was accepted by Machine Learning for Health 2022. Kudos to my friend and coauthor Haiyi Mao!

The poster "A Machine Learning Approach to Lung Cancer Treatment Trajectory Analysis After Immunotherapy" was presented at the UPMC Hillman Cancer Center 2022 Annual Scientific Retreat.

Served as a reviewer for CVPR 2023, AISTATS 2023, Journal of Combinatorial Optimization, the NeurIPS 2022 AI4Science Workshop, and the NeurIPS 2022 MetaLearn Workshop.

The poster "Enhance 'Similar' Cell Identification Through Optimal Transport" was presented at the 3rd Center for Systems Immunology Annual Retreat.

The paper "Serological profiling using an Epstein-Barr virus mammalian expression library identifies EBNA1 IgA as a pre-diagnostic marker for nasopharyngeal carcinoma" was published in Clinical Cancer Research!

The paper "Sampling Through the Lens of Sequential Decision Making" was posted on arxiv.org.

The abstract "Retrieving Knowledge of Molecular Mechanisms from Literature Titles via an Event Extraction Approach" was presented at ICIBM 2022.

Joined the ASCO journals reviewer trainee program for JCO Clinical Cancer Informatics.

The paper "COEM: Cross-Modal Embedding for MetaCell Identification" (poster) was published and presented at the ICML 2022 Computational Biology Workshop.

Served as a program committee member for the 26th UK Conference on Medical Image Understanding and Analysis.

Got second place in the first CMU-Pitt Computational Biology Hackathon for the project OceanCells!

Served as a reviewer for NeurIPS 2022 and the NeurIPS 2022 Datasets and Benchmarks track.

Served as a reviewer for ECCV 2022, one of the top computer vision conferences, and KDD 2022, the top data mining conference.

Served as a reviewer for ICML 2022, the leading conference in Machine Learning :).

Served as a reviewer for CVPR 2022, the leading conference in Computer Vision.

The paper on Wasserstein Metric Learning (poster) was published in the SA track of AAAI 2022, the leading Artificial Intelligence conference.

Did a Machine Learning in Dentistry internship at overjet.ai in summer 2021.

I joined the University of Pittsburgh as a Ph.D. student in Machine Learning.