Soojung Yang
KOR name: 양수정
soojungy [at] mit [dot] edu / she/her
Hi, I am a fourth-year PhD candidate in the Computational & Systems Biology program at MIT, advised by Prof. Rafael Gómez-Bombarelli.
My research focuses on modeling protein dynamics by integrating molecular simulations, protein foundation models, and experimental measurements.
LinkedIn / Google Scholar / Twitter
[News!] My paper 'Learning Collective Variables for Protein Folding ...' won the Best Paper Award at the MoML 2024 conference! 🤩
Research
Modeling protein dynamics and conformational ensembles with molecular simulations, foundation models, and experimental data.
2025
Unifying Force Prediction and Molecular Conformation Generation Through Representation Alignment
Lucas Pinede¹, Soojung Yang¹, Juno Nam, Rafael Gómez-Bombarelli
ICML GenBio Workshop, ¹co-first
👉 Training of diffusion-model-based Boltzmann emulators can be accelerated via alignment with pretrained MLIPs.
Learning Collective Variables from Time-lagged Generation
Seonghyun Park, Kiyoung Seong, Soojung Yang, Rafael Gómez-Bombarelli, Sungsoo Ahn
ICML GenBio Workshop
Scalable emulation of protein equilibrium ensembles with generative deep learning
Sarah Lewis¹, Tim Hempel¹, José Jiménez Luna¹, Michael Gastegger¹, Yu Xie¹, Andrew Y. K. Foong¹, Victor García Satorras¹, Osama Abdin¹, Bastiaan S. Veeling¹, Iryna Zaporozhets, Yaoyi Chen, Soojung Yang, Arne Schneuing, Jigyasa Nigam, Federico Barbero, Vincent Stimper, Andrew Campbell, Jason Yim, Marten Lienen, Yu Shi, Shuxin Zheng, Hannes Schulz, Usman Munir, Cecilia Clementi, Frank Noé
Science
👉 I developed PPFT (Property Prediction Fine-Tuning), a method to fine-tune a diffusion model with ensemble statistics.
Transferable Learning of Reaction Pathways from Geometric Priors
Juno Nam, Miguel Steiner, Max Misterka, Soojung Yang, Avni Singhal, Rafael Gómez-Bombarelli
arXiv
2024
Probing the Embedding Space of Protein Foundation Models through Intrinsic Dimension Analysis
Soojung Yang, Juno Nam, Tynan Perez, Jinyeop Song, Xiaochen Du, Rafael Gómez-Bombarelli
NeurIPS AIDrugX Workshop
👉 Protein foundation models are aligned with each other, across modalities and architectures.
Flow Matching for Accelerated Simulation of Atomic Transport in Materials
Juno Nam, Sulin Liu, Gavin Winter, KyuJung Jun, Soojung Yang, Rafael Gómez-Bombarelli
arXiv
Soojung Yang¹, Juno Nam¹, Johannes C. B. Dietschreit, Rafael Gómez-Bombarelli
JCTC, ¹co-first
👉 Geodesic interpolations can serve as a prior for collective variables (CVs) in accelerated MD simulations.
2023
Regularized indirect learning improves phage display ligand discovery
Joseph S. Brown, Yitong Tseo, Michael A. Lee, Jeffrey Y.-K. Wong, Soojung Yang, Yehlin Cho, Chae Rin Kim, Andrei Loas, Ratmir Derda, Rafael Gómez-Bombarelli, Bradley L. Pentelute
ChemRxiv
Chemically Transferable Generative Backmapping of Coarse-Grained Proteins
Soojung Yang, Rafael Gómez-Bombarelli
ICML
👉 Protein local fluctuations can be transferably learned from a structural ensemble database.
2022
Seokhyun Moon¹, Wonho Zhung¹, Soojung Yang¹, Jaechang Lim, Woo Youn Kim
Chemical Science, ¹co-first
2021
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
Soojung Yang, Doyeong Hwang, Seul Lee, Seongok Ryu, Sung Ju Hwang
NeurIPS
2020
Comprehensive Study on Molecular Supervised Learning with Graph Neural Networks
Doyeong Hwang, Soojung Yang, Yongchan Kwon, Kyung Hoon Lee, Grace Lee, Hanseok Jo, Seyeol Yoon, Seongok Ryu
J Chem Inf Model
A comprehensive study on the prediction reliability of graph neural networks for virtual screening
Soojung Yang, Kyung Hoon Lee, Seongok Ryu
arXiv
Community Service
Program chair and organizer of the Generative and Experimental Perspectives for Biomolecular Design (GEM) workshop at ICLR 2024 in Vienna, Austria, and ICLR 2025 in Singapore.
Invited Talks & Poster Presentations
Learning Collective Variables for Protein Folding with Labeled Data Augmentation through Geodesic Interpolation
MIT Computational & Systems Biology Student Seminar (Feb 2024)
Harvard Medical School Debora Marks Lab Invited Seminar (Mar 2024)
IMSI Workshop “Learning Collective Variables and Coarse Grained Models” (Apr 2024)
EPFL Michele Ceriotti Lab Invited Seminar (Aug 2024)
MoML Best Paper Award (Nov 2024)
Chemically Transferable Generative Backmapping of Coarse-Grained Proteins
Learning on Graphs and Geometry (LoG) Reading Group (Apr 2023)
POSTECH ML Learning Group Seminar (Apr 2023)
Flagship Pioneering Invited Company Seminar (May 2023)
CECAM/Psi-k Conference “Bridging length scales with machine learning: from wavefunctions to thermodynamics” Invited Talk/Panel Discussion (May 2023)
ICML Poster (Jul 2023)
Boston Protein Design and Modeling Club Seminar (Dec 2023)
Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation
KAIST AI Student Colloquium (Oct 2021)
NeurIPS Poster (Dec 2021)
Experience
Microsoft Research AI4Science
Berlin, Germany (Jun 2023 - Aug 2023)
Research Intern (Mentor: Yu Xie, PI: Frank Noé)
AITRICS
Seoul, Korea (Aug 2020 - Jun 2021)
ML Researcher
ACE (Advanced Computational Engine) Team, Dept. of Chemistry, KAIST
Daejeon, Korea (Jun 2018 - Aug 2020)
Undergraduate Researcher (supervised by Prof. Woo Youn Kim)
Wellman Center for Photomedicine, MGH
Boston, MA, U.S. (Jun 2019 - Aug 2019)
Research Intern (supervised by Prof. Walfre Franco)
Selective Honors
Takeda Fellowship, 2023-2024
D. E. Shaw Research Graduate Women's Fellowship, 2023
Ilju Foundation Scholarship for Ph.D. Studies, 2021-2025