Publications
Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions
Yihao Xue, Jiping Li, Baharan Mirzasoleiman
ICML, 2025
Few-shot Adaption to Distribution Shifts By Mixing Source and Target Embeddings
Yihao Xue, Ali Payani, Yu Yang, Baharan Mirzasoleiman
ICML, 2024
Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise
Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
UAI, 2024 (Spotlight)
Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift
Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman
ICLR, 2024
Investigating the Benefits of Projection Head for Representation Learning
Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman
ICLR, 2024
Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan Mirzasoleiman
ICML, 2023 (Oral Presentation, 2.37%)
[project page]
Investigating Why Contrastive Learning Benefits Robustness Against Label Noise
Yihao Xue, Kyle Whitecross, Baharan Mirzasoleiman
ICML, 2022
Preprints
LoRA is All You Need for Safety Alignment of Reasoning LLMs
Yihao Xue, Baharan Mirzasoleiman
[code]
Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection
Yihao Xue, Kristjan Greenewald, Youssef Mroueh, Baharan Mirzasoleiman
Towards Mitigating Spurious Correlations in the Wild: A Benchmark & a more Realistic Dataset
Siddharth Joshi, Yu Yang, Yihao Xue, Wenhan Yang, Baharan Mirzasoleiman