I received my Ph.D. degree from Northwestern University, under the supervision of Prof. Ying Wu.
Before my Ph.D. studies, I received my Bachelor's degree from the Department of Automation, Tsinghua University.
My research interests mainly include:
Computer Vision, Multimodality, Vision-Language Models, LLMs, Generative AI
[Google Scholar] [LinkedIn] Email: xingxy0505@gmail.com
2021 - 2025, Ph.D. in Electrical Engineering, advised by Prof. Ying Wu, Northwestern University.
2017 - 2021, B.E. in Automation, Tsinghua University.
2024.06 - 2024.12, Research Scientist Intern, TikTok, US.
- Intelligent Creation Department.
- Topic: Large vision-language model interpretation and fine-tuning.
2024.02 - 2024.06, Student Researcher, Google Research, US.
- Topic: Region-aware fine-tuning for text-to-image generation.
2021.03 - 2021.06, Research Scientist Intern, JD Technology, China.
- Visual Fundamental Research Department.
- Topic: Facial expression driven face animation.
2020.06 - 2021.02, Research Intern, University of Virginia, US.
- Advisor: Prof. Jundong Li
- Topic: Fairness-aware unsupervised feature selection.
2020.06 - 2020.09, Research Intern, University of Washington, US.
- Access Computing Program.
- Topic: Cough verification by sequential modeling.
• National Scholarship for Overall Performance Excellence, Tsinghua University
• Scholarship for Scientific and Technological Innovation Excellence, Tsinghua University
• Scholarship for Outstanding New Student, Northwestern University
• Outstanding Graduate Award, Tsinghua University
• Peer Award for Research Contributions, Google Research
• Selected for the Chinese National College Student Innovation and Entrepreneurship Training Program
• National first prize in the Chinese Undergraduate Physics Competition
• Second prize in AI challenge competition, Tsinghua University
*denotes equal contribution.
Q. Sun, X. Xing, H. Weng, C. Yeum, M. Crowley. "View Invariant Learning for Vision-Language Navigation in Continuous Environments". arXiv preprint.
M. Li, X. Gu, F. Chen, X. Xing, L. Wen, C. Chen, S. Zhu. "SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing". IEEE/CVF International Conference on Computer Vision (ICCV) 2025.
X. Xing, CW. Kuo, F. Li, Y. Niu, F. Chen, M. Li, Y. Wu, L. Wen, S. Zhu. "Where do Large Vision-Language Models Look at when Answering Questions?". arXiv preprint.
X. Xing, A. Saha, J. He, S. Hao, P. Vicol, M. Ryu, G. Li, S. Singla, S. Young, Y. Li, F. Yang, D. Ramachandran. "Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation". IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025, Highlight.
Y. Li, L. Fan, X. Xing, J. Zhou, Y. Wu. "GPVK-VL: Geometry-Preserving Virtual Keyframes for Visual Localization under Large Viewpoint Changes". IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.
X. Xing, P. Xiong, L. Fan, Y. Li, Y. Wu. “Learning to Ask Denotative and Connotative Questions for Knowledge-based VQA”. Findings of the Association for Computational Linguistics: EMNLP 2024.
L. Fan, J. Zhou, X. Xing, Y. Wu. “Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations”. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024.
X. Xing, M. Liang, Y. Wu. “TOA: Task-oriented Active VQA”. Conference on Neural Information Processing Systems (NeurIPS) 2023.
X. Xing, H. Liu, C. Chen, J. Li. “Fairness-Aware Unsupervised Feature Selection”. Proceedings of the ACM International Conference on Information & Knowledge Management (ACM CIKM) 2021.
C. Wang*, X. Xing*, Z. Su, J. Chen. “DCSFN: Deep Cross-scale Fusion Network for Single Image Rain Removal”. Proceedings of the ACM International Conference on Multimedia (ACM MM) 2020.
C. Wang, X. Xing, G. Yao, Z. Su. “Single Image Deraining via Deep Shared Pyramid Network”. The Visual Computer 2021.
Reviewer for NeurIPS 2025
Reviewer and emergency reviewer for CVPR 2025
Reviewer for Transactions on Audio, Speech and Language Processing
Reviewer for ECCV 2024
Reviewer for CVPR 2024
Embedded Artificial Intelligence (2024 winter)
Introduction to Computer Vision (2024 fall)
Special Topics: Computer Vision (2025 spring)