Bocheng Li [GitHub] [Google Scholar]
Ph.D. Student in School of Computer Science
State Key Laboratory of Cognitive Intelligence
University of Science and Technology of China (USTC)
Email: bcli (at) mail.ustc.edu.cn
About Me
I am a third-year Ph.D. student at the University of Science and Technology of China (USTC), advised by Prof. Linli Xu. Prior to that, I received my B.Eng. in Computer Science from the same institution in 2023. My recent research focuses on diffusion models for discrete data (COLING 2024, ACL 2025) and representation learning for unified generative models (NeurIPS 2024, ICCV 2025).
Aside from my work, I spent a wonderful year with my friends at USTCLUG.
News
[July 2025] IDA-MoE is accepted to ACM MM 2025! 🎉
[June 2025] SimVQ is accepted to ICCV 2025! 🎉
[May 2025] NeoDiff is accepted to ACL 2025 main conference (oral presentation, 243 out of 3092 accepted papers)! 🎉
Publications
(* = equal contribution)
2025:
Input Domain Aware MoE: Decoupling Routing Decisions from Task Optimization in Mixture of Experts
Yongxiang Hua, Haoyu Cao, Zhou Tao, Bocheng Li, Zihao Wu, Chaohu Liu and Linli Xu
In Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM 2025)
An MoE routing framework that uses a probabilistic mixture model to partition the input space, achieving superior expert specialization, load balancing, and task performance over standard sparse MoE (sMoE) approaches.
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer [arXiv] [Code] [Post on kexue.fm] [@lucidrain's VQ Repo]
Yongxin Zhu, Bocheng Li, Yifei Xin and Linli Xu
In Proceedings of the International Conference on Computer Vision 2025 (ICCV 2025)
A novel method that reparameterizes the code vectors through a linear transformation layer over a learnable latent basis, resolving the representation collapse problem in VQ models.
Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes [arXiv] [ACL Anthology] [Code]
Bocheng Li*, Zhujin Gao* and Linli Xu
In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), Oral Presentation
A text diffusion model with fine-grained noising control, a context-aware reverse process, and an optimized noising schedule; it outperforms several discrete/continuous diffusion and NAR baselines on conditional text generation.
2024:
Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective [arXiv] [Code]
Yongxin Zhu, Bocheng Li, Hang Zhang, Xin Li, Linli Xu and Lidong Bing
In Proceedings of the Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)
An autoregressive generative model that performs next-token prediction in an abstract latent space derived from self-supervised learning (SSL) models, achieving state-of-the-art performance for autoregressive image generation and understanding on ImageNet.
Few-shot Temporal Pruning Accelerates Diffusion Models for Text Generation [ACL Anthology] [Code]
Bocheng Li, Zhujin Gao, Yongxin Zhu, Kun Yin, Haoyu Cao, Deqiang Jiang and Linli Xu
In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024)
Straightforward Bayesian optimization that accelerates Multinomial Diffusion, Absorbing Diffusion, and DiffuSeq by up to 400x in less than one minute.
Experience
Research Intern, Tencent YouTu Research, March 2023 - October 2023
President of USTC Linux User Group, May 2022 - May 2023
Awards
USTC-Suzhou Industrial Park Scholarship, November 2024
Excellence Award (Advanced Game Theory Algorithm, Top 10%), Tencent AIArena National Open Competition for Artificial Intelligence, December 2023
USTC Outstanding Thesis Award (Bachelor Thesis; 83 out of 1860), June 2023
Teaching Assistant
Basics of Artificial Intelligence, Spring 2023 and Spring 2024 (USTC)
Computer Systems: A Programmer's Perspective (CS:APP), Spring 2022 (USTC)
Services
Reviewer: ACL ARR, NeurIPS 2024/2025, ICLR 2025