SGLang: Efficient Execution of Structured Language Model Programs
Lianmin Zheng*, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E Gonzalez, Clark Barrett, Ying Sheng*.
NeurIPS 2024.
Efficient Algorithms for Automated Reasoning and Large Language Models
Ying Sheng
PhD Thesis (It basically repeats my previous papers, but the acknowledgments I wish to express make it unique.)
Clover: Closed-Loop Verifiable Code Generation
Chuyue Sun*, Ying Sheng*, Oded Padon, Clark Barrett.
SAIV 2024.
POPL Dafny Workshop 2024.
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Nikolas Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E Gonzalez, Ion Stoica.
ICML 2024.
Fairness in Serving Large Language Models
Ying Sheng, Shiyi Cao, Dacheng Li, Banghua Zhu, Zhuohan Li, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica.
OSDI 2024.
S-LoRA: Serving Thousands of Concurrent LoRA Adapters
Ying Sheng*, Shiyi Cao*, Dacheng Li, Coleman Hooper, Nicholas Lee, Shuo Yang, Christopher Chou, Banghua Zhu, Lianmin Zheng, Kurt Keutzer, Joseph E Gonzalez, Ion Stoica.
MLSys 2024.
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric Xing, Joseph E Gonzalez, Ion Stoica, Hao Zhang.
ICLR 2024. (Spotlight top~5%)
Judging LLM-as-a-judge with MT-Bench and Chatbot Arena
Lianmin Zheng*, Wei-Lin Chiang*, Ying Sheng*, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric Xing, Hao Zhang, Joseph E Gonzalez, Ion Stoica.
NeurIPS Datasets and Benchmarks, 2023.
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen.
NeurIPS 2023. ICML ES-FoMo Workshop 2023.
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon*, Zhuohan Li*, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph E. Gonzalez, Hao Zhang, Ion Stoica.
SOSP 2023.
FlexGen: High-throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang.
ICML 2023. (Oral top~8%)
Reasoning About Vectors using an SMT Theory of Sequences
Ying Sheng, Andres Nötzli, Andrew Reynolds, Yoni Zohar, David Dill, Wolfgang Grieskamp, Junkil Park, Shaz Qadeer, Clark Barrett and Cesare Tinelli.
International Joint Conference on Automated Reasoning (IJCAR), 2022, part of FLoC 2022. (Nominated for Best Paper)
cvc5: A Versatile and Industrial-Strength SMT Solver
(alphabetically) Haniel Barbosa, Clark Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar.
Best Tool Paper Award at the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS), Springer, 2022.
20th International Workshop on Satisfiability Modulo Theories (SMT 2022), Affiliated with IJCAR 2022, part of FLoC 2022.
Politeness for the Theory of Algebraic Datatypes
Ying Sheng, Yoni Zohar, Christophe Ringeissen, Jane Lange, Pascal Fontaine, Clark Barrett.
Best Paper Award at the International Joint Conference on Automated Reasoning (IJCAR), 2020.
Sister Conferences Best Papers at the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021.
Special Issue of JAR 2022 dedicated to IJCAR 2020.
Distribution-free Junta Testing
(alphabetically) Xi Chen, Zhengyang Liu, Rocco Servedio, Ying Sheng, Jinyu Xie.
Symposium on Theory of Computing (STOC), 2018.
How Long Can Context Length of Open-Source LLMs Truly Promise?
Dacheng Li, Rulin Shao, Anze Xie, Ying Sheng, Lianmin Zheng, Joseph Gonzalez, Ion Stoica, Xuezhe Ma, Hao Zhang.
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following.
On Optimal Caching and Model Multiplexing for Large Model Inference
Banghua Zhu, Ying Sheng, Lianmin Zheng, Clark Barrett, Michael I. Jordan, Jiantao Jiao.
NeurIPS 2023.
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
Zhuohan Li*, Lianmin Zheng*, Yinmin Zhong*, Vincent Liu, Ying Sheng, Xin Jin, Yanping Huang, Zhifeng Chen, Hao Zhang, Joseph E. Gonzalez, Ion Stoica.
OSDI 2023.
Read-once refutations in Horn constraint systems: An algorithmic approach
K. Subramani, Piotr Wojciechowski, Ying Sheng.
Journal of Logic and Computation (JLC), 2022.
Politeness and Stable Infiniteness: Stronger Together
Ying Sheng, Yoni Zohar, Christophe Ringeissen, Andrew Reynolds, Clark Barrett, Cesare Tinelli.
International Conference on Automated Deduction (CADE), 2021.
19th International Workshop on Satisfiability Modulo Theories (SMT 2021), Affiliated with CAV 2021.
Subspace Embedding and Linear Regression with Orlicz Norm
(alphabetically) Alexander Andoni, Chengyu Lin, Ying Sheng, Peilin Zhong, Ruiqi Zhong.
International Conference on Machine Learning (ICML), 2018. (Long Talk)
On the Approximation of Nash Equilibria in Sparse Win-Lose Games
Zhengyang Liu, Ying Sheng.
AAAI Conference on Artificial Intelligence (AAAI), 2018.