Email: xingang2 at illinois dot edu
I am a Ph.D. student in the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign, advised by Prof. Bin Hu. I received my master's degree in Electrical and Computer Engineering from the CEMSE Division at King Abdullah University of Science and Technology, under the supervision of Prof. Meriem Laleg-Kirati.
My current research interests lie in large language models, reinforcement learning, and optimization, focusing on evaluation benchmarks and post-training techniques.
10/2025: We released VisualToolBench, the first multimodal tool-use benchmark for evaluating models' ability to think with images.
09/2025: Thrilled to share that our paper EngDesign has been accepted to the NeurIPS 2025 Datasets & Benchmarks Track. See you in San Diego!
08/2025: I joined Anuttacon as an applied LLM research scientist intern.
05/2025: I joined Scale AI as a research intern this summer.
10/2024: Introducing DynaMath: a dynamic visual math benchmark for evaluating the mathematical reasoning robustness of vision-language models (VLMs). DynaMath transforms 501 seed questions (each represented as a Python program) into limitless concrete problems to test VLM robustness. We show a significant performance drop when testing VLMs across different variants of the same DynaMath question. [project] [github] [paper]
10/2024: Announcing the release of our paper "ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise". In this work, we present a fully automated controller design system that leverages the power of large language models (LLMs) and domain expertise. ControlAgent is able to emulate the iterative design process of a control engineer and improve its previous designs gradually. [paper] [github]
07/2024: I joined Meta Reality Labs as a Research Scientist Intern.
05/2024: Our COLD-Attack paper has been accepted to ICML 2024. This marks my first publication on Large Language Models (LLMs)!
04/2024: Our work "Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra" is now online. In this work, we introduce the first college-level Control system problem-solving Benchmark (ControlBench) for assessing the capabilities of leading LLMs. [Paper] [Website]
03/2024: I am honored to receive the Hong, McCully, and Allen Fellowship from the Department of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign (UIUC).
01/2024: Our paper "COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability" has been posted online (joint work with Fangxu Yu, Huan Zhang, Lianhui Qin, and Bin Hu). In this paper, we develop the COLD-Attack framework, which connects two research areas: controllable text generation in NLP and (controllable) LLM jailbreaking in AI safety. [Paper] [Website] [Code]
12/2024: Our paper "Model-Free μ-Synthesis: A Nonsmooth Optimization Perspective" has been posted online. [Paper]
09/2023: Our paper "Complexity of Derivative-Free Policy Optimization for Structured H-infinity Control" has been accepted at NeurIPS 2023. In this work, we provide the first sample complexity results of derivative-free policy optimization for the H-infinity control problem. [Paper]
04/2023: I passed my Ph.D. preliminary examination! Grateful to my advisor Prof. Bin Hu and committee members Prof. Tamer Başar, Prof. Srikant Rayadurgam, Prof. Geir E Dullerud, and Prof. Jeff Shamma for their support.
03/2023: I am giving an invited talk on the direct policy search for state-feedback H-infinity robust control synthesis at UCSD SOC Lab.
09/2022: Our paper "Global Convergence of Direct Policy Search for State-Feedback H-infinity Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential" has been accepted by NeurIPS 2022. [Paper] [Code]
04/2022: Our new paper "Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation" has been posted online. [Paper]
02/2022: Our paper on convex programs and Lyapunov functions for reinforcement learning has been accepted to ACC 2022. [Paper]
05/2021: I have passed my Ph.D. qualifying exam @ UIUC!
Guo, X., Utkarsh Tyagi, Advait Gosai, Paula Vergara, Ernesto Gabriel Hernández Montoya, Chen Bo Calvin Zhang, Bin Hu, Yunzhong He, Bing Liu, Rakshith Sharma Srinivasa. Beyond Seeing: Evaluating Multimodal LLMs on Tool-Enabled Image Perception, Transformation, and Reasoning. [Paper][Leaderboard]
Guo, X., Li Y., Kong X., et al. Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs. NeurIPS, 2025. [Paper][Project Page]
Zou, C.*, Guo, X.*, Yang, R.*, Zhang, J., Hu, B. and Zhang, H. DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models. ICLR, 2025. [Paper][Project Page]
Syed, U., Light, E., Guo, X., Zhang, H., Qin, L., Ouyang, Y. and Hu, B., Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors. arXiv preprint arXiv:2408.08302. [Paper] [Project Page]
Keivan, D., Guo, X., Seiler, P., Dullerud, G., and Hu, B. (2024). Model-Free μ-Synthesis: A Nonsmooth Optimization Perspective. arXiv preprint arXiv:2402.11654.
Guo, X., Yu, F., Zhang, H., Qin, L., and Hu, B. (2024). COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability. ICML, 2024.
Guo, X., Keivan, D., Dullerud, G., Seiler, P., and Hu, B., 2023. Complexity of Derivative-Free Policy Optimization for Structured H-Infinity Control, NeurIPS 2023.
Guo, X. and Hu, B., 2022. Global Convergence of Direct Policy Search for State-Feedback H-infinity Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential. NeurIPS 2022.
Guo, X. and Hu, B., 2022. Exact Formulas for Finite-Time Estimation Errors of Decentralized Temporal Difference Learning with Linear Function Approximation. arXiv preprint arXiv:2204.09801.
Guo, X. and Hu, B., 2022. Convex Programs and Lyapunov Functions for Reinforcement Learning: A Unified Perspective on the Analysis of Value-Based Methods. arXiv preprint arXiv:2202.06922.
Guo, X., Hong, P.Y. and Laleg-Kirati, T.M., 2021. Calibration and validation for a real-time membrane bioreactor: A sliding window approach. Journal of Process Control.
Guo, X., Albalawi, F. and Laleg-Kirati, T.M., 2020. Observer-based Economic Model Predictive Control for Direct Contact Membrane Distillation. Chemical Engineering Research and Design.
Guo, X., Albalawi, F., N'Doye, I. and Laleg-Kirati, T.M. State Estimation in Direct Contact Membrane Distillation based Desalination Using Nonlinear Observer. Control Methods for Water Resource Systems IFAC, 2019.
Guo, X., Albalawi, F. and Laleg, M., 2019, July. Model Predictive Control Paradigms for Direct Contact Membrane Desalination Modeled by Differential Algebraic Equations. In 2019 American Control Conference (ACC) (pp. 5595-5601). IEEE.
Albalawi, F., Chahid, A., Guo, X., Albaradei, S., Magana-Mora, A., Jankovic, B.R., Uludag, M., Van Neste, C., Essack, M., Laleg-Kirati, T.M. and Bajic, V.B., 2019. Hybrid model for efficient prediction of poly(A) signals in human genomic DNA. Methods.
Guo, X., 2019. Model Predictive Control and State Estimation for Membrane-based Water Systems (Master's thesis).
Al-Alwan, A., Guo, X., N'Doye, I. and Laleg-Kirati, T.M., 2017, August. Laser beam pointing and stabilization by fractional-order PID control: Tuning rule and experiments. In 2017 IEEE Conference on Control Technology and Applications (CCTA) (pp. 1685-1691). IEEE.
Hong, McCully, and Allen Fellowship, ECE UIUC, 2024-2025
Mavis Future Faculty Fellowship (MF3), UIUC, 2023
Student Travel Award: ACC, NeurIPS, MWCGT
National Scholarship, 2014, 2015 (Top 1%)
Grand Prize of Siemens Cup Intelligent Manufacturing Challenge (Top 3)
Journal Reviewer: IEEE Transactions on Automatic Control, Automatica, IEEE System & Control Letters, Journal of Optimization Theory and Applications, International Journal of Robust and Nonlinear Control, Journal of Hazardous Materials
Conference Reviewer: NeurIPS, ICML, ICLR, ACC, CDC, IFAC WC