Xiangzhe (Alex) Xu
Computer Science
Purdue University
I'm Xiangzhe (Alex) Xu. I'm a Ph.D. student at Purdue University advised by Prof. Xiangyu Zhang. Before that, I obtained my B.Eng. from Nanjing University.
I believe the future of code intelligence lies in the synergy between cutting-edge machine learning innovations and deep domain-specific insights.
My research is dedicated to architecting the next generation of trustworthy code models by infusing domain knowledge into every stage—from data synthesis [ProSec] and data quality enhancement [DiEmph] to revolutionary architecture design [CodeArt] and pre-training techniques [Nova, ReSym].
Beyond these foundational models, I am passionate about creating intelligent, agent-driven code reasoning systems. By leveraging advanced strategies in task decomposition [LLMDFA], hallucination mitigation [LLMSAN], and repo-level analysis [RepoAudit], along with domain-specific alignment [GenNm] and multi-modal reasoning that bridges executable code, source code, and natural language [ProRec], my work aspires to empower machines to understand and analyze code with better accuracy and contextual awareness.
Moreover, I extend these innovations to traditional program analysis, where I develop robust formal verification [CompCertELF, CSLED], innovative semantic formulation [PEM, StateLifter, Arcturus], advanced debugging methodologies [CPC, ParDiff], and precise root cause analysis [ROCAS].
Through this blend of theoretical rigor and practical application, I am excited to push the boundaries of software engineering and code language modeling, paving the way for smarter, more reliable development & quality assurance tools in tomorrow’s software/AI-ware landscape.
[07/25] 🎖 Our team PurCL won 1st Place in the Amazon Nova AI Challenge
Here's our code: https://github.com/PurCL/ASTRA
[06/25 - 09/25] I am interning at Microsoft Research@Redmond this summer
ProSec: Fortifying Code LLMs with Proactive Security Alignment
Xiangzhe Xu*, Zian Su*, Jinyao Guo, Kaiyuan Zhang, Zhenting Wang, Xiangyu Zhang. ICML'2025. PDF
DiEmph: Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction Deemphasis
Xiangzhe Xu, Shiwei Feng, Yapeng Ye, Guangyu Shen, Zian Su, Siyuan Cheng, Guanhong Tao, Qingkai Shi, Zhuo Zhang, and Xiangyu Zhang. ISSTA’2023. PDF
CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking
Zian Su, Xiangzhe Xu, Ziyang Huang, Zhuo Zhang, Yapeng Ye, Jianjun Huang, Xiangyu Zhang. FSE’2024. PDF
ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries
Danning Xie, Zhuo Zhang, Nan Jiang, Xiangzhe Xu, Lin Tan, and Xiangyu Zhang. CCS'2024. 🎖 ACM SIGSAC Distinguished Paper Award PDF
Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
Nan Jiang, Chengxiao Wang, Kevin Liu, Xiangzhe Xu, Lin Tan, Xiangyu Zhang, Petr Babkin. ICLR'2025. PDF
GenNm: Symbol Preference Aware Generative Models for Recovering Variable Names from Stripped Binary
Xiangzhe Xu, Zhuo Zhang, Zian Su, Ziyang Huang, Shiwei Feng, Yapeng Ye, Nan Jiang, Danning Xie, Siyuan Cheng, Lin Tan, Xiangyu Zhang. NDSS'2025. PDF
RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing
Jinyao Guo, Chengpeng Wang, Xiangzhe Xu, Zian Su, Xiangyu Zhang. Arxiv. PDF
LLMDFA: Analyzing Dataflow in Code with Large Language Models
Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiaoheng Xie, Xiangyu Zhang. NeurIPS’2024. PDF
LLMSAN: Sanitizing Large Language Models in Bug Detection with Data-Flow
Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiangyu Zhang. EMNLP’2024. PDF
ProRec: Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases
Zian Su, Xiangzhe Xu, Ziyang Huang, Kaiyuan Zhang, Xiangyu Zhang. NeurIPS'2024. PDF
ROCAS: Root Cause Analysis of Autonomous Driving Accidents via Cyber-Physical Co-mutation
Shiwei Feng, Yapeng Ye, Qingkai Shi, Zhiyuan Cheng, Xiangzhe Xu, Siyuan Cheng, Hongjun Choi, Xiangyu Zhang. ASE'2024. 🎖 ACM SIGSOFT Distinguished Paper Award PDF
ParDiff: Practical Static Differential Analysis of Network Protocol Parsers
Mingwei Zheng, Qingkai Shi, Xuwei Liu, Xiangzhe Xu, Le Yu, Congyu Liu, Guannan Wei, Xiangyu Zhang. OOPSLA'2024. 🎖 ACM SIGPLAN Distinguished Paper Award PDF
PEM: Representing Binary Program Semantics for Similarity Analysis via A Probabilistic Execution Model
Xiangzhe Xu*, Zhou Xuan*, Shiwei Feng, Siyuan Cheng, Yapeng Ye, Qingkai Shi, Guanhong Tao, Le Yu, Zhuo Zhang, Xiangyu Zhang. FSE’2023. PDF
StateLifter: Extracting Protocol Format as State Machine via Controlled Static Loop Analysis
Qingkai Shi, Xiangzhe Xu, Xiangyu Zhang. USENIX Security’2023. PDF
ARCTURUS: Full Coverage Binary Similarity Analysis with Reachability-guided Emulation
Anshunkang Zhou, Yikun Hu, Xiangzhe Xu, Charles Zhang. TOSEM'2023. PDF
CSLED: Automatic Generation and Validation of Instruction Encoders and Decoders
Xiangzhe Xu, Jinhua Wu, Yuting Wang*, Zhenguo Yin and Pengfei Li. CAV’2021. PDF
CompCertELF: Verified Separate Compilation of C Programs into ELF Object Files
Yuting Wang, Xiangzhe Xu, Pierre Wilke, Zhong Shao. OOPSLA’2020. PDF
CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis
Juan Zhai, Xiangzhe Xu, Yu Shi, Guanhong Tao, Minxue Pan, Shiqing Ma, Lei Xu, Weifeng Zhang, Lin Tan, Xiangyu Zhang. ICSE'2020. PDF
Jun. 2025 – Sep. 2025, Research intern at Microsoft Research. Advisors: Qianhui Wu, Hamidreza Saghir, Marc-Alexandre Côté, Tong Wang, Kiran Lakkaraju, Michael Albada. Microsoft Research
Apr. 2021 – Aug. 2021, Research assistant on binary program analysis. Advisor: Charles Zhang. HKUST
Sep. 2020 – Feb. 2021, Research assistant on program verification. Advisor: Yuting Wang. SJTU
May 2020 – Aug. 2020, Intern on automatic differentiation. Advisor: Hao Chen. ByteDance AI Lab
Dec. 2019 – Mar. 2020, Research assistant on program analysis. Advisor: Xiangyu Zhang. Purdue University
Jul. 2019 – Oct. 2019, Research intern on program verification. Advisor: Zhong Shao. Yale University
Jul. 2018 – Jun. 2019, Research intern on program analysis. Advisors: Minxue Pan, Juan Zhai. Nanjing University
1st Place in Amazon Nova AI Challenge ($250,000), 2025
Amazon Trusted AI Challenge Research Grant ($250,000), 2024
1st Place in AutoDriving CTF at DEFCON30 (among 110 global teams), 2022
Building trustworthy AI coding systems through agentic red-teaming, May 2025, TrustNLP
Scaling security expertise with AI-driven systems, Apr 2025, RIT; Mar 2025, Microsoft
Harnessing domain expertise to elevate post-training data quality, Mar 2025, Meta
Understanding programs when symbols are lacking, Nov 2024, UMass Amherst
An agentic red-teaming framework, Nov 2024, Amazon
Inference-time scaling for code reasoning tasks, Oct 2024, Purdue University
Incorporating program analysis insights into code models, Apr 2024, UMass Amherst
Introduction to code language models, Nov 2023, Purdue University
ARR 2025 Feb, 2025 May
NeurIPS 2025
EXPlainable and REliable Software Systems (EXPRESS) 2025
ACM Transactions on Software Engineering and Methodology (TOSEM)
IEEE Internet of Things Journal (IoTJ)
The Computer Journal (COMPJ)
International Symposium on the Foundations of Software Engineering (FSE), 2020
International Conference on Automated Software Engineering (ASE), 2023, 2024
International Conference on AI Engineering – Software Engineering for AI (ICSE-CAIN), 2022, 2023, 2024
ACM Conference on Computer and Communications Security (CCS), 2022, 2023, 2024
International Conference on Software Engineering (ICSE), 2022, 2023
International Symposium on Software Testing and Analysis (ISSTA), 2020, 2024
ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2024
International Symposium on Software Testing and Analysis (ISSTA), 2024
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024, 2025
ACM Conference on Computer and Communications Security (CCS), 2023
Static Analysis Symposium (SAS), 2025
Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), 2025
The 42nd International Conference on Software Engineering (ICSE’20), Track Scheduling Co-Chair
Faculty Search Representative in Purdue Computer Science Graduate Student Association (2023–2024)