Xiangzhe Xu

Xiangzhe (Alex) Xu
Computer Science
Purdue University

xzx@purdue.edu

I'm Xiangzhe (Alex) Xu. I'm a Ph.D. student at Purdue University advised by Prof. Xiangyu Zhang. Before that, I obtained my B.Eng. from Nanjing University.

I believe the future of code intelligence lies in the synergy between cutting-edge machine learning innovations and deep domain-specific insights.

My research is dedicated to architecting the next generation of trustworthy code models by infusing domain knowledge into every stage—from data synthesis [ProSec] and data quality enhancement [DiEmph] to revolutionary architecture design [CodeArt] and pre-training techniques [Nova, ReSym].
Beyond these foundational models, I am passionate about creating intelligent, agent-driven code reasoning systems. By leveraging advanced strategies in task decomposition [LLMDFA], hallucination mitigation [LLMSAN], and repo-level analysis [RepoAudit], along with domain-specific alignment [GenNm] and multi-modal reasoning that bridges executable code, source code, and natural language [ProRec], my work aspires to empower machines to understand and analyze code with better accuracy and contextual awareness.
Moreover, I extend these innovations to traditional program analysis, where I develop robust formal verification [CompCertELF, CSLED], innovative semantic formulation [PEM, StateLifter, Arcturus], advanced debugging methodologies [CPC, ParDiff], and precise root cause analysis [ROCAS].

Through this blend of theoretical rigor and practical application, I am excited to push the boundaries of software engineering and code language modeling, paving the way for smarter, more reliable development & quality assurance tools in tomorrow’s software/AI-ware landscape.

News

[07/25] 🎖 Our team PurCL won 1st Place in the Amazon Nova AI Challenge

Here's our code: https://github.com/PurCL/ASTRA

[06/25 - 09/25] I am interning at Microsoft Research (Redmond) this summer

Publications

Trustworthy Code Model

ProSec: Fortifying Code LLMs with Proactive Security Alignment

Xiangzhe Xu*, Zian Su*, Jinyao Guo, Kaiyuan Zhang, Zhenting Wang, Xiangyu Zhang. ICML'2025. PDF

DiEmph: Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction Deemphasis

Xiangzhe Xu, Shiwei Feng, Yapeng Ye, Guangyu Shen, Zian Su, Siyuan Cheng, Guanhong Tao, Qingkai Shi, Zhuo Zhang, and Xiangyu Zhang. ISSTA’2023. PDF

CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking

Zian Su, Xiangzhe Xu, Ziyang Huang, Zhuo Zhang, Yapeng Ye, Jianjun Huang, Xiangyu Zhang. FSE’2024. PDF

ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries

Danning Xie, Zhuo Zhang, Nan Jiang, Xiangzhe Xu, Lin Tan, and Xiangyu Zhang. CCS'2024. 🎖 ACM SIGSAC Distinguished Paper Award PDF

Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning

Nan Jiang, Chengxiao Wang, Kevin Liu, Xiangzhe Xu, Lin Tan, Xiangyu Zhang, Petr Babkin. ICLR'2025. PDF

Agentic Code Reasoning System

GenNm: Symbol Preference Aware Generative Models for Recovering Variable Names from Stripped Binary

Xiangzhe Xu, Zhuo Zhang, Zian Su, Ziyang Huang, Shiwei Feng, Yapeng Ye, Nan Jiang, Danning Xie, Siyuan Cheng, Lin Tan, Xiangyu Zhang. NDSS'2025. PDF

RepoAudit: An Autonomous LLM-Agent for Repository-Level Code Auditing

Jinyao Guo, Chengpeng Wang, Xiangzhe Xu, Zian Su, Xiangyu Zhang. arXiv. PDF

LLMDFA: Analyzing Dataflow in Code with Large Language Models

Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiaoheng Xie, Xiangyu Zhang. NeurIPS’2024. PDF

LLMSAN: Sanitizing Large Language Models in Bug Detection with Data-Flow

Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiangyu Zhang. EMNLP’2024. PDF

ProRec: Source Code Foundation Models are Transferable Binary Analysis Knowledge Bases

Zian Su, Xiangzhe Xu, Ziyang Huang, Kaiyuan Zhang, Xiangyu Zhang. NeurIPS'2024. PDF

Program Analysis

ROCAS: Root Cause Analysis of Autonomous Driving Accidents via Cyber-Physical Co-mutation

Shiwei Feng, Yapeng Ye, Qingkai Shi, Zhiyuan Cheng, Xiangzhe Xu, Siyuan Cheng, Hongjun Choi, Xiangyu Zhang. ASE'2024. 🎖 ACM SIGSOFT Distinguished Paper Award PDF

ParDiff: Practical Static Differential Analysis of Network Protocol Parsers

Mingwei Zheng, Qingkai Shi, Xuwei Liu, Xiangzhe Xu, Le Yu, Congyu Liu, Guannan Wei, Xiangyu Zhang. OOPSLA'2024. 🎖 ACM SIGPLAN Distinguished Paper Award PDF

PEM: Representing Binary Program Semantics for Similarity Analysis via A Probabilistic Execution Model

Xiangzhe Xu*, Zhou Xuan*, Shiwei Feng, Siyuan Cheng, Yapeng Ye, Qingkai Shi, Guanhong Tao, Le Yu, Zhuo Zhang, Xiangyu Zhang. FSE’2023. PDF

StateLifter: Extracting Protocol Format as State Machine via Controlled Static Loop Analysis

Qingkai Shi, Xiangzhe Xu, Xiangyu Zhang. USENIX Security’2023. PDF

ARCTURUS: Full Coverage Binary Similarity Analysis with Reachability-guided Emulation

Anshunkang Zhou, Yikun Hu, Xiangzhe Xu, Charles Zhang. TOSEM'2023. PDF

CSLED: Automatic Generation and Validation of Instruction Encoders and Decoders

Xiangzhe Xu, Jinhua Wu, Yuting Wang*, Zhenguo Yin and Pengfei Li. CAV’2021. PDF

CompCertELF: Verified Separate Compilation of C Programs into ELF Object Files

Yuting Wang, Xiangzhe Xu, Pierre Wilke, Zhong Shao. OOPSLA’2020. PDF

CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis

Juan Zhai, Xiangzhe Xu, Yu Shi, Guanhong Tao, Minxue Pan, Shiqing Ma, Lei Xu, Weifeng Zhang, Lin Tan, Xiangyu Zhang. ICSE'2020. PDF

Experience

Jun. 2025 – Sep. 2025, Research intern. Advisors: Qianhui Wu, Hamidreza Saghir, Marc-Alexandre Côté, Tong Wang, Kiran Lakkaraju, Michael Albada. Microsoft Research
Apr. 2021 – Aug. 2021, Research assistant on binary program analysis. Advisor: Charles Zhang. HKUST
Sep. 2020 – Feb. 2021, Research assistant on program verification. Advisor: Yuting Wang. SJTU
May 2020 – Aug. 2020, Intern on automatic differentiation. Advisor: Hao Chen. ByteDance AI Lab
Dec. 2019 – Mar. 2020, Research assistant on program analysis. Advisor: Xiangyu Zhang. Purdue University
Jul. 2019 – Oct. 2019, Research intern on program verification. Advisor: Zhong Shao. Yale University
Jul. 2018 – Jun. 2019, Research intern on program analysis. Advisors: Minxue Pan, Juan Zhai. Nanjing University

Awards

1st Place in Amazon Nova AI Challenge ($250,000), 2025
Amazon Trusted AI Challenge Research Grant ($250,000), 2024
1st Place in AutoDriving CTF at DEFCON30 (among 110 global teams), 2022

Invited Talks & Lectures

Building trustworthy AI coding systems through agentic red-teaming, May 2025, TrustNLP
Scaling security expertise with AI-driven systems, Apr 2025, RIT; Mar 2025, Microsoft
Harnessing domain expertise to elevate post-training data quality, Mar 2025, Meta
Understanding programs when symbols are lacking, Nov 2024, UMass Amherst
An agentic red-teaming framework, Nov 2024, Amazon
Inference time scaling for code reasoning task, Oct 2024, Purdue University
Incorporating program analysis insights to code models, Apr 2024, UMass Amherst
Introduction to code language models, Nov 2023, Purdue University

Services

Reviewer

ARR, Feb 2025 and May 2025
NeurIPS 2025
EXPlainable and REliable Software Systems (EXPRESS) 2025
ACM Transactions on Software Engineering and Methodology (TOSEM)
IEEE Internet of Things Journal (IoTJ)
The Computer Journal (COMPJ)

Sub-Reviewer

International Symposium on the Foundations of Software Engineering (FSE), 2020
International Conference on Automated Software Engineering (ASE), 2023, 2024
International Conference on AI Engineering – Software Engineering for AI (ICSE-CAIN), 2022, 2023, 2024
ACM Conference on Computer and Communications Security (CCS), 2022, 2023, 2024
International Conference on Software Engineering (ICSE), 2022, 2023
International Symposium on Software Testing and Analysis (ISSTA), 2020, 2024

Artifact Evaluation Committee

ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2024
International Symposium on Software Testing and Analysis (ISSTA), 2024
IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2024, 2025
ACM Conference on Computer and Communications Security (CCS), 2023
Static Analysis Symposium (SAS), 2025
Object-oriented Programming, Systems, Languages, and Applications (OOPSLA), 2025

Other Services

The 42nd International Conference on Software Engineering (ICSE’20), Track Scheduling Co-Chair
Faculty Search Representative in Purdue Computer Science Graduate Student Association (2023–2024)
