Research Interests
Our ongoing research comprises the currently funded projects and several new research themes.
The research of our group lies at the intersection of computational methods and machine learning, with a particular focus on graph- and network-structured data. Our recent interests lie in building Trustworthy AI for graphs and Large Language Models (LLMs), applying these techniques to high-impact challenges in AI for Science such as neural control, and advancing the theoretical foundations of Graph and Geometric Machine Learning. Our work aims to bridge fundamental theory with impactful solutions across diverse domains.
The ongoing projects fall under the following three research themes:
Theme I: Trustworthy Machine Learning (LLMs Interacting with Structured Data beyond Text)
This theme focuses on enhancing the trustworthiness of machine learning systems that integrate structured data and LLMs, with special emphasis on knowledge representation, reasoning and data privacy.
Structure-guided generation in Large Language Models: Investigating how LLMs can incorporate structured knowledge sources to guide their generation. By integrating knowledge graphs or relational databases, this research explores how LLMs can perform more accurate and contextually grounded reasoning and decision-making.
Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models, submitted 2025 (code)
Generalization Principles for Inference over Text-Attributed Graphs with Large Language Models, ICML 2025 (code)
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding, ICML 2025 (code)
Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation, ICLR 2025 (code)
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing, ICLR 2025 (code)
This is an ongoing project in 2025; more work is forthcoming.
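The retrieve-then-prompt idea behind structure-guided generation can be sketched in a few lines. The toy triples, the `retrieve_facts` helper, and the prompt template below are hypothetical illustrations of the general recipe, not the pipeline from any of the papers above.

```python
# Sketch: grounding an LLM prompt in a small knowledge graph.
# All triples, helper names, and the template are illustrative only.

KG = {
    "aspirin": [("treats", "headache"), ("interacts_with", "warfarin")],
    "warfarin": [("is_a", "anticoagulant")],
}

def retrieve_facts(question, kg, hops=1):
    """Collect triples reachable within `hops` from entities named in the question."""
    frontier = [e for e in kg if e in question.lower()]
    facts, seen = [], set()
    for _ in range(hops):
        next_frontier = []
        for head in frontier:
            for rel, tail in kg.get(head, []):
                triple = (head, rel, tail)
                if triple not in seen:
                    seen.add(triple)
                    facts.append(f"{head} {rel} {tail}")
                    next_frontier.append(tail)
        frontier = next_frontier
    return facts

def build_grounded_prompt(question, kg):
    """Prepend retrieved facts so generation is contextually grounded."""
    context = "\n".join(f"- {f}" for f in retrieve_facts(question, kg))
    return f"Known facts:\n{context}\n\nQuestion: {question}"

prompt = build_grounded_prompt("Can I take aspirin with warfarin?", KG)
```

In a real system the retrieval step would query a full knowledge graph or relational store, and the grounded prompt would be passed to the LLM.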
Privacy-preserving machine learning: Utilizing advanced techniques such as differential privacy and certified unlearning to ensure the security and confidentiality of sensitive structured data used by graph models or large language models. This work aims to protect sensitive data in domains like social networks, healthcare, and financial systems, where privacy is critical.
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness, submitted 2025 (code)
Privately Learning from Graphs with Applications in Fine-tuning Large Language Models, COLM 2025 (code)
Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning, ICML 2025 (code)
Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness, ICLR 2025
Differentially Private Graph Diffusion with Applications in Personalized PageRanks, NeurIPS 2024 (code)
Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning, NeurIPS 2024 (spotlight)
On the Inherent Privacy Properties of Discrete Denoising Diffusion Models, TMLR (selected for presentation at ICLR 2025); WSDAIF 2024 oral
Differentially Private Decoupled Graph Convolutions for Multigranular Topology Protection, NeurIPS 2023 (code)
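As a concrete sketch of the differential-privacy machinery these projects build on, the snippet below implements the standard Gaussian mechanism with per-record clipping. The sensitivity bound, the (epsilon, delta) budget, and the helper names are illustrative choices, not the methods from the papers above.

```python
import math
import random

# Sketch: the Gaussian mechanism with per-record clipping, a basic
# building block of DP training and unlearning analyses.

def gaussian_mechanism(value, sensitivity, epsilon, delta):
    """Release `value` with (epsilon, delta)-DP by adding calibrated Gaussian noise."""
    sigma = sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon
    return value + random.gauss(0.0, sigma)

def private_mean(values, clip, epsilon, delta):
    """Clip each record to bound its sensitivity, then release a noisy mean."""
    clipped = [max(-clip, min(clip, v)) for v in values]
    noisy_sum = gaussian_mechanism(sum(clipped), sensitivity=clip,
                                   epsilon=epsilon, delta=delta)
    return noisy_sum / len(values)
```

Clipping bounds how much any single record can shift the statistic; the noise scale then calibrates that bound to the privacy budget.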
Theme II: AI for Science
This theme addresses the application of advanced AI methods to solve critical challenges in scientific research, particularly in high-energy physics and astrophysics.
AI for particle physics: Applying machine learning techniques to improve particle tracking and denoising in large experiments such as the Large Hadron Collider. Our work focuses on enhancing the accuracy and efficiency of reconstructing particle trajectories from noisy data.
Towards Understanding Sensitive and Decisive Patterns in Explainable AI: A Case Study of Model Interpretation in Geometric Deep Learning, Nature Machine Intelligence 2025
GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shift, NeurIPS 2024 (Benchmark Track) (code)
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics, ICML 2024 (code, oral)
Interpretable Geometric Deep Learning via Learnable Randomness Injection, ICLR 2023 (code)
Semi-supervised Graph Neural Network for Particle-level Noise Removal, The European Physical Journal C 2023
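A common first step in GNN-based tracking is turning detector hits into a graph whose edges are candidate track segments. The sketch below connects hits on adjacent layers when their polar angles are compatible; the layer indexing and the angular cut are illustrative placeholders, not the published selection used in our work.

```python
import math

# Sketch: graph construction for GNN-based particle tracking.
# hits: list of (layer, x, y) detector hits; thresholds are illustrative.

def build_track_graph(hits, max_angle=0.1):
    """Connect hits on consecutive layers whose polar angles differ by
    less than `max_angle` -- a crude stand-in for the learned edge
    filter in a tracking GNN."""
    by_layer = {}
    for i, (layer, _, _) in enumerate(hits):
        by_layer.setdefault(layer, []).append(i)
    edges = []
    for layer in sorted(by_layer):
        for i in by_layer[layer]:
            for j in by_layer.get(layer + 1, []):
                _, x1, y1 = hits[i]
                _, x2, y2 = hits[j]
                dphi = abs(math.atan2(y2, x2) - math.atan2(y1, x1))
                if dphi < max_angle:
                    edges.append((i, j))
    return edges
```

The resulting edge list feeds a GNN that classifies each candidate segment as signal or noise, from which trajectories are assembled.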
Neutrino reconstruction at IceCube: Developing algorithms to optimize the reconstruction of neutrino interactions in the IceCube Neutrino Observatory. By improving the precision of event detection and parameter estimation, this research contributes to a better understanding of neutrino properties and their role in the universe.
Data-driven Multi-task GeV Event Reconstruction in DeepCore IceCube, submitted 2025
Neural Control of Large-scale Scientific Experiments: Developing algorithms to automate and optimize complex, mission-critical systems that operate in dynamic, high-speed environments within large-scale scientific experiments, such as accelerator and quantum systems.
This is a new project starting in 2025; representative papers are forthcoming.
Theme III: Foundations of Graph and Geometric Machine Learning
This theme explores the core theoretical and computational challenges in Graph and Geometric Machine Learning.
The expressive power of graph neural networks: Investigating the limits of what GNNs can represent and how their architectures can be optimized for better expressiveness across different tasks such as molecular dynamics simulation and material property prediction.
Polynomial Width is Sufficient for Set Representation with High-dimensional Features, ICLR 2024
Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning, NeurIPS 2021
"Nested Graph Neural Networks," NeurIPS 2021. (code)
Distance Encoding -- Design Provably More Powerful GNNs for Structural Representation Learning, NeurIPS 2020. (codes)(slides)
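The standard yardstick for GNN expressiveness is the 1-dimensional Weisfeiler-Leman (1-WL) color-refinement test, which upper-bounds what message-passing GNNs can distinguish. A minimal sketch, using the classic example of two non-isomorphic 2-regular graphs (two triangles vs. a 6-cycle) that 1-WL, and hence any standard message-passing GNN, cannot tell apart:

```python
from collections import Counter

# Sketch: 1-WL color refinement. Nodes start with a uniform color and
# repeatedly refine it using the multiset of neighbor colors.

def wl_colors(adj, rounds=3):
    """adj: {node: [neighbors]}. Returns the final color histogram."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        signatures = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                      for v in adj}
        # Relabel distinct signatures with small integers (the hashing step).
        relabel = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: relabel[signatures[v]] for v in adj}
    return Counter(colors.values())

# Two non-isomorphic 2-regular graphs with identical 1-WL histograms.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
six_cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

Techniques such as labeling tricks and distance encoding, listed above, provably break this barrier by giving nodes non-uniform initial features.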
The stability and generalization of Graph and Geometric Machine Learning models: Examining the ability of these models to generalize beyond their training data, particularly when the underlying graphs vary in size and structure or the geometry undergoes distribution shift.
What Are Good Positional Encodings for Directed Graphs? ICLR 2025 (code)
On the Stability of Expressive Positional Encodings for Graphs, ICLR 2024 (code)
Pairwise Alignment Improves Graph Domain Adaptation, ICML 2024 (code, spotlight)
Structural Re-weighting Improves Graph Domain Adaptation, ICML 2023 (code)
Equivariant and Stable Positional Encoding for More Powerful Graph Neural Networks, ICLR 2022 (code)
Graph Information Bottleneck, NeurIPS 2020 (code) (slides)
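A concrete instance of the stability question is Laplacian eigenvector positional encodings: eigenvectors are only defined up to sign (and up to basis for repeated eigenvalues), so a stable encoding must be invariant to these symmetries. The sketch below illustrates the ambiguity and one crude sign-invariant readout; the graph and the readout `f(v) + f(-v)` are illustrative, not the construction from the papers above.

```python
import numpy as np

# Sketch: Laplacian eigenvector PEs and their sign ambiguity.

def laplacian_pe(adj, k=2):
    """Return the k non-trivial Laplacian eigenvectors as positional encodings."""
    deg = np.diag(adj.sum(axis=1))
    lap = deg - adj
    _, vecs = np.linalg.eigh(lap)
    return vecs[:, 1:k + 1]  # skip the constant eigenvector

def sign_invariant_pe(pe, f=np.abs):
    """A crude sign-invariant readout: f(v) + f(-v) collapses the ambiguity."""
    return f(pe) + f(-pe)

adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)  # path graph on 3 nodes
pe = laplacian_pe(adj, k=1)
```

Since the solver may return `pe` or `-pe` depending on numerical details, any model consuming raw eigenvectors is unstable; sign- and basis-invariant architectures remove that failure mode.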
Efficient computation of Graph and Geometric Machine Learning models: Developing novel algorithms and optimization techniques to improve the computational efficiency of machine learning models that process graph and geometric data. This includes reducing the time and space complexity of training large-scale models, enabling their application to high-dimensional graph data in real-world scenarios.
Locality-Sensitive Hashing-Based Efficient Point Transformer with Applications in High-Energy Physics, ICML 2024 (code, oral)
Equivariant Hypergraph Diffusion Neural Operators, ICLR 2023 (code)
Neighborhood-aware Scalable Temporal Network Representation Learning, LoG 2022 (best paper award!) (code) (talks)
Algorithm and System Co-design for Efficient Subgraph-based Graph Representation Learning, VLDB 2022 (code)
Submodular Hypergraph: p-Laplacian, Cheeger Inequalities and Spectral Clustering, ICML 2018