Scalable and Resilient Advanced Privacy-Preserving Federated Learning
The mission of SR-APPFL is to provide a scalable, resilient federated learning modeling and simulation environment that empowers scientific machine learning researchers to design, test, and evaluate complex distributed workflows before real-world deployment.
By delivering a robust platform tailored to federated learning applications, SR-APPFL ensures that large-scale federated learning studies can be prototyped and validated efficiently on critical research infrastructures.
A discrete event-driven simulation framework that accurately models federated learning workflows, enabling researchers to evaluate performance and scalability before real-world deployment.
A virtual distributed infrastructure that provisions and manages federated learning environments on demand, reducing setup overhead and accelerating experimentation across diverse hardware.
An SZ-based data compression module for federated learning that minimizes communication overhead while preserving model accuracy, making large-scale collaborations more bandwidth-efficient.
A suite of fault tolerance techniques designed to detect and recover from failures during federated learning, ensuring robustness and reliability in heterogeneous HPC and Edge Computing environments.
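The discrete event-driven approach can be illustrated with a minimal sketch. This is not the SR-APPFL API; the function and its parameters are hypothetical, and the model assumes one synchronous round where the server waits for every client before aggregating:

```python
import heapq

def simulate_fl_round(client_train_times, agg_time=0.5):
    """Toy discrete-event simulation of one synchronous FL round.

    Events are (timestamp, name) pairs processed in time order; the
    round finishes when the slowest client's update arrives and the
    server completes aggregation.
    """
    events = []  # min-heap ordered by simulated arrival time
    for cid, t in enumerate(client_train_times):
        heapq.heappush(events, (t, f"client-{cid}-update"))

    clock = 0.0
    while events:
        timestamp, _name = heapq.heappop(events)
        clock = timestamp  # advance the simulated clock to this event

    # The server aggregates only after the final straggler reports in.
    return clock + agg_time

# Round time is dominated by the straggler (3.0) plus aggregation time.
round_time = simulate_fl_round([1.2, 0.8, 3.0])  # 3.5 simulated seconds
```

Even this toy model captures why synchronous federated rounds are straggler-bound, which is the kind of behavior a simulation environment lets researchers study before deployment.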
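The core idea behind SZ-style error-bounded lossy compression can be sketched in a few lines: predict each value from its already-reconstructed predecessor, then quantize the residual so the decompressed value is guaranteed to stay within a user-set absolute error bound. This is an illustrative simplification, not the SZ implementation:

```python
def compress(data, eb):
    """Prediction + error-bounded quantization (SZ-style sketch).

    Each value is predicted by the previously *reconstructed* value,
    and the residual is mapped to an integer bin of width 2*eb, so the
    reconstruction error never exceeds +/- eb.
    """
    codes, prev = [], 0.0
    for x in data:
        residual = x - prev
        q = round(residual / (2 * eb))   # nearest quantization bin
        codes.append(q)                  # integer codes compress well
        prev = prev + q * 2 * eb         # mirror the decompressor state

    return codes

def decompress(codes, eb):
    """Invert the quantization, replaying the same predictor."""
    out, prev = [], 0.0
    for q in codes:
        prev = prev + q * 2 * eb
        out.append(prev)
    return out
```

Because the compressor predicts from the reconstructed (not original) previous value, compressor and decompressor stay in lockstep and the per-element error bound holds, which is what allows gradient or model updates to be compressed aggressively without destroying accuracy.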
[SC'25] HPC-R1: Characterizing R1-like Large Reasoning Models on HPC
Adam Weingram, Duo Zhang, Zhonghao Chen, Hao Qi, and Xiaoyi Lu
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2025.
[Paper]
[SC'25] DPAR: High-Performance, Secure, and Scalable Differential Privacy-based AllReduce
Hao Qi, Weicong Chen, Chenghong Wang, and Xiaoyi Lu
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2025.
[Paper]
[SC'25] GPU Lossy Compression for HPC Can Be Versatile and Ultra-Fast
Yafan Huang, Sheng Di, Guanpeng Li, and Franck Cappello
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2025.
[Paper]
[SC'25] lsCOMP: Efficient Light Source Compression
Yafan Huang, Sheng Di, Robert Underwood, Peco Myint, Miaoqi Chu, Guanpeng Li, Nicholas Schwarz, and Franck Cappello
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2025.
[Paper]
[HPDC'25] DPU-KV: On the Benefits of DPU Offloading for In-Memory Key-Value Stores at the Edge
Arjun Kashyap, Yuke Li, and Xiaoyi Lu
Proceedings of the International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), 2025.
[Paper]
[ICS'25] Understanding the Idiosyncrasies of Emerging BlueField DPUs
Arjun Kashyap, Yuke Li, Darren Ng, and Xiaoyi Lu
Proceedings of the 39th International Conference on Supercomputing (ICS), 2025.
[Paper]
[SC'24] hZCCL: Accelerating Collective Communication with Co-Designed Homomorphic Compression
Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Jinyang Liu, Zizhe Jian, Xin Liang, Kai Zhao, Xiaoyi Lu, Zizhong Chen, Franck Cappello, Yanfei Guo, and Rajeev Thakur
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2024.
[Paper]
[SC'24] Versatile Datapath Soft Error Detection on the Cheap for HPC Applications
Yafan Huang, Sheng Di, Zhaorui Zhang, Xiaoyi Lu, and Guanpeng Li
Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, 2024.
[Paper]
[IPDPS'24] NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support
Darren Ng, Andrew Lin, Arjun Kashyap, Guanpeng Li, and Xiaoyi Lu
Proceedings of the 38th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2024.
[Paper]
[IPDPS'24] Accelerating Lossy and Lossless Compression on Emerging BlueField DPU Architectures
Yuke Li, Arjun Kashyap, Weicong Chen, Yanfei Guo, and Xiaoyi Lu
Proceedings of the 38th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2024 (Best Paper Award Nomination).
[Paper]
This work is primarily supported by DOE Research Grant DE-SC0024207.
We sincerely thank the University of California, Merced; Argonne National Laboratory; and the University of Iowa for their support.
Contact xiaoyi (dot) lu (at) ucmerced.edu, or click the button below, to learn more about the project and how you can get involved.