Current (2022 ~ present)
System-Efficient Federated Learning (FL) on Heterogeneous Systems
Knowledge distillation for exploiting heterogeneous edge devices in FL
Partial model aggregation for communication-efficient FL, sketched after this list
System-aware partial model training for FL on heterogeneous systems
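A minimal sketch of the partial-aggregation idea: each client uploads only a subset of layers per round, and the server averages whatever arrives for each layer. This illustrates the principle only, not the algorithm from any specific paper; all names (partial_aggregate, client_updates) and shapes are made up for the example.

    import numpy as np

    def partial_aggregate(global_model, client_updates):
        # global_model:   dict layer_name -> np.ndarray (current server weights)
        # client_updates: list of dicts, each covering only the SUBSET of
        #                 layers that client trained and uploaded this round
        for name in global_model:
            received = [u[name] for u in client_updates if name in u]
            if received:  # layers nobody sent keep their previous value
                global_model[name] = np.mean(received, axis=0)
        return global_model

    # toy round: client 0 uploads both layers, client 1 only the head
    server = {"body": np.zeros(4), "head": np.zeros(2)}
    updates = [{"body": np.ones(4), "head": np.ones(2)},
               {"head": 3 * np.ones(2)}]
    server = partial_aggregate(server, updates)
    print(server["body"], server["head"])  # [1. 1. 1. 1.] [2. 2.]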
Distributed Optimization Algorithms for Large-Scale Deep Learning
Computation-efficient Sharpness-Aware Minimization (SAM), sketched after this list
Gradient recycling for communication-efficient deep learning
CPU-aided asynchronous optimization for large-scale deep learning
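For reference, vanilla SAM (Foret et al., 2021) takes two gradient evaluations per step: one to climb to the sharpest nearby point w + eps, and one to descend from there. The computation-efficient variants aim to cut that doubled cost. Below is a minimal numpy sketch of the vanilla step, with a toy quadratic loss standing in for a real model:

    import numpy as np

    def sam_step(w, grad_fn, lr=0.1, rho=0.05):
        # Plain SAM: two gradient evaluations per update. Computation-
        # efficient variants amortize or skip the first evaluation.
        g = grad_fn(w)
        eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascend to the sharpest nearby point
        return w - lr * grad_fn(w + eps)             # descend with the perturbed gradient

    # toy loss L(w) = 0.5 * ||w||^2, whose gradient is w itself
    w = np.array([1.0, -2.0])
    for _ in range(100):
        w = sam_step(w, lambda v: v)
    print(w)  # settles near the minimum at the origin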
Applied Machine Learning / Deep Learning for Electronic Materials Design and Analysis
Deep learning-based Power Spectral Density (PSD) analysis for estimating the oxygen vacancy distribution in heterostructures
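A hedged sketch of the front end of such an analysis, assuming the model consumes a Welch PSD estimate of a measured noise trace as a fixed-length feature vector; the signal here is synthetic, and the sampling rate and segment length are placeholders rather than values from the actual study.

    import numpy as np
    from scipy.signal import welch

    rng = np.random.default_rng(0)
    fs = 10_000.0                                    # sampling rate in Hz (placeholder)
    trace = np.cumsum(rng.standard_normal(100_000))  # synthetic brown-ish noise trace

    # Welch estimate: frequency-binned power becomes the fixed-length
    # feature vector that a neural network can regress on.
    freqs, psd = welch(trace, fs=fs, nperseg=1024)
    features = np.log10(psd)                         # log-PSD as a common normalization
    print(features.shape)                            # (513,)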
Past (2015 ~ 2022)
Large-scale communication-efficient distributed learning (@ USC)
Partial model training: theoretical analysis of its impact
Layer-wise adaptive model aggregation strategy for scalable Federated Learning
Hessian-aware learning rate adjustment method for large-batch neural network training
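The last item rests on a simple principle: estimate the local curvature and keep the product of the learning rate and the top Hessian eigenvalue roughly constant. A minimal numpy sketch of that principle, using power iteration over finite-difference Hessian-vector products; it illustrates the idea only and is not the method from the work itself.

    import numpy as np

    def top_hessian_eigenvalue(w, grad_fn, iters=20, delta=1e-4):
        # Power iteration with Hv ~= (grad(w + delta*v) - grad(w)) / delta
        v = np.random.default_rng(0).standard_normal(w.shape)
        v /= np.linalg.norm(v)
        g0 = grad_fn(w)
        lam = 0.0
        for _ in range(iters):
            hv = (grad_fn(w + delta * v) - g0) / delta
            lam = float(v @ hv)                 # Rayleigh quotient estimate
            v = hv / (np.linalg.norm(hv) + 1e-12)
        return lam

    # toy loss L(w) = 0.5 * (w1^2 + 10 * w2^2), so lambda_max = 10
    grad = lambda w: np.array([1.0, 10.0]) * w
    lam = top_hessian_eigenvalue(np.array([1.0, 1.0]), grad)
    lr = 1.0 / lam                              # keep lr * lambda_max ~ 1
    print(round(lam, 2), round(lr, 3))          # ~10.0, ~0.1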
Scalable parallelization strategy for deep learning on HPC platforms (@ Northwestern)
Communication-efficient neural network training by overlapping computation and communication
Gradient averaging algorithms with lower communication complexity
Adaptive model update frequency control for scalable deep learning
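The last two items build on periodic model averaging (local SGD): each worker takes tau local steps between global averagings, and an adaptive scheme tunes tau online. Below is a minimal mpi4py sketch of the fixed-tau skeleton; the gradient is a stand-in and the adaptation rule is only indicated in a comment.

    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    nproc = comm.Get_size()
    rng = np.random.default_rng(comm.Get_rank())

    w = np.zeros(10)   # model replica on this worker
    tau = 8            # local steps between model averagings

    for step in range(1, 101):
        grad = w - rng.standard_normal(10)  # stand-in for a minibatch gradient
        w -= 0.01 * grad                    # local SGD step, no communication
        if step % tau == 0:                 # communicate only every tau steps
            comm.Allreduce(MPI.IN_PLACE, w, op=MPI.SUM)
            w /= nproc                      # global model average
            # an adaptive scheme would grow or shrink tau here based on
            # how far the replicas drifted before this averaging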
I/O strategies for large-scale deep learning on HPC platforms (@ Northwestern)
Asynchronous I/O strategy for data feeding in deep learning, sketched after this list
Parallel HDF5 data analysis
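The asynchronous data-feeding idea above reduces to a bounded producer-consumer queue: a background thread reads batches ahead while the training loop consumes them, so compute never stalls on I/O. A stdlib-only sketch with illustrative names (prefetcher, load_batch):

    import queue
    import threading
    import time

    def prefetcher(load_batch, num_batches, depth=4):
        # load_batch(i) is the (slow) read of batch i; 'depth' bounds
        # how many batches may be read ahead of the consumer.
        q = queue.Queue(maxsize=depth)

        def worker():
            for i in range(num_batches):
                q.put(load_batch(i))  # blocks when the queue is full
            q.put(None)               # sentinel: no more data

        threading.Thread(target=worker, daemon=True).start()
        while (batch := q.get()) is not None:
            yield batch

    def slow_read(i):
        time.sleep(0.01)  # pretend disk latency
        return i

    for batch in prefetcher(slow_read, 5):
        pass  # the training step would run here, overlapped with reads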