Yang, H., Yin, H., Shen, M., Molchanov, P., Li, H., & Kautz, J. (2023). Global Vision Transformer Pruning with Hessian-Aware Saliency. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 18547-18557). (Paper)
Yang, H.*, Liu, Y.*, Dong, Z., Keutzer, K., Du, L., & Zhang, S. (2023). NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 20321-20330). (Paper)
Yang, H.*, Xiao, L.*, Dong, Z., Keutzer, K., Du, L., & Zhang, S. (2023). CSQ: Growing mixed-precision quantization scheme with bi-level continuous sparsification. In 2023 60th ACM/IEEE Design Automation Conference (DAC) (pp. 1-6). IEEE. (Paper)
Yang, H.*, Yang, X.*, Gong, N. Z., & Chen, Y. (2022). HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance. In Proceedings of the 59th Annual Design Automation Conference. (Paper, Code)
Yang, H., Duan, L., & Li, H. (2021). BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization. In International Conference on Learning Representations. (Paper, Code)
Yang, H., Zhang, J., Dong, H., … & Li, H. (2020). DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles. In Advances in Neural Information Processing Systems. (Oral) (Paper, Code)
Li, A., Duan, Y., Yang, H., Chen, Y., & Yang, J. (2020). TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations. In Proceedings of the 26th ACM SIGKDD (pp. 824-832). (Best student paper) (Paper, Code)
Yang, H., Wen, W., & Li, H. (2020). DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures. In International Conference on Learning Representations. (Paper, Code)
* Equal contributions.
Most up-to-date publication list available at https://scholar.google.com/citations?user=bjNCUt8AAAAJ
Conference and Workshop Proceedings
Wang, D., Liu, Z., Wang, S., Ren, Y., Deng, J., Hu, J., ... & Yang, H. (2025). FIER: Fine-Grained and Efficient KV Cache Retrieval for Long-context LLM Inference. In EMNLP 2025 Findings.
Gopal, B., Yang, H., Horton, M., & Chen, Y. (2025). SAFER: Sharpness Aware layer-selective Finetuning for Enhanced Robustness in vision transformers. In ICCV 2025.
Han, S., Yoon, S., Kim, J., Wang, D., Jeon, K. E., Yang, H., & Ko, J. H. (2025). MSQ: Memory-Efficient Bit Sparsification Quantization. In ICCV 2025.
Gopal, B., Yang, H., Zhang, J., Horton, M., & Chen, Y. (2025). Boosting Adversarial Robustness with CLAT: Criticality Leveraged Adversarial Training. In Forty-second International Conference on Machine Learning.
Wang, D., & Yang, H. (2025). Taming Sensitive Weights: Noise Perturbation Fine-tuning for Robust LLM Quantization. In Conference on Parsimony and Learning (CPAL).
Liu, Y.*, Yang, H.*, Chen, Y., Zhang, R., Wang, M., Du, Y., & Du, L. (2025). PAT: Pruning-Aware Tuning for Large Language Models. In AAAI Conference on Artificial Intelligence (AAAI-25).
Zhang, R., Cai, Z., Yang, H., Liu, Z., Gudovskiy, D., Okuno, T., ... & Zhang, S. (2024, October). VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness. In Proceedings of the 32nd ACM International Conference on Multimedia (pp. 5451-5459).
Lu, H., Liu, X., Zhou, Y., Li, Q., Keutzer, K., Mahoney, M. W., ... Yang, H., & Yang, Y. (2024). Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance. In The Thirty-eighth Annual Conference on Neural Information Processing Systems.
Yang, H.*, Chen, A.*, Gan, Y., Gudovskiy, D., … & Keutzer, K. (2024). Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting. In International Conference on Machine Learning. PMLR.
Yang, H., Huang, Y., Dong, Z., … & Zhang, S. (2024). Fisher-aware Quantization for DETR Detectors with Critical-category Objectives. In ICML'24 Workshop on Advancing Neural Network Training (WANT).
Huang, Q., Yang, H., Zeng, E., & Chen, Y. (2024). A Deep-Learning-Based Multi-modal ECG and PCG Processing Framework for Label Efficient Heart Sound Segmentation. In IEEE/ACM CHASE.
Zhang, R., Luo, Y., Liu, J., Yang, H., Dong, Z., … & Zhang, S. (2024). Efficient Deweather Mixture-of-Experts with Uncertainty-Aware Feature-wise Linear Modulation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, No. 15, pp. 16812-16820).
Yang, H., Yin, H., Shen, M., Molchanov, P., Li, H., & Kautz, J. (2023). Global Vision Transformer Pruning with Hessian-Aware Saliency. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 18547-18557).
Yang, H.*, Liu, Y.*, Dong, Z., Keutzer, K., Du, L., & Zhang, S. (2023). NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 20321-20330).
Li, X., Liu, Y., Lian, L., Yang, H., Dong, Z., Kang, D., ... & Keutzer, K. (2023). Q-diffusion: Quantizing diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 17535-17545).
Zhang, Y., Dong, Z., Yang, H., Lu, M., Tseng, C. C., Du, Y., ... & Zhang, S. (2023). QD-BEV: Quantization-aware View-guided Distillation for Multi-view 3D Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 3825-3835).
Yang, H.*, Xiao, L.*, Dong, Z., Keutzer, K., Du, L., & Zhang, S. (2023). CSQ: Growing mixed-precision quantization scheme with bi-level continuous sparsification. In 2023 60th ACM/IEEE Design Automation Conference (DAC) (pp. 1-6). IEEE.
Yang, X., Yang, H., Zhang, J., Li, H. H., & Chen, Y. (2022). On Building Efficient and Robust Neural Network Designs. In 2022 56th Asilomar Conference on Signals, Systems, and Computers (pp. 317-321). IEEE.
Yang, H., Yang, X., Gong, N. Z., & Chen, Y. (2022). HERO: Hessian-Enhanced Robust Optimization for Unifying and Improving Generalization and Quantization Performance. In Proceedings of the 59th Annual Design Automation Conference (pp. 25-30).
Yang, H., Duan, L., & Li, H. (2021). BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization. In International Conference on Learning Representations.
Chen, Y., Li, A., Yang, H., Zhang, T., Yang, Y., Li, H., ... & Pajic, M. (2021). AI-Powered IoT System at the Edge. In 2021 IEEE Third International Conference on Cognitive Machine Intelligence (CogMI) (pp. 242-251). IEEE.
Yang, X., Belakaria, S., Joardar, B. K., Yang, H., Doppa, J. R., Pande, P. P., ... & Li, H. H. (2021, November). Multi-objective optimization of ReRAM crossbars for robust DNN inferencing under stochastic noise. In 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (pp. 1-9). IEEE.
Xie, Z., Xu, X., Walker, M., Knebel, J., Palaniswamy, K., Hebert, N., Hu, J., Yang, H., ... & Das, S. (2021, October). APOLLO: An automated power modeling framework for runtime power introspection in high-volume commercial microprocessors. In MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (pp. 1-14).
Zhang, J., Huang, Y., Yang, H., Martinez, M., Hickman, G., Krolik, J., & Li, H. (2021, June). Efficient FPGA implementation of a convolutional neural network for radar signal processing. In 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS) (pp. 1-4). IEEE.
Li, A., Guo, J., Yang, H., Salim, F. D., & Chen, Y. (2021, May). Deepobfuscator: Obfuscating intermediate representations with privacy-preserving adversarial learning on smartphones. In Proceedings of the International Conference on Internet-of-Things Design and Implementation (pp. 28-39).
Inkawhich, N., Liang, K. J., Zhang, J., Yang, H., Li, H., & Chen, Y. (2021). Can Targeted Adversarial Examples Transfer When the Source and Target Models Have No Label Space Overlap?. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (pp. 41-50).
Sun, J., Li, A., Wang, B., Yang, H., Li, H., & Chen, Y. (2021). Soteria: Provable defense against privacy leakage in federated learning from representation perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 9311-9319).
Yang, H., Zhang, J., Dong, H., … & Li, H. (2020). DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles. In Advances in Neural Information Processing Systems, 33, 5505-5515.
Li, A., Duan, Y., Yang, H., Chen, Y., & Yang, J. (2020). TIPRDC: Task-Independent Privacy-Respecting Data Crowdsourcing Framework for Deep Learning with Anonymized Intermediate Representations. In Proceedings of the 26th ACM SIGKDD (pp. 824-832).
Yang, H., Tang, M., Wen, W., Yan, F., ... & Chen, Y. (2020). Learning Low-rank Deep Neural Networks via Singular Vector Orthogonality Regularization and Singular Value Sparsification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 678-679).
Yang, H., Wen, W., & Li, H. (2020). DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures. In International Conference on Learning Representations.
Zhang, J., Yang, H., Chen, F., Wang, Y., & Li, H. (2019). Exploring bit-slice sparsity in deep neural networks for efficient ReRAM-based deployment. In 2019 Fifth Workshop on Energy Efficient Machine Learning and Cognitive Computing-NeurIPS Edition (EMC2-NIPS) (pp. 1-5). IEEE.
Cheng, H. P., Shen, J., Yang, H., Wu, Q., Li, H., & Chen, Y. (2019). Adverquil: an efficient adversarial detection and alleviation technique for black-box neuromorphic computing systems. In Proceedings of the 24th Asia and South Pacific Design Automation Conference (pp. 518-525).
Liu, X., Yang, H., Liu, Z., Song, L., Li, H., & Chen, Y. (2019). Dpatch: An adversarial patch attack on object detectors. In SafeAI 2019.
Nixon, K. W., Mao, J., Shen, J., Yang, H., Li, H. H., & Chen, Y. (2018). Spn dash-fast detection of adversarial attacks on mobile via sensor pattern noise fingerprinting. In 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (pp. 1-6). IEEE.
Song, C., Cheng, H. P., Yang, H., Li, S., Wu, C., Wu, Q., ... & Li, H. (2018). MAT: A multi-strength adversarial training method to mitigate adversarial attacks. In 2018 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (pp. 476-481). IEEE.
Qiao, X., Cao, X., Yang, H., Song, L., & Li, H. (2018). AtomLayer: A universal ReRAM-based CNN accelerator with atomic layer computation. In Proceedings of the 55th Annual Design Automation Conference (pp. 1-6).
Yuan, Z., Yue, J., Yang, H., Wang, Z., Li, J., Yang, Y., ... & Liu, Y. (2018). Sticker: A 0.41-62.1 TOPS/W 8Bit neural network processor with multi-sparsity compatible convolution arrays and online tuning acceleration for fully connected layers. In 2018 IEEE Symposium on VLSI Circuits (pp. 33-34). IEEE.
Journal Publications
Wu, X., Hanson, E., Wang, N., Zheng, Q., Yang, X., Yang, H., … & Li, H. (2024). Block-Wise Mixed-Precision Quantization: Enabling High Efficiency for Practical ReRAM-based DNN Accelerators. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD).
Yang, X., Yang, H., Doppa, J. R., Pande, P. P., Chakrabarty, K., & Li, H. (2022). Essence: Exploiting structured stochastic gradient pruning for endurance-aware ReRAM-based in-memory training systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD).
Mao, J., Yang, H., Li, A., Li, H., & Chen, Y. (2021). Tprune: Efficient transformer pruning for mobile devices. ACM Transactions on Cyber-Physical Systems, 5(3), 1-22.
Song, C., Cheng, H. P., Yang, H., Li, S., Wu, C., Wu, Q., & Li, H. (2020). Adversarial attack: A new threat to smart devices and how to defend it. IEEE Consumer Electronics Magazine, 9(4), 49-55.
In Submission and Preprints
Fang, H., Liu, Y., Du, Y., Du, L., & Yang, H. (2025). SQAP-VLA: A Synergistic Quantization-Aware Pruning Framework for High-Performance Vision-Language-Action Models. arXiv preprint arXiv:2509.09090.
He, L., Zhen, S., Sun, K., Liu, Y., Zhao, Y., Tan, C., Yang, H., Du, Y., & Du, L. (2025). BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models. arXiv preprint arXiv:2506.15689.
Ren, Y., Collins, M. D., Hu, M., & Yang, H. (2025). Is Attention Required for Transformer Inference? Explore Function-preserving Attention Replacement. arXiv preprint arXiv:2505.21535.
Liu, Y., Zhang, R., Yang, H., Zheng, S., Wang, D., Du, Y., Du, L., & Zhang, S. (2024). T-REX: Mixture-of-Rank-One-Experts with Semantic-aware Intuition for Multi-task Large Language Model Finetuning. arXiv preprint arXiv:2404.08985.
Zhang, R., Cheng, A., Luo, Y., Dai, G., Yang, H., Liu, J., ... & Zhang, S. (2024). Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation. arXiv preprint arXiv:2405.16486.
Ma, Z., Zhou, D., Yeh, C. H., Wang, X. S., Li, X., Yang, H., ... & Feng, J. (2024). Magic-Me: Identity-Specific Video Customized Diffusion. arXiv preprint arXiv:2402.09368.
Zhang, J., Yang, H., & Li, H. (2023). HCE: Improving performance and efficiency with heterogeneously compressed neural network ensemble. arXiv preprint arXiv:2301.07794.
Books and Book Chapters
Li, A., Yang, H., & Chen, Y. (2020). Task-Agnostic Privacy-Preserving Representation Learning via Federated Learning. In Federated Learning (pp. 51-65). Springer, Cham.
Chen, Y., Li, H., & Yang, H. (2023). Computer Engineering Machine Learning and Neural Networks (textbook for Duke ECE 661, in preparation)