Systems & Machine Learning

[IISWC 2022] FedGPO: Characterizing and Designing for Efficient Federated Learning using Heterogeneity-Aware Global Parameter Optimization

Young Geun Kim and Carole-Jean Wu.

In Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2022.

[RecSys 2022] Towards Fair Federated Recommendation Learning: Characterizing the Inter-Dependence of System and Data Heterogeneity

Kiwan Maeng, Haiyu Lu, Luca Melis, John Nguyen, Mike Rabbat, Carole-Jean Wu.

In Proceedings of the ACM Conference on Recommender Systems (RecSys), 2022.

Best Paper Award Finalist

[ISCA 2022] Understanding Data Storage and Ingestion for Large-Scale Deep Recommendation Model Training

Mark Zhao, Niket Agarwal, Aarti Basant, Bugra Gedik, Satadru Pan, Mustafa Ozdal, Rakesh Komuravelli, Jerry Pan, Tianshu Bao, Haowei Lu, Sundaram Narayanan, Jack Langman, Kevin Wilfong, Harsha Rastogi, Carole-Jean Wu, Christos Kozyrakis, Parik Pol.

In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), 2022.

[MLSys 2022] Sustainable AI: Environmental Implications, Challenges and Opportunities

Carole-Jean Wu, Ramya Raghavendra, Udit Gupta, Bilge Acun, Newsha Ardalani, Kiwan Maeng, Fiona Aga Behram, James Huang, Charles Bai, Michael Gschwind, Anurag Gupta, Myle Ott, Anastasia Melnikov, Salvatore Candido, David Brooks, Geeta Chauhan, Benjamin Lee, Hsien-Hsin S. Lee, Bugra Akyildiz, Maximilian Balandat, Joe Spisak, Ravi Jain, Mike Rabbat, Kim Hazelwood.

In Proceedings of the Conference on Machine Learning and Systems (MLSys), 2022.

[MLSys 2022] Papaya: Practical, Private, and Scalable Federated Learning

Dzmitry Huba, John Nguyen, Kshitiz Malik, Ruiyu Zhu, Mike Rabbat, Ashkan Yousefpour, Carole-Jean Wu, Hongyuan Zhan, Pavel Ustinov, Harish Srinivas, Kaikai Wang, Anthony Shoumikhin, Jesik Min, Mani Malek.

In Proceedings of the Conference on Machine Learning and Systems (MLSys), 2022.

[ASPLOS 2022] RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

Geet Sethi, Bilge Acun, Niket Agarwal, Christos Kozyrakis, Caroline Trippel, Carole-Jean Wu.

In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.

[HPCA 2022] Hercules: Heterogeneity-Aware Inference Serving for At-Scale Personalized Recommendation

Liu Ke, Udit Gupta, Mark Hempstead, Carole-Jean Wu, Hsien-Hsin Lee, Xuan Zhang.

In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2022.

[HPCA 2022] SecNDP: Secure Near-Data Processing with Untrusted Memory

W. Xiong, L. Ke, D. Jankov, M. Kounavis, X. Wang, E. Northup, A. Wang, B. Acun, C.-J. Wu, P. Tang, E. Suh, X. Zhang, H.-S. Lee.

In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2022.

[WSDM 2022] On Sampling Collaborative Filtering Datasets

Noveen Sachdeva, Carole-Jean Wu, and Julian McAuley.

In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2022.

[ASPLOS 2021] RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference

M. Wilkening, U. Gupta, S. Hsia, C. Trippel, C.-J. Wu, D. Brooks, G.-Y. Wei.

In Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021.

[HPCA 2021] Understanding Training Efficiency of Deep Learning Recommendation Models at Scale

B. Acun, M. Murphy, X. Wang, J. Nie, C.-J. Wu, and K. Hazelwood.

In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), 2021.

[MLSys 2021] TT-Rec: Tensor Train Compression for Deep Learning Recommendation Model Embeddings

C. Yin, B. Acun, X. Liu, and C.-J. Wu.

In Proceedings of the Conference on Machine Learning and Systems (MLSys), 2021.

MLSys Outstanding Paper Award

[MLSys 2021] CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery

K. Maeng, S. Bharuka, I. Gao, M. Jeffrey, V. Saraph, B.-Y. Su, C. Trippel, J. Yang, M. Rabbat, B. Lucia, and C.-J. Wu.

In Proceedings of the Conference on Machine Learning and Systems (MLSys), 2021.

[MICRO 2021] AutoFL: Enabling Heterogeneity-Aware Energy Efficient Federated Learning

Y. Kim and C.-J. Wu.

In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), 2021.

[MICRO 2021] RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

U. Gupta, S. Hsia, J. Zhang, M. Wilkening, J. Pombra, H.-S. Lee, G. Wei, C.-J. Wu, and D. Brooks.

In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), 2021.

[MICRO 2020] AutoScale: Energy Efficiency Optimization for Stochastic Edge Inference Using Reinforcement Learning

Young Geun Kim and Carole-Jean Wu.

In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), Athens, Greece, October 2020.

[HPCA 2020] The Architectural Implications of Facebook’s DNN-based Personalized Recommendation

U. Gupta, C.-J. Wu, X. Wang, M. Naumov, B. Reagen, D. Brooks, B. Cottel, K. Hazelwood, M. Hempstead, B. Jia, H.-H. Lee, A. Malevich, D. Mudigere, M. Smelyanskiy, L. Xiong, X. Zhang.

In Proceedings of the IEEE International Symposium on High Performance Computer Architecture (HPCA), San Diego CA, 2020.

[ISCA 2020] DeepRecSys: A System for Optimizing End-to-end At-scale Neural Recommendation Inference

U. Gupta, S. Hsia, V. Saraph, X. Wang, B. Reagen, G.-Y. Wei, H.-S. Lee, D. Brooks, and C.-J. Wu.

In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), Valencia, Spain, 2020.

[ISCA 2020] RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing

L. Ke, U. Gupta, B. Cho, D. Brooks, V. Chandra, U. Diril, A. Firoozshahian, K. Hazelwood, B. Jia, H.-S. Lee, M. Li, B. Maher, D. Mudigere, M. Naumov, M. Schatz, M. Smelyanskiy, X. Wang, B. Reagen, C.-J. Wu, M. Hempstead, X. Zhang.

In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), Valencia, Spain, 2020.

[ArXiv 2020] Developing a Recommendation Benchmark for MLPerf Training and Inference

C.-J. Wu, R. Burke, E. Chi, J. Konstan, J. McAuley, Y. Raimond, H. Zhang.

In CoRR abs/2003.07336.

[IEEE Micro 2020] MLPerf: An Industry Standard Benchmark Suite for Machine Learning Performance

P. Mattson, V. Reddi, C. Cheng, C. Coleman, G. Diamos, D. Kanter, P. Micikevicius, D. Patterson, G. Schmuelling, H. Tang, G.-Y. Wei, C.-J. Wu.

In IEEE Micro, 2020.

[MLSys 2020] MLPerf Training Benchmark

P. Mattson, C. Cheng, C. Coleman, G. Diamos, P. Micikevicius, D. Patterson, H. Tang, G.-Y. Wei, P. Ballis, V. Bittorf, D. Brooks, D. Chen, D. Dutta, U. Gupta, K. Hazelwood, A. Hock, X. Huang, B. Jia, D. Kang, N. Kumar, J. Liao, G. Ma, D. Narayanan, T. Oguntebi, G. Pekhimenko, L. Pentecost, V. Reddi, T. Robie, T. St. John, C.-J. Wu, L. Xu, C. Young, M. Zaharia.

In Proceedings of the Conference on Machine Learning and Systems (MLSys), Austin TX, 2020.

[ISCA 2020] MLPerf Inference Benchmark

V. Reddi, C. Cheng, D. Kanter, P. Mattson, G. Schmuelling, C.-J. Wu, B. Anderson, M. Breughe, M. Charlebois, W. Chou, R. Chukka, C. Coleman, S. Davis, P. Deng, G. Diamos, J. Duke, D. Fick, J. Gardner, I. Hubara, S. Idgunji, T. Jablin, J. Jiao, T. St. John, P. Kanwar, D. Lee, J. Liao, A. Lokhmotov, F. Massa, P. Meng, P. Micikevicius, C. Osborne, G. Pekhimenko, A. Rajan, D. Sequeira, A. Sirasao, F. Sun, H. Tang, M. Thomson, F. Wei, E. Wu, L. Xu, K. Yamada, B. Yu, G. Yuan, A. Zhong, P. Zhang, Y. Zhou.

In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), Valencia, Spain, 2020.

[IISWC 2020] Cross-Stack Workload Characterization of Deep Recommendation Systems

S. Hsia, U. Gupta, M. Wilkening, C.-J. Wu, G.-Y. Wei, and D. Brooks.

In Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2020.

[TACO 2020] Exploiting Parallelism Opportunities with Deep Learning Frameworks

Y. Wang, C.-J. Wu, X. Wang, K. Hazelwood, and D. Brooks.

In ACM Transactions on Architecture and Code Optimization (TACO), 2020.

[HPCA 2019] Machine Learning at Facebook: Understanding Inference at the Edge

Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, Tommer Leyvand, Hao Lu, Yang Lu, Lin Qiao, Brandon Reagen, Joe Spisak, Fei Sun, Andrew Tulloch, Peter Vajda, Xiaodong Wang, Yanghan Wang, Bram Wasti, Yiming Wu, Ran Xian, Sungjoo Yoo, Peizhao Zhang.

In Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture (HPCA), Washington DC, USA, 2019.