Carole-Jean Wu is a Research Scientist at Facebook AI Research. Her research focuses on computer system architecture, with particular emphasis on energy- and memory-efficient systems. Her recent work has pivoted to designing systems for machine learning execution at scale, such as personalized recommender systems and mobile deployment. More broadly, she is interested in tackling system challenges to enable efficient, responsible AI execution. Carole-Jean chairs the MLPerf Recommendation Benchmark Advisory Board, co-chaired MLPerf Inference, and serves on the MLCommons Board as a director.
Carole-Jean holds tenure from ASU (Associate Professor). She received her M.A. and Ph.D. from Princeton and her B.Sc. from Cornell. She is the recipient of the NSF CAREER Award, the Facebook AI Infrastructure Mentorship Award, the IEEE Young Engineer of the Year Award, the Science Foundation Arizona Bisgrove Early Career Scholarship, and the Intel PhD Fellowship, as well as several Best Paper awards.
System design and optimization for machine learning
High-performance and energy-efficient heterogeneous CPU+GPU systems
Performance quality modeling and energy efficiency optimization for mobile systems
Memory system optimization
Energy harvesting and temperature-aware management for portable electronics
Honors and Awards
2021 IEEE Micro Top Picks
2021 IEEE Micro Top Picks Honorable Mention
2020 Distinction of Honorable Mention of the CRA Anita Borg Early Career Award
2019 Facebook AI Infrastructure Mentorship Award
2019 ACM/IEEE ICSE Genetic Improvement on Software Best Paper Award
2018 IEEE ITHERM Best Paper Award
2017 NSF CAREER Award
2017 IEEE Young Engineer of the Year Award
2015 IEEE Best of Computer Architecture Letters
2013 SFAz Bisgrove CAREER Award
2011 Intel PhD Fellowship
2011 IEEE ISPASS Best Paper Nomination
2009 Princeton Excellence in Leadership Award
2006 Princeton PhD Fellowship
Industry Initiatives and Open Source Software
Embedded Vision Summit [Slide Deck][Talk]
FBTT-Embedding: Tensor Train Compression Library for Embeddings
CLEAR: Computing Landscapes for Environmental Accountability and Responsibility
PERSONAL: Personalized Recommendation Systems and Algorithms
DeepRecSys: A System for Optimizing End-To-End At-scale Neural Recommendation Inference
AutoScale: Energy Efficiency Optimization of Stochastic Edge Inference Using Reinforcement Learning
GEVO: Genetic Improvement of GPU Code
DORA: Optimizing Smartphone Energy Efficiency and Web Browser Performance under Interference
MobileBench: Performance, Energy Characterizations and Architectural Implications of an Emerging Mobile Platform Benchmark Suite
Conference Steering Committee
IEEE Intl. Symp. on Performance Analysis of Systems and Software (ISPASS), 2018-23.
Conference Program Chair
Conference Program Committee
HPCA 2020; 2014-17
IISWC 2019; 2013-17
MICRO 2014; 2016-17
Journal Editorial Board
IEEE Micro 2019-21
IEEE Computer Architecture Letters 2019-21
[MLSys-2021] "TT-Rec: Tensor Train Compression for Deep Learning Recommendation Model Embeddings,"
C. Yin, B. Acun, X. Liu, and C.-J. Wu.
MLSys Outstanding Paper Award
[MLSys-2021] "CPR: Understanding and Improving Failure Tolerant Training for Deep Learning Recommendation with Partial Recovery,"
K. Maeng, S. Bharuka, I. Gao, M. Jeffrey, V. Saraph, B.-Y. Su, C. Trippel, J. Yang, M. Rabbat, B. Lucia, and C.-J. Wu.
[ASPLOS-2021] "RecSSD: Near Data Processing for Solid State Drive Based Recommendation Inference,"
M. Wilkening, U. Gupta, S. Hsia, C. Trippel, C.-J. Wu, D. Brooks, G.-Y. Wei.
[HPCA-2021] "Chasing Carbon: The Elusive Environmental Footprint of Computing,"
U. Gupta, Y. Kim, S. Lee, J. Tse, H.-H. Lee, G.-Y. Wei, D. Brooks, and C.-J. Wu.
[HPCA-2021] "Understanding Training Efficiency of Deep Learning Recommendation Models at Scale,"
B. Acun, M. Murphy, X. Wang, J. Nie, C.-J. Wu, and K. Hazelwood.
[MICRO-2020] "AutoScale: Energy Efficiency Optimization for Stochastic Edge Inference Using Reinforcement Learning,"
Y. Kim and C.-J. Wu.
[ISCA-2020] “DeepRecSys: A System for Optimizing End-to-end At-scale Neural Recommendation Inference,”
U. Gupta, S. Hsia, V. Saraph, X. Wang, B. Reagen, G.-Y. Wei, H.-S. Lee, D. Brooks, and C.-J. Wu.
IEEE Micro Top Picks Honorable Mention
[ISCA-2020] “RecNMP: Accelerating Personalized Recommendation with Near-Memory Processing,”
L. Ke, U. Gupta, B. Cho, D. Brooks, V. Chandra, U. Diril, A. Firoozshahian, K. Hazelwood, B. Jia, H.-S. Lee, M. Li, B. Maher, D. Mudigere, M. Naumov, M. Schatz, M. Smelyanskiy, X. Wang, B. Reagen, C.-J. Wu, M. Hempstead, X. Zhang.
[ISCA-2020] “MLPerf Inference Benchmark,”
V. Reddi, C. Cheng, D. Kanter, P. Mattson, G. Schmuelling, C.-J. Wu, B. Anderson, M. Breughe, M. Charlebois, W. Chou, R. Chukka, C. Coleman, S. Davis, P. Deng, G. Diamos, J. Duke, D. Fick, J. Gardner, I. Hubara, S. Idgunji, T. Jablin, J. Jiao, T. St. John, P. Kanwar, D. Lee, J. Liao, A. Lokhmotov, F. Massa, P. Meng, P. Micikevicius, C. Osborne, G. Pekhimenko, A. Rajan, D. Sequeira, A. Sirasao, F. Sun, H. Tang, M. Thomson, F. Wei, E. Wu, L. Xu, K. Yamada, B. Yu, G. Yuan, A. Zhong, P. Zhang, Y. Zhou.
IEEE Micro Top Picks
[MLSys-2020] “MLPerf Training Benchmark,”
P. Mattson, C. Cheng, C. Coleman, G. Diamos, P. Micikevicius, D. Patterson, H. Tang, G.-Y. Wei, P. Ballis, V. Bittorf, D. Brooks, D. Chen, D. Dutta, U. Gupta, K. Hazelwood, A. Hock, X. Huang, B. Jia, D. Kang, N. Kumar, J. Liao, G. Ma, D. Narayanan, T. Oguntebi, G. Pekhimenko, L. Pentecost, V. Reddi, T. Robie, T. St. John, C.-J. Wu, L. Xu, C. Young, M. Zaharia.
[HPCA-2020] “The Architectural Implications of Facebook’s DNN-based Personalized Recommendation,”
U. Gupta, C.-J. Wu, X. Wang, M. Naumov, B. Reagen, D. Brooks, B. Cottel, K. Hazelwood, M. Hempstead, B. Jia, H.-H. Lee, A. Malevich, D. Mudigere, M. Smelyanskiy, L. Xiong, X. Zhang.
[HPCA-2019] “Machine Learning at Facebook: Understanding Inference at the Edge,”
Carole-Jean Wu, David Brooks, Kevin Chen, Douglas Chen, Sy Choudhury, Marat Dukhan, Kim Hazelwood, Eldad Isaac, Yangqing Jia, Bill Jia, Tommer Leyvand, Hao Lu, Yang Lu, Lin Qiao, Brandon Reagen, Joe Spisak, Fei Sun, Andrew Tulloch, Peter Vajda, Xiaodong Wang, Yanghan Wang, Bram Wasti, Yiming Wu, Ran Xian, Sungjoo Yoo, Peizhao Zhang.
[HPCA-2019] “Understanding the Future of Energy Efficiency in Multi-Module GPUs,”
Akhil Arunkumar, Evgeny Bolotin, David Nellans, and Carole-Jean Wu.
[HPCA-2018] “LATTE-CC: Latency Tolerance Aware Adaptive Cache Compression Management for Energy Efficient GPUs,”
Akhil Arunkumar, Shin-Ying Lee, Vignesh Soundararajan, and Carole-Jean Wu.
[ISCA-2017] “MCM-GPU: Multi-Chip-Module GPUs for Continued Performance Scalability,”
Akhil Arunkumar, Evgeny Bolotin, Benjamin Cho, Ugljesa Milic, Eiman Ebrahimi, Oreste Villa, Aamer Jaleel, Carole-Jean Wu, and David Nellans.
[HPCA-2016] “Improving Smartphone/Mobile User Experience by Balancing Performance and Energy with Probabilistic QoS Guarantee,”
Benjamin Gaudette, Carole-Jean Wu, and Sarma Vrudhula.
[ISCA-2015] “CAWA: Coordinated Warp Scheduling and Cache Prioritization for Critical Warp Acceleration for GPGPU Workloads,”
Shin-Ying Lee, Akhil Arunkumar, and Carole-Jean Wu.
[PACT-2014] “CAWS: Criticality-Aware Warp Scheduling for GPGPU Workloads,”
Shin-Ying Lee and Carole-Jean Wu.
[MICRO-2011] “PACMan: Prefetch-Aware Cache Management for High Performance Caching,”
Carole-Jean Wu, Aamer Jaleel, Will Hasenplaugh, Margaret Martonosi, Simon Steely Jr., and Joel Emer.
[MICRO-2011] “SHiP: Signature-Based Hit Predictor for High Performance Caching,”
Carole-Jean Wu, Aamer Jaleel, Margaret Martonosi, Simon Steely Jr., and Joel Emer.
Mentorship of Interns and Students
Udit Gupta (2018 — present; PhD candidate; Harvard University)
Kiwan Maeng (2020; PhD candidate; CMU)
Chunxing Yin [with Bilge Acun] (2020; PhD candidate; Georgia Tech)
Mike Lui (2019-20; PhD candidate; Drexel University)
Emma Yu Wang [with Xiaodong Wang] (2019; PhD, Harvard University)
Jhe-Yu Liou (PhD candidate; 2015 — present)
Young-Geun Kim (Post-doctoral Researcher, 2019 — 2020) [First employment: Soongsil University]
Akhil Arunkumar (PhD 2018) [First employment: Samsung Austin R&D Center; AMD] [Memory Subsystem Optimization Techniques for Modern High-Performance General-Purpose Processors]
Viraj Wadhwa (High school intern from BASIS Chandler Primary, 2017-18. Now an undergraduate student at UT-Austin) [Improving Image Recognition with Tensor Flow API for Autonomous Driving]
TJ Smith (Research Experience for Undergraduates (REU) from Princeton EE; 2017)
Katherine Hann (High school intern from Xavier College Preparatory High School, 2017. Now an undergraduate student at University of Pennsylvania) [Designing A Paired Robotic Car Indoor Navigation and Tracking System]
Rashmi Athavale (High school intern from Hamilton High School, 2017. Now an undergraduate student at Georgia Tech) [Designing A Paired Robotic Car Indoor Navigation and Tracking System]
Benjamin Gaudette (PhD 2017; co-advised with Prof. Sarma Vrudhula) [First employment: Benchmark Electronics; Intel] [An Intelligent Framework for Energy-aware Mobile Computing Subject to Stochastic System Dynamics]
Ying-Ju Yu (Post-doctoral Researcher, 2016-17) [First employment: Intel]
Shin-Ying Lee (PhD 2017) [First employment: Samsung Austin R&D Center; AMD] [Intelligent Scheduling and Memory Management Techniques For Modern GPU Architectures]
Received the Outstanding Computer Engineering PhD Graduate Student Award
Kody Stribrny (BS 2017; co-advised with Prof. Sarma Vrudhula) [First employment: Amazon] [Honors Thesis: Mobile Waterway Monitor]
Davesh Shingari (MS 2016) [First employment: Marvell] [Memory Interference Characterization and Mitigation for Heterogeneous Smartphones]
Soochan Lee (PhD 2015; co-advised with Prof. Patrick E. Phelan) [First employment: LG Electronics] [A Study of Latent Heat of Vaporization in Aqueous Nanofluids]
Ryan Brazones (BS 2014) [First employment: Intel]
Dhinakaran Pandiyan (MS 2014) [First employment: Intel] [Data Movement Energy Characterization of Emerging Smartphone Workloads for Mobile Platforms]
Received the Outstanding Computer Engineering MS Graduate Student Award
Amrit Panda (PhD 2014; co-advised with Prof. Karam S. Chatha) [First employment: Qualcomm Research; Microsoft] [StreamWorks: An Energy-efficient Embedded Co-processor for Stream Computing]