Keynote Talks

Putting the “Machine” Back in Machine Learning: The Case for System-Aware ML Design


Abstract

Machine learning (ML) applications have entered and impacted our lives unlike any other recent technological advance. While the holy grail for judging the quality of an ML model has largely been serving accuracy, and only recently its resource usage, neither of these metrics translates directly to energy efficiency, runtime, or mobile device battery lifetime. This talk uncovers the need for accurate, platform‐specific power and latency models for convolutional neural networks (CNNs) and for efficient hardware-aware CNN design methodologies, allowing machine learners and hardware designers to identify not just the best-accuracy NN configuration, but also those that satisfy given hardware constraints. Our proposed modeling framework is applicable to both high‐end and mobile platforms and achieves 88.24% accuracy for latency, 88.34% for power, and 97.21% for energy prediction. We also demonstrate a novel differentiable neural architecture search (NAS) framework, dubbed Single-Path NAS, that achieves state-of-the-art top-1 ImageNet accuracy (75.62%), outperforming existing mobile NAS methods under similar latency constraints (∼80ms), and finds the final configuration up to 5,000× faster than prior work. Combined with our characterization, modeling, and analysis of non-volatile storage technology, such a framework is poised to enable true co-design of hardware and ML models, orders of magnitude faster than the state of the art, while satisfying accuracy as well as latency or energy constraints.
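The constraint-driven selection step described above can be sketched in a few lines. This is a hypothetical illustration only: the candidate configurations, their predicted accuracy, latency, and energy values, and the budget numbers are all invented, and a real methodology would obtain the predictions from learned, platform-specific models.

```python
# Illustrative sketch: pick the best-accuracy CNN configuration that satisfies
# platform-specific latency and energy budgets. All names and numbers below
# are invented for illustration; real values would come from learned
# power/latency predictors for the target platform.

def select_config(candidates, latency_budget_ms, energy_budget_mj):
    """Return the highest-accuracy candidate meeting both budgets, or None."""
    feasible = [c for c in candidates
                if c["latency_ms"] <= latency_budget_ms
                and c["energy_mj"] <= energy_budget_mj]
    return max(feasible, key=lambda c: c["accuracy"]) if feasible else None

candidates = [
    {"name": "cnn-small",  "accuracy": 0.71, "latency_ms":  45.0, "energy_mj": 12.0},
    {"name": "cnn-medium", "accuracy": 0.75, "latency_ms":  78.0, "energy_mj": 20.0},
    {"name": "cnn-large",  "accuracy": 0.78, "latency_ms": 130.0, "energy_mj": 35.0},
]

best = select_config(candidates, latency_budget_ms=80.0, energy_budget_mj=25.0)
print(best["name"])  # the medium model is the most accurate one under an ~80 ms budget
```

The point of cheap, accurate predictors is precisely that this filtering can run over thousands of candidates without measuring each one on hardware.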

Bio

Diana Marculescu is Department Chair, Cockrell Family Chair for Engineering Leadership #5, and Professor, Motorola Regents Chair in Electrical and Computer Engineering #2, at the University of Texas at Austin. Before joining UT Austin in December 2019, she was the David Edward Schramm Professor of Electrical and Computer Engineering and the Founding Director of the College of Engineering Center for Faculty Success (2015-2019), and served as Associate Department Head for Academic Affairs in Electrical and Computer Engineering (2014-2018), all at Carnegie Mellon University. She received the Dipl.Ing. degree in computer science from the Polytechnic University of Bucharest, Bucharest, Romania (1991), and the Ph.D. degree in computer engineering from the University of Southern California, Los Angeles, CA (1998). Her research interests include energy- and reliability-aware computing, hardware-aware machine learning, and computing for sustainability and natural science applications. Diana was a recipient of the National Science Foundation Faculty Career Award (2000-2004), the ACM SIGDA Technical Leadership Award (2003), the Carnegie Institute of Technology George Tallman Ladd Research Award (2004), and several best paper awards. She was an IEEE Circuits and Systems Society Distinguished Lecturer (2004-2005) and the Chair of the Association for Computing Machinery (ACM) Special Interest Group on Design Automation (2005-2009). Diana chaired several conferences and symposia in her area and is currently an Associate Editor for IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. She was selected as an ELATE Fellow (2013-2014), and is the recipient of an Australian Research Council Future Fellowship (2013-2017), the Marie R. Pistilli Women in EDA Achievement Award (2014), and the Barbara Lazarus Award from Carnegie Mellon University (2018). Diana is a Fellow of ACM and IEEE.

Self-aware Computing: Combining Learning and Control to Manage Complex, Dynamic Systems


Abstract

Modern computing systems must meet multiple, often conflicting, goals; e.g., high performance and low energy consumption. The current state of practice involves ad hoc, heuristic solutions to such system management problems that offer no formally verifiable behavior and must be rewritten or redesigned wholesale as new computing platforms and constraints evolve. In this talk, I will discuss my research on building self-aware computing systems that address computing system goals and constraints in a fundamental way, starting with rigorous mathematical models and ending with real software and hardware implementations that have formally analyzable behavior and can be repurposed to address new problems as they emerge. These self-aware systems are distinguished by awareness of user goals and operating environment; they continuously monitor themselves and adapt their behavior and foundational models to ensure the goals are met despite the challenges of complexity (diverse hardware resources to be managed) and dynamics (unpredictable changes in input workload or resource availability). In this talk, I will describe how to build self-aware systems through a combination of control-theoretic and machine learning techniques. I will then show how this combination enables new capabilities, like increasing system robustness, reducing application energy, and meeting latency requirements even with no prior knowledge of the application.
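The monitor-and-adapt loop at the heart of such systems can be sketched as a simple feedback controller. This is a minimal sketch under assumed conditions: the linear "plant" model (latency inversely proportional to a speedup knob), the gain, and the target are invented for illustration, whereas a real self-aware system would learn and update its models of the application online.

```python
# Minimal sketch of the control-theoretic half of a self-aware system:
# an integral controller adjusts a speedup "knob" (e.g., allocated cores
# or clock frequency) until measured latency converges to a target.
# The plant model and gain below are assumptions for illustration.

def integral_controller(measure, target, knob=1.0, gain=0.05, steps=50):
    """Drive measured latency toward `target` by accumulating error into the knob."""
    for _ in range(steps):
        error = measure(knob) - target   # positive error: too slow, add resources
        knob += gain * error             # integral action accumulates the error
        knob = max(knob, 0.1)            # keep the actuator in a valid range
    return knob

# Assumed plant: latency (ms) inversely proportional to the speedup knob.
plant = lambda knob: 100.0 / knob

final_knob = integral_controller(plant, target=25.0)
print(round(plant(final_knob), 1))  # settles near the 25 ms target
```

Because the controller only reacts to the measured error, the same loop keeps the latency goal even if the plant changes at runtime; the learning component's job is then to supply and refresh the model that sets the gain.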

Bio

Henry Hoffmann is an Associate Professor in the Department of Computer Science at the University of Chicago. He was granted early tenure in 2018. At Chicago he leads the Self-aware Computing group (or SEEC project) and conducts research on adaptive techniques for power, energy, accuracy, and performance management in computing systems. He received the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2019 and the DOE Early Career Award in 2015. He completed a PhD in Electrical Engineering and Computer Science at MIT, where his research on self-aware computing was named one of the ten "World Changing Ideas" by Scientific American in December 2011. He received his SM degree in Electrical Engineering and Computer Science from MIT in 2003. As a master's student he worked on MIT's Raw processor, one of the first multicores. Along with other members of the Raw team, he spent several years at Tilera Corporation, a startup that commercialized the Raw architecture and created one of the first manycores (Tilera was sold for $130M in 2014). His implementation of the BDTI Communications Benchmark (OFDM) on Tilera's 64-core TILE64 processor still has the highest certified performance of any programmable processor. In 1999, he received his BS in Mathematical Sciences with highest honors and highest distinction from UNC Chapel Hill.

Engineering Systems that Learn


Abstract

There’s a new ecosystem of applications that integrates machine learning into a variety of tasks. Typical domains have included image recognition and natural language processing. However, these techniques have also spread to computer systems domains, such as program compilation, resource scheduling, and database query optimization, yielding new computer systems that learn from data to achieve their goals. With the success of these systems, we must grapple with the reality that they model and compute with objects that are inherently approximate — real numbers (only computable up to a given precision), neural networks (only validated on a given dataset), and probabilistic computations (results only computable up to a given probability). This reality presents many engineering questions about interpreting, debugging, validating, verifying, and optimizing these systems. As an illustrative example of such a system, I'll present Ithemal, our deep learning system for performance modeling of modern computer processors. Using data and simple models, our system predicts the performance of assembly code on modern Intel CPUs better than state-of-the-art, handcrafted techniques from LLVM and Intel. Guided by Ithemal’s engineering challenges, I’ll present our work on reasoning about the semantics and performance of such a system. In particular, I’ll present our results on the semantics of sound real-valued, differentiable, probabilistic computation, which is the core computational model behind these systems. I'll also present our work on the Lottery Ticket Hypothesis, a set of techniques for producing small trainable neural networks that are 10-20% of the size of standard architectures. The promise of this latter work is not only faster inference and training, but also smaller neural networks that are more amenable to reasoning, such as verifying their robustness.
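The core pruning step behind the Lottery Ticket Hypothesis can be sketched on a toy weight vector. This is a hedged illustration, not the authors' implementation: the weights are invented, and real networks are pruned iteratively, layer by layer, with the surviving weights rewound to their original initialization and retrained.

```python
# Illustrative sketch of one-shot magnitude pruning, the operation at the
# heart of lottery-ticket experiments: keep only the largest-magnitude
# fraction of weights and mask out the rest, yielding a sparse subnetwork
# (e.g., 10-20% of the original size). The weight vector is invented.

def magnitude_prune(weights, keep_fraction):
    """Return a 0/1 mask keeping the top `keep_fraction` of weights by magnitude."""
    k = max(1, int(len(weights) * keep_fraction))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [1 if abs(w) >= threshold else 0 for w in weights]

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.08, 0.6, 0.1]
mask = magnitude_prune(weights, keep_fraction=0.2)   # keep ~20% of the weights
print(mask)  # → [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

The surprising empirical claim is that subnetworks found this way, when rewound and retrained, can match the accuracy of the dense network they came from.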

Bio

Michael Carbin is an Assistant Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. At MIT, he leads the Programming Systems Group, whose primary research focus is the design of programming systems for approximate computations: computations whose results are only computed up to a given precision or probability. Typical goals for his work include improved reliability, performance, energy consumption, and resilience for computer systems. Michael has received an NSF CAREER Award, a Sloan Foundation Research Fellowship, and faculty awards from Google and Facebook. His work has received best paper awards at OOPSLA, ICLR, and ICFP, as well as a CACM Research Highlight. Michael received a B.S. in Computer Science from Stanford University in 2006, and an S.M. and Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2009 and 2015, respectively.