Keynote Talks

Full-Stack Deep Learning

Abstract

While much attention and venture investment have focused on neural network accelerator chips and architectures, the software stack that sits above these devices has not received comparable investment or consideration. This is a mistake! In this talk I'll walk through a few applications in computer vision, speech, and natural language, and I will demonstrate how proper orchestration of application constraints, NN model architecture, model optimization, frameworks and libraries, and HW/SW co-design can give 10-100x improvements in latency over straightforward implementations.
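
As one concrete illustration of the model-optimization layer of this stack (my own sketch, not material from the talk), post-training dynamic quantization is a common low-effort way to cut inference latency; the example below assumes PyTorch and uses a small stand-in model:

    # Hypothetical sketch: post-training dynamic quantization in PyTorch,
    # one kind of model optimization that can reduce inference latency.
    import torch
    import torch.nn as nn

    model = nn.Sequential(       # stand-in for a real vision/speech/NLP model
        nn.Linear(512, 512),
        nn.ReLU(),
        nn.Linear(512, 10),
    )

    # Quantize the weights of Linear layers to int8; activations are
    # quantized dynamically at runtime.
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )

    x = torch.randn(1, 512)
    print(quantized(x).shape)    # same interface, lower-latency int8 kernels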

Bio

Kurt Keutzer is Professor of the Graduate School in EECS at UC Berkeley, where he performs his research in the Berkeley AI Research lab. He is also co-director of the Berkeley Deep Drive consortium. Kurt's research focuses on efficient deep learning for both edge devices and the cloud. While Kurt began his career working close to silicon, over the last 40 years he has worked up and down the implementation stack, from applications such as vision for autonomous driving to HW/SW co-design of accelerators for mobile devices.

Repairing Code with Deep Learning

Abstract

While generative models for code completion are currently popular, code construction is just a small part of software development. Code maintenance, in contrast, accounts for a much larger share of development activity. In this talk, I will discuss deep learning models and training methods that find and fix seemingly simple but hard-to-find bugs, specifically bugs where there is a mismatch between the (latent) intent of the developer and the source code. This requires models that reason over highly structured data and over the formal semantics of code. Here, structured deep learning models achieve state-of-the-art performance, and the trained models find previously unknown bugs in open-source projects on GitHub. I will conclude by discussing open challenges and opportunities in this area.
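
To make the bug class concrete, here is a toy example (my illustration, not from the talk) of a variable-misuse bug, where the code is syntactically valid but one identifier contradicts the developer's latent intent:

    # Toy example of a "variable misuse" bug: the function type-checks and
    # runs, yet one identifier does not match what the developer meant.
    def clamp(value, lo, hi):
        if value < lo:
            return lo
        if value > hi:
            return lo   # BUG: should return `hi`; a structured model over
                        # the program graph can learn that `hi` is the
                        # far more likely intent in this position.
        return value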

Bio

Miltos Allamanis is a researcher working at the intersection of machine learning, programming languages, and software engineering. His research combines the rich structure of programming languages with deep learning to create better tools for developers, while using problems in this area to motivate machine learning research. He obtained his PhD from the University of Edinburgh, UK. More information about him and his publications can be found at https://miltos.allamanis.com.

Tensor Query Processing: How to Leverage AI Investments to Accelerate Databases, Classical ML Inference, and More

Abstract

In this talk, we start from the observation that investment in specialized hardware for neural networks is skyrocketing, and that leveraging all these new capabilities for database, classical ML, and other workloads would incur a massive N x M engineering effort. We then focus on tensor computations as a key API supported by many vendors and NN runtime platforms, and show that we can translate (in less than 10k lines of Python code) SQL queries, classical ML inference, and graph algorithms into tensor computations. The results are surprisingly promising, allowing us to match and outperform state-of-the-art CPU and GPU systems. We also demonstrate unparalleled portability, running natively on CPUs, GPUs, APUs, TPUs, and even in browsers and mobile environments. We conclude by showcasing other interesting side advantages of this approach that blend SQL+ML in interesting new ways.
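
As a minimal sketch of the core idea (my illustration in PyTorch, not the actual system's code), a filtered SQL aggregate can be expressed directly as tensor operations, which then run on any device the tensor runtime supports:

    # Illustrative sketch (assumes PyTorch): the filtered aggregate
    #   SELECT SUM(price) FROM orders WHERE quantity > 5
    # expressed as tensor operations.
    import torch

    price = torch.tensor([10.0, 20.0, 30.0, 40.0])
    quantity = torch.tensor([3, 6, 9, 1])

    mask = (quantity > 5).to(price.dtype)   # WHERE clause as a 0/1 mask
    result = (price * mask).sum()           # SUM over the selected rows
    print(result.item())                    # 50.0 (20.0 + 30.0)

    # The same computation runs unchanged on a GPU by moving the tensors
    # there first, e.g. price.cuda() and quantity.cuda().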

Bio

Carlo Curino is the lead of the Gray Systems Lab (GSL), an applied research group working at the intersection of databases, systems, and machine learning. Before this, Carlo was a Principal Scientist in the Cloud and Information Services Lab (CISL), working on large-scale distributed systems with a focus on scheduling for BigData clusters; this line of research was co-developed with several team members and open-sourced as part of Apache Hadoop/YARN, and it enables Microsoft to operate the largest YARN clusters in the world (deployed on 250k+ servers). Prior to joining Microsoft, Carlo was a Research Scientist at Yahoo!, primarily working on entity deduplication at scale and on mobile+cloud platforms. Carlo spent two years as a Postdoctoral Associate at MIT CSAIL, working with Prof. Samuel Madden and Prof. Hari Balakrishnan on relational databases in the cloud. At MIT he also served as the primary lecturer for the databases course CS630, taught in collaboration with Mike Stonebraker. Carlo received a Bachelor's degree in Computer Science from Politecnico di Milano. He participated in a joint project between the University of Illinois at Chicago (UIC) and Politecnico di Milano, obtaining a Master's degree in Computer Science from UIC and the Laurea Specialistica (cum laude) from Politecnico di Milano. During his PhD at Politecnico di Milano, Carlo spent two years as a visiting researcher at UCLA.

Learning Semantic Representations of Hardware Designs

Abstract

We present Design2Vec, a deep representation learning approach that learns the semantics of hardware designs at the Register Transfer Level (RTL). Design2Vec creates an abstraction of the RTL design state space using a combination of representations based on the design graph and the source code. These neural-network-generated design abstractions can be used effectively in different phases of chip design. This is the first work to learn a continuous representation of hardware designs that can be used for a variety of applications, such as design verification, high-level synthesis, efficient simulation, and design understanding.

Here, we apply Design2Vec to two design verification tasks. Design verification is considered the biggest bottleneck in chip design, due to the inherent theoretical complexity of the problem, which results in state-space explosion. Practical industrial methods for design verification are best-effort approaches that take unacceptably long and drain resources. We trained Design2Vec on two tasks: as a proxy simulator and for test generation. In both tasks, Design2Vec performs much better than the state of practice in the DV industry, and orders of magnitude better than black-box ML approaches. Design2Vec scales to real designs like the TPU, and achieves coverage that is hard to reach with current industrial DV approaches.
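
The sketch below is a hypothetical, much-simplified stand-in (my own, in PyTorch; not the actual Design2Vec architecture) for the proxy-simulator task: given a learned embedding of a coverpoint and a test's parameter settings, predict the probability that the test hits the coverpoint:

    # Hypothetical sketch, not the real Design2Vec model: predict whether a
    # test with given knob settings covers a coverpoint, using a learned
    # per-coverpoint embedding as a cheap "proxy simulator".
    import torch
    import torch.nn as nn

    class CoverageProxy(nn.Module):
        def __init__(self, num_coverpoints, num_knobs, dim=64):
            super().__init__()
            # Stand-in for the graph+source design embedding in the paper.
            self.cover_emb = nn.Embedding(num_coverpoints, dim)
            self.head = nn.Sequential(
                nn.Linear(dim + num_knobs, dim), nn.ReLU(), nn.Linear(dim, 1)
            )

        def forward(self, coverpoint_ids, knobs):
            z = torch.cat([self.cover_emb(coverpoint_ids), knobs], dim=-1)
            return torch.sigmoid(self.head(z))  # P(coverpoint is hit)

    model = CoverageProxy(num_coverpoints=1000, num_knobs=16)
    p = model(torch.tensor([42]), torch.randn(1, 16))
    print(p.shape)  # torch.Size([1, 1])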

Bio

Shobha Vasudevan is a research scientist at Google Research, involved in strategy, roadmap, research, and productization of AI/ML for systems. From 2019 to 2022, she was a research scientist on the Brain team. Prior to joining Google, Shobha was a tenured professor in the ECE and CS departments at the University of Illinois at Urbana-Champaign. Shobha is a recipient of the NSF CAREER award, the ACM SIGDA Outstanding Faculty Award, the IEEE Early Career Award, the UIUC Dean's Award for Excellence in Research, IBM faculty awards, Google faculty awards, and several best paper awards. She serves on several IEEE standards boards, technical program committees, and ACM/IEEE journal editorial boards. She volunteers with local school districts to develop K-5 STEM modules.


Using Machine Learning to Automate Compiler Design

Abstract

Developing an optimizing compiler is a highly skilled and arduous process, and there is inevitably a software delay whenever a new processor is designed. It often takes several compiler generations to start effectively exploiting a processor's potential, by which time a new processor appears and the process starts again. This never-ending game of catch-up means that we rarely fully exploit a shipped processor, and it inevitably delays time to market. As the pace of hardware evolution accelerates, the problem only worsens. This talk will look at research on using machine learning to automate the design of compiler optimization heuristics and will outline some of the challenges ahead.
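
As a minimal illustration of the approach (my sketch, assuming scikit-learn; not material from the talk), a learned heuristic can replace a hand-written one by training a simple model on measured outcomes, for example predicting a loop-unroll factor from loop features:

    # Illustrative sketch (assumes scikit-learn): replace a hand-written
    # loop-unrolling heuristic with a model trained on measured best factors.
    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical features per loop: [trip_count, body_size_in_ir_ops, has_branch]
    X = [
        [1000, 4, 0],
        [8,   40, 1],
        [256, 12, 0],
        [16,  90, 1],
    ]
    y = [8, 1, 4, 1]  # unroll factor that performed best when measured

    heuristic = DecisionTreeClassifier(max_depth=3).fit(X, y)
    print(heuristic.predict([[512, 10, 0]]))  # predicted unroll factor

When a new processor ships, only the training data needs to be regathered; the model is retrained automatically rather than having its heuristic re-tuned by hand.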

Bio

Zheng Wang is an associate professor in the School of Computing at the University of Leeds. He has been working on compiler-based code optimization for over 15 years and has a unique background at the intersection of machine learning (ML) and compilers. He is known for his work on incorporating machine learning into compilation technology. He has published over 100 papers and received four best paper awards. His research has been successfully transferred into various industry settings.