Workshop Program

9:00 AM - 9:15 AM Opening Remarks

Deniz Altınbüken, Google; Lyric P. Doshi, Google; Martin Maas, Google; Milad Hashemi, Google

9:15 AM - 10:00 AM Invited Speaker

Towards ML-augmented Database Systems

Carsten Binnig, TU Darmstadt

Abstract: Database Management Systems (DBMSs) are the backbone for managing large volumes of data efficiently in the cloud. To provide high performance, many of the most complex DBMS components, such as query optimizers or schedulers, must solve non-trivial problems. To tackle such problems, recent work has outlined a new direction of so-called learned DBMSs, in which core DBMS components are replaced by machine learning (ML) models that have been shown to provide significant performance benefits. However, a major drawback of current approaches to learned DBMS components is that training an ML model to replace a DBMS component not only causes very high overhead, but that this overhead occurs repeatedly, which renders these approaches far from practical. In the first part of the talk, I will present our vision of zero-shot learned DBMSs to tackle these issues. In the second part, I will outline very recent work on ML-augmented DBMSs that extend DBMSs with new capabilities such as seamless querying of multimodal data composed of tables, text, and images.

Bio: Carsten Binnig is a Full Professor in the Computer Science department at TU Darmstadt and a Visiting Researcher at the Google Systems Research Group. Carsten received his Ph.D. at the University of Heidelberg in 2008. Afterwards, he spent time as a postdoctoral researcher in the Systems Group at ETH Zurich and at SAP, working on in-memory databases. Currently, his research focus is on the design of scalable data systems on modern hardware as well as machine learning for scalable data systems. He has received a Google Faculty Award as well as multiple best paper and best demo awards.

10:00 AM - 10:30 AM Break

10:30 AM - 11:15 AM Invited Speaker

Architecture 2.0: Why Architects Need a Data-centric AI Gymnasium

Vijay Janapa Reddi, Harvard University

Abstract: Machine learning (ML) is revolutionizing the design of computer systems and architectures, but several challenges hinder the scalability, reproducibility, fairness, and comparability of ML-based research in this domain. This talk delves into the major challenges and emphasizes the necessity of establishing a shared ecosystem for ML-aided systems and architecture research. Such an ecosystem would provide researchers with access to public datasets, models, and a unified platform for result sharing and comparison. By facilitating resource sharing, the ecosystem would enhance fairness and reproducibility in ML systems research, enabling easy replication of work. Additionally, the shared ecosystem would aid in establishing baselines for ML systems research by offering standardized tasks and metrics (benchmarks) for performance comparison. To advance this vision, the talk introduces Arch Gym, an open-source gymnasium designed for machine learning-assisted architecture design. Lastly, this talk calls upon the community to collaborate in constructing and expanding this shared ecosystem for ML-guided systems and architecture research, so that we can collectively foster advancements in the field.
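
To make the "gymnasium" framing concrete, the sketch below shows a gym-style agent/environment loop for searching over architecture parameters: the environment maps a proposed configuration to a scalar reward, and a search agent (here, plain random search) iterates toward better configurations. The environment, parameter names, and cost function are invented for illustration; this is not Arch Gym's actual API.

    # Illustrative gym-style loop for ML-aided architecture design search.
    # The environment, parameters, and cost model below are hypothetical;
    # they only demonstrate the agent/environment contract.
    import random

    class ToyArchEnv:
        """Maps an architecture configuration to a scalar reward."""
        PARAMS = {"l2_kb": [256, 512, 1024], "ways": [4, 8, 16]}

        def reset(self):
            self.config = {k: random.choice(v) for k, v in self.PARAMS.items()}
            return self.config

        def step(self, action):
            # action: a configuration proposed by the search agent.
            self.config = action
            latency = 100.0 / action["l2_kb"] ** 0.5 + 0.1 * action["ways"]  # toy cost model
            reward = -latency  # lower latency -> higher reward
            done = True        # one configuration evaluated per episode
            return self.config, reward, done, {}

    env = ToyArchEnv()
    best_reward, best_cfg = float("-inf"), None
    for _ in range(50):  # random search stands in for a learned agent
        env.reset()
        action = {k: random.choice(v) for k, v in ToyArchEnv.PARAMS.items()}
        _, reward, done, _ = env.step(action)
        if reward > best_reward:
            best_reward, best_cfg = reward, action
    print("best config found:", best_cfg)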

Bio: Vijay Janapa Reddi is an Associate Professor at Harvard University and the Vice President and a Founding Member of MLCommons (mlcommons.org), a nonprofit organization devoted to accelerating machine learning (ML) innovation for all. He co-chairs the MLCommons Research organization and sits on the MLCommons board of directors. He co-led the development of the MLPerf Inference benchmark for IoT, mobile, edge, and datacenter applications. Before moving to Harvard, he was an Associate Professor in the Department of Electrical and Computer Engineering at The University of Texas at Austin. His research focuses on computing platforms for mobile and edge computing and the Internet of Things, drawing on runtime systems, computer architecture, and machine learning. His accolades include the Gilbreth Lecturer Honor from the National Academy of Engineering (NAE) in 2016, the IEEE TCCA Young Computer Architect Award (2016), the Intel Early Career Award (2013), Google Faculty Research Awards in 2012, 2013, 2015, 2017, and 2020, best paper awards at the 2020 Design Automation Conference (DAC), the 2005 International Symposium on Microarchitecture (MICRO), and the 2009 International Symposium on High Performance Computing, and IEEE Micro Top Picks in Computer Architecture (2006, 2010, 2011, 2016, 2017, 2022, 2023). He was inducted into the MICRO Hall of Fame in 2018 and the HPCA Hall of Fame in 2019. He is strongly committed to expanding access to applied machine learning, to diversity in STEM, and to applying AI for social good. To bridge embedded systems and machine learning, he created the Tiny Machine Learning (TinyML) series on edX, a massive open online course (MOOC) that thousands of students from around the world can access and audit for free. He also oversaw the Austin Hands-on Computer Science (HaCS) program, which the Austin Independent School District used to teach CS to students in grades K-12. Dr. Janapa Reddi holds degrees in computer science from Harvard University, electrical and computer engineering from the University of Colorado at Boulder, and computer engineering from Santa Clara University. His life's passion is helping individuals and teams learn, succeed in realizing their aspirations, and make the world a better place with technology.

11:15 AM - 12:00 PM Invited Speaker

A Learned Index for Log-Structured Merge Trees

Aishwarya Ganesan, UIUC

Abstract: Machine learning has been transforming how we build computer systems. For example, fundamental mechanisms in computer systems such as indexing, scheduling, query processing, and caching are being replaced by learned components. In this talk, I will focus on indexing and introduce our work, Bourbon, a log-structured merge (LSM) tree that uses machine learning to provide fast lookups. Bourbon employs greedy piecewise linear regression to learn key distributions, enabling fast lookups with minimal computation, and applies a cost-benefit strategy to decide when learning is worthwhile. Through a series of experiments on synthetic and real-world datasets, we show that Bourbon improves lookup performance by 1.23x-1.78x compared to state-of-the-art production LSMs. Toward the end, I will also briefly discuss our work on using learned components to speed up other aspects of LSMs.
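
To make the core idea concrete, here is a minimal sketch of a piecewise linear learned index in the spirit of Bourbon: greedily fit line segments over the sorted keys subject to a maximum prediction error, then use that error bound to narrow each lookup to a small window. The training routine, error bound, and synthetic data are simplified stand-ins for exposition, not Bourbon's implementation.

    # Sketch of a piecewise linear learned index (simplified; not Bourbon's code).
    import bisect

    def train_plr(keys, max_err=4):
        """Greedily grow line segments over (key, position) pairs, starting a
        new segment when any covered point would exceed max_err."""
        segments = []  # (start_key, slope, start_pos)
        i = 0
        while i < len(keys):
            j = i + 1
            while j < len(keys):
                slope = (j - i) / (keys[j] - keys[i])
                # extend the segment only while every covered point fits the bound
                if all(abs(i + slope * (keys[k] - keys[i]) - k) <= max_err
                       for k in range(i, j + 1)):
                    j += 1
                else:
                    break
            j -= 1  # last index that satisfied the error bound
            slope = (j - i) / (keys[j] - keys[i]) if j > i else 0.0
            segments.append((keys[i], slope, i))
            i = j + 1
        return segments

    def lookup(keys, segments, key, max_err=4):
        """Predict a position with the model, then search the small error window."""
        s = max(bisect.bisect_right([seg[0] for seg in segments], key) - 1, 0)
        start_key, slope, start_pos = segments[s]
        pred = int(start_pos + slope * (key - start_key))
        lo, hi = max(pred - max_err, 0), min(pred + max_err + 1, len(keys))
        t = bisect.bisect_left(keys, key, lo, hi)
        return t if t < len(keys) and keys[t] == key else None

    keys = sorted({x * x % 10007 for x in range(1, 500)})  # synthetic sorted keys
    segs = train_plr(keys)
    assert all(lookup(keys, segs, k) == i for i, k in enumerate(keys))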

Bio: Aishwarya Ganesan is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign and an Affiliated Researcher at VMware Research. She co-leads the Distributed And Storage Systems Laboratory (DASSL) at UIUC. Her research interests are in distributed systems and storage and file systems. Her work on distributed storage reliability has exposed many severe bugs in popular distributed systems, and ideas from her research on corruption-tolerant replication have been implemented in a financial database. Her work has appeared in top systems venues such as OSDI, SOSP, and FAST, and has been recognized with best paper awards at FAST '20 and FAST '18. She was selected for Rising Stars in EECS and received the 2019 Facebook Ph.D. Fellowship.

12:00 PM - 1:00 PM Lunch Break

1:00 PM - 1:45 PM Invited Speaker

Designing the Next Generation Cloud Systems: An ML-Driven Approach

Christina Delimitrou, MIT

Abstract: Cloud systems are experiencing significant shifts both in their hardware, with an increased adoption of heterogeneity, and their software, with the prevalence of microservices and serverless frameworks. These trends require fundamentally rethinking how the cloud system stack should be designed. 

In this talk, I will briefly describe the challenges these hardware and software trends introduce, and discuss how applying machine learning (ML) to hardware design, cluster management, and performance debugging can improve the cloud’s performance, efficiency, predictability, and ease of use. I will also highlight cases where alternative techniques work better than ML. I will first present Seer and Sage, two performance debugging systems that leverage ML to identify and resolve the root causes of performance issues in cloud microservices. I will then discuss Ursa, an analytically-driven cluster manager for microservices that addresses some of the shortcomings of applying ML to large-scale systems problems.

Bio: Christina Delimitrou is an Assistant Professor at MIT, where she works on computer architecture and computer systems. She focuses on improving the performance, predictability, and resource efficiency of large-scale cloud infrastructures by revisiting the way they are designed and managed. Christina is the recipient of the 2020 TCCA Young Computer Architect Award, an Intel Rising Star Award, a Microsoft Research Faculty Fellowship, an NSF CAREER Award, a Sloan Research Fellowship, two Google Faculty Research Awards, and a Facebook Faculty Research Award. Her work has also received five IEEE Micro Top Picks awards and several best paper awards. Before joining MIT, Christina was an Assistant Professor at Cornell University. She received her PhD from Stanford University, where she also earned an MS, and holds a diploma in Electrical and Computer Engineering from the National Technical University of Athens. More information can be found at: http://people.csail.mit.edu/delimitrou/

1:45 PM - 2:15 PM Interactive Breakout Discussion

Topic: Experiences and Challenges of Practical Deployment of ML in Computer Systems

2:15 PM - 3:00 PM Invited Speaker

Machine Learning for Machine Learning Compilers in Production

Mangpo Phothilimthana, Google DeepMind

Abstract: Search-based techniques have proven effective at solving complex optimization problems that arise in domain-specific compilers for machine learning (ML). Unfortunately, deploying such techniques in production compilers is impeded by several limitations. In this talk, I will present an autotuner for production ML compilers that can tune both graph-level and subgraph-level optimizations at multiple compilation stages. We demonstrate how to incorporate machine learning techniques such as a learned cost model and various learning-based search strategies to reduce autotuning time. Our learned cost model has high accuracy and outperforms a heavily optimized analytical performance model. In an evaluation across 150 ML training and inference models on Tensor Processing Units (TPUs), the autotuner offers runtime speedups of up to 2.4x, and 5% on average, over the heavily optimized XLA compiler. I will outline how we deploy the learning-based XLA autotuner at datacenter scale to automatically tune the most heavily used production models in Google’s fleet every day, as well as the challenges that arise from employing the learned model in production. The deployed tile size autotuner has been saving approximately 2% of fleetwide TPU compute time.
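
As an illustration of the recipe described above, the sketch below shows how a learned cost model can cut autotuning time: score every candidate configuration with a cheap model and benchmark only the top few on real hardware. The features, the linear stand-in for the learned model, and the tile sizes are hypothetical placeholders, not the production XLA autotuner or its cost model.

    # Sketch: learned-cost-model-guided autotuning (illustrative placeholders only).
    import random

    def features(cfg):
        # Hypothetical features of a tiling configuration.
        return [cfg["tile_m"], cfg["tile_n"], cfg["tile_m"] * cfg["tile_n"]]

    def learned_cost(cfg, weights=(0.02, 0.03, -0.0001)):
        # Stand-in for a trained cost model; a fixed linear model for illustration.
        return sum(w * f for w, f in zip(weights, features(cfg)))

    def benchmark(cfg):
        # Stand-in for compiling and timing the candidate on real hardware.
        return learned_cost(cfg) + random.gauss(0, 0.05)

    candidates = [{"tile_m": m, "tile_n": n}
                  for m in (8, 16, 32, 64, 128)
                  for n in (8, 16, 32, 64, 128)]

    # Rank all candidates with the cheap model; measure only the top-k.
    top_k = sorted(candidates, key=learned_cost)[:5]
    best = min(top_k, key=benchmark)
    print("chosen tiling:", best)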

Bio: Phitchaya (Mangpo) Phothilimthana is a staff research scientist at Google DeepMind, where she leads the Machine Learning for Machine Learning Compilers effort (one of the Google Brain moonshots in 2020). Her research interests include compilers, machine learning for systems, program synthesis, and computing sustainability. Mangpo received an undergraduate degree in Computer Science from MIT and a PhD from UC Berkeley. She was a recipient of a Microsoft Research PhD Fellowship and a Qualcomm Innovation Fellowship.

3:00 PM - 3:30 PM Break

3:30 PM - 3:45 PM Breakout Discussion

3:45 PM - 4:00 PM Closing Remarks