Program

Albert Cohen

Google

Retire Linear Algebra Libraries

Abstract: Despite decades of investment in software infrastructure, scientific computing, signal processing and machine learning systems are stuck in a rut. Some numerical computations are more equal than others: BLAS and the core operations for neural networks achieve near-peak performance, while marginally different variants do not get this chance. As a result, performance is only achieved at the expense of a dramatic loss of programmability. Compilers are obviously the cure. But what compilers? How should they be built, deployed, retargeted, autotuned? Sure, the BLAS API is not the ultimate interface for composing and reusing high-performance operations, but then, what would be a better one? And why have we not built and agreed on one yet? We will review these questions and some of the proposed solutions in this talk. In particular, establishing a parallel with hardware-software codesign, we will advocate for a new tile-level programming interface sitting between the top-level computational operations and the generators of target- and problem-specific code. Building on the MLIR infrastructure, we will propose a structured approach to the construction of domain-specific code generators for tensor compilers, with the stated goal of improving the productivity of both compiler engineers and end-users.
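
To give a concrete flavor of what "tile-level" operations are, here is a minimal NumPy sketch of a tiled matrix multiply. It is only an illustration of the tiling idea; the talk's actual proposal concerns MLIR-based code generators, and the tile size and shapes below are arbitrary choices for this example.

```python
# Illustrative sketch only: a blocked (tiled) matrix multiply in NumPy.
# Each innermost update is a small tile-level operation that a code
# generator could specialize for a given target and problem size.
import numpy as np

def tiled_matmul(A: np.ndarray, B: np.ndarray, tile: int = 64) -> np.ndarray:
    """Compute A @ B by iterating over (tile x tile) blocks."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            for k0 in range(0, K, tile):
                # NumPy slicing clamps at array bounds, so ragged edges are handled.
                C[i0:i0+tile, j0:j0+tile] += (
                    A[i0:i0+tile, k0:k0+tile] @ B[k0:k0+tile, j0:j0+tile]
                )
    return C

if __name__ == "__main__":
    A = np.random.rand(256, 192)
    B = np.random.rand(192, 128)
    assert np.allclose(tiled_matmul(A, B), A @ B)
```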

Short Bio: Albert is a research scientist at Google. An alumnus of École Normale Supérieure de Lyon and the University of Versailles, he was a research scientist at Inria from 2000 to 2018, a visiting scholar at the University of Illinois, an invited professor at Philips Research, and a visiting scientist at Facebook Artificial Intelligence Research. Albert Cohen works on parallelizing and optimizing compilers, parallel and synchronous programming languages, and machine learning compilers, with applications to high-performance computing, artificial intelligence and reactive control.

Hussam Amrouch

University of Stuttgart

HW/SW Codesign for Reliable Hyperdimensional In-Memory Computing on Unreliable Ferroelectric FET

Abstract: DNNs largely overwhelm conventional computing systems because the latter are severely bottlenecked by the data movement between processing units and memory. As a result, novel and intelligent computing systems become inevitable. In this talk, I will focus on the emerging ferroelectric FET (FeFET) technology and its great potential for building efficient in-memory computing architectures. I will present an unprecedented dual-port FeFET that is completely disturbance-free, demonstrating that the ferroelectric layer thickness can be scaled down to merely 3 nm while still storing 8 states (i.e., a 3-bit MLC FeFET). Further, I will explain how abstracted reliability models can be developed from device physics to circuits and later employed at the system and algorithm levels towards realizing HW/SW codesign for robust in-memory computing. I will discuss how FeFET-based in-memory computing synergizes remarkably well with Binarized Neural Networks (BNNs) and brain-inspired Hyperdimensional Computing (HDC), so that reliable machine learning can be realized on unreliable emerging FeFET technologies.
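
As background on why BNNs map so naturally onto in-memory arrays, the following NumPy sketch shows the XNOR-and-popcount form of a binarized matrix-vector product, which is the operation an analog crossbar can evaluate in place. It is a generic illustration, not the FeFET circuit presented in the talk.

```python
# Hedged illustration: a Binarized Neural Network matrix-vector product.
# Weights and activations take values in {-1, +1}, so each dot product
# reduces to an XNOR followed by a popcount.
import numpy as np

def bnn_matvec(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """W: (rows, cols) in {-1, +1}; x: (cols,) in {-1, +1}."""
    w_bits = W > 0                        # encode +1 as True, -1 as False
    x_bits = x > 0
    xnor = ~(w_bits ^ x_bits)             # positions where weight and input agree
    popcount = xnor.sum(axis=1)           # agreements per output row
    return 2 * popcount - W.shape[1]      # map back to the {-1, +1} dot product

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.choice([-1, 1], size=(8, 64))
    x = rng.choice([-1, 1], size=64)
    assert np.array_equal(bnn_matvec(W, x), W @ x)
```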

Short Bio: Hussam Amrouch is a Junior Professor heading the Chair of Semiconductor Test and Reliability (STAR) at the University of Stuttgart. He received his Ph.D. degree with the highest distinction (summa cum laude) from KIT in 2015. He currently serves as an Editor for the Nature Scientific Reports journal (Nature Portfolio). He has around 190 publications (including 76 journal articles, one of them in Nature SR) in multidisciplinary research areas across the computing stack, from semiconductor device physics to circuit design all the way up to computer architecture. His main research interests are design for reliability from device physics to systems, machine learning for CAD, HW security, and emerging technologies with a special focus on ferroelectric devices. He has given 12 tutorials at top EDA conferences such as DAC, DATE, and ICCAD, and more than 25 invited talks at international universities and EDA companies such as Synopsys. He is a reviewer for many top journals such as Nature Electronics, TED, TC, TCAS-I, etc. His research in HW security and circuit reliability has been funded by the German Research Foundation (DFG), Advantest Corporation, and the U.S. Office of Naval Research (ONR).

Una Pale

EPFL

Challenges and Solutions for Hyperdimensional Computing in Wearable Healthcare Applications

Una Pale, Tomás Teijeiro, and David Atienza

Abstract: Hyperdimensional (HD) computing is a novel approach to machine learning inspired by neuroscience, which uses vectors in a hyperdimensional space to represent both data and models. This approach has gained significant interest in recent years, with applications in various domains. Its advantages, such as fast and energy-efficient learning and the potential for online and privacy-preserving distributed learning, make it attractive for power-efficient applications such as continuous biosignal monitoring through wearable devices. In this talk, we present the potential of HD computing for wearable healthcare applications, but also the challenges that it faces, drawing on first-hand experience at the Embedded Systems Laboratory of EPFL. Big data with spatio-temporal structure, highly unbalanced datasets, and the need for interpretable and online retraining solutions are some of the topics that will be presented. The potential for personalized models while leveraging generalized models will also be discussed. Finally, we will focus on the need for, and the latest progress in, creating open-source libraries for HD computing research.
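
To make the "vectors represent both data and models" idea concrete, here is a minimal sketch of an HD classifier: samples are encoded as high-dimensional bipolar vectors, class prototypes are formed by bundling (summing) training encodings, and inference picks the most similar prototype. The encoder, dimensionality, and toy data below are assumptions of this sketch, not the laboratory's actual pipeline.

```python
# Minimal hyperdimensional-computing sketch with invented toy data.
import numpy as np

D = 10_000            # hypervector dimensionality
N_FEATURES = 16
rng = np.random.default_rng(42)

# One random bipolar "basis" hypervector per input feature (an assumption of
# this sketch; real biosignal encoders are more elaborate).
basis = rng.choice([-1, 1], size=(N_FEATURES, D))

def encode(sample: np.ndarray) -> np.ndarray:
    """Weight each feature's basis hypervector by the feature value, then binarize."""
    return np.sign(sample @ basis)

def train(samples: np.ndarray, labels: np.ndarray, n_classes: int) -> np.ndarray:
    prototypes = np.zeros((n_classes, D))
    for s, y in zip(samples, labels):
        prototypes[y] += encode(s)        # bundling = element-wise accumulation
    return np.sign(prototypes)

def classify(sample: np.ndarray, prototypes: np.ndarray) -> int:
    sims = prototypes @ encode(sample)    # dot product ~ cosine for bipolar vectors
    return int(np.argmax(sims))

if __name__ == "__main__":
    X = rng.normal(size=(100, N_FEATURES))
    y = (X[:, 0] > 0).astype(int)         # toy binary labels
    protos = train(X, y, n_classes=2)
    acc = np.mean([classify(x, protos) == t for x, t in zip(X, y)])
    print(f"training-set accuracy: {acc:.2f}")
```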

Short Bio: Una Pale received her B.Sc. and M.Sc. degrees in electrical engineering from the University of Zagreb, Croatia, in 2014 and 2016, respectively. She is currently working towards a Ph.D. degree in the Embedded Systems Laboratory at the Swiss Federal Institute of Technology Lausanne (EPFL). During an exchange at the Technical University of Vienna she decided to direct her research interests towards biomedical applications. She worked for two years as a Research Assistant with the Clinical Neuroengineering Laboratory at EPFL. Her research interests include biomedical signal processing, machine learning for health-related applications, and the design and application of hyperdimensional computing algorithms.

Short Bio: Tomás Teijeiro received his PhD from the Centro Singular de Investigación en Tecnoloxías Intelixentes (CITIUS), University of Santiago de Compostela, Spain, in 2017. He is currently an Assistant Professor in the Mathmode group at the University of the Basque Country (UPV/EHU), Spain. His research interests include knowledge representation, non-monotonic temporal reasoning, event-based sensing, and their application to biosignal abstraction and interpretation in energy-efficient setups.

Short Bio: David Atienza is a professor of electrical and computer engineering, and head of the Embedded Systems Laboratory (ESL) at EPFL, Switzerland. He received his PhD in computer science and engineering from UCM, Spain, and IMEC, Belgium, in 2005. His research interests include system-level design methodologies for high-performance multi-processor system-on-chip (MPSoC) and low power Internet-of-Things (IoT) systems, including particularly edge AI architectures for wearables and IoT systems, as well as thermal-aware design for MPSoCs. He has received several awards and he is a co-author of more than 350 papers, one book and 14 patents in these fields. He is an IEEE Fellow, an ACM Distinguished Member, and served as IEEE CEDA President (period 2018-2019).

Weier Wan

Stanford University

RRAM Compute-In-Memory Hardware for Efficient, Versatile, and Accurate Edge Inference

Abstract: Performing ever more demanding AI tasks directly on resource-constrained edge devices calls for unprecedented energy efficiency in edge AI hardware. AI hardware today consumes most of its energy in data movement between separate compute and memory units. Compute-in-memory (CIM) architectures using resistive RAM (RRAM) overcome this challenge by storing weights in dense analog RRAM arrays and performing computation directly within memory. However, the energy-efficiency benefit of CIM usually comes at the cost of functional flexibility and computational accuracy, hampering its practical use for many edge applications that require processing multiple modalities of sensory data (e.g., image, audio). Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements to any single layer of the design.

In this talk, we present our attempts to ameliorate this fundamental trade-off through full-stack co-optimization, from device and circuit to architecture and algorithm. By integrating multiple innovations, including a voltage-mode sensing scheme, a transposable neurosynaptic array architecture, and non-ideality-aware model training and fine-tuning techniques, we demonstrated NeuRRAM, an RRAM CIM chip that simultaneously delivers a high degree of reconfigurability for diverse model architectures, state-of-the-art energy efficiency, and software-comparable inference accuracy measured across various AI benchmarks, including image classification, speech recognition, and image recovery.
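
The sketch below illustrates the general idea behind non-ideality-aware training: perturb the weights with a noise model during the forward pass so the learned model tolerates analog imprecision. The multiplicative Gaussian noise model and its scale are assumptions of this illustration, not NeuRRAM's measured device model.

```python
# Hedged sketch: matrix-vector product with weight noise injected to mimic
# conductance variation in an analog compute-in-memory array.
import numpy as np

rng = np.random.default_rng(0)

def noisy_forward(W: np.ndarray, x: np.ndarray, rel_sigma: float = 0.05) -> np.ndarray:
    """Forward pass with multiplicative weight noise (assumed Gaussian)."""
    W_noisy = W * (1.0 + rel_sigma * rng.standard_normal(W.shape))
    return W_noisy @ x

if __name__ == "__main__":
    W = rng.standard_normal((32, 64))
    x = rng.standard_normal(64)
    ideal = W @ x
    noisy = noisy_forward(W, x)
    print("mean relative error:",
          np.mean(np.abs(noisy - ideal) / (np.abs(ideal) + 1e-9)))
```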

Short Bio: Dr. Weier Wan is a recent Ph.D. graduate of the Department of Electrical Engineering at Stanford University. His research has focused on building energy-efficient artificial intelligence chips using the compute-in-memory architecture. His research efforts span the full stack of AI systems, including AI algorithms, chip architectures, mixed-signal circuits, and memory device technologies. His work has been published in top journals and conferences, including Nature, the International Solid-State Circuits Conference (ISSCC), and the Symposium on VLSI Technology and Circuits. Previously, he received his master’s degree from Stanford University in 2018 and his bachelor’s degree from the University of California, Berkeley in 2015.

Anand Raghunathan

Purdue University

Narrowing the Energy Efficiency Gap between Artificial and Natural Intelligence

Abstract: Improvements in compute performance have been a major enabler of the advances in AI over the past decade. However, we are at a point where demands from future AI workloads will far outpace expected improvements in hardware, threatening to greatly impede continued progress in the field of AI. This talk will review the challenges posed by AI workloads across the computing spectrum, examine the often-cited efficiency gap between artificial intelligence and biological intelligence, and provide a possible roadmap to narrowing this gap, including in-memory computing, algorithm-hardware co-design and neuromorphic computing.

Short Bio: Anand Raghunathan received the B.Tech. degree in Electrical and Electronics Engineering from the Indian Institute of Technology, Madras, India, and the M.A. and Ph.D. degrees in Electrical Engineering from Princeton University, Princeton, NJ. He is the Silicon Valley Professor and Chair of the VLSI area in the School of Electrical and Computer Engineering at Purdue University. He serves as Associate Director of the SRC/DARPA Center for Brain-inspired Computing (C-BRIC) and founding co-director of the Purdue/TSMC Center for a Secured Microelectronics Ecosystem (CSME). His areas of research include brain-inspired computing, energy-efficient machine learning and artificial intelligence, system-on-chip design, and computing with post-CMOS devices. He holds a Distinguished Visiting Chair in Computational Brain Research at the Indian Institute of Technology, Madras. He is a co-founder and Director of Hardware at High Performance Imaging, Inc., a company commercializing Purdue innovations in the area of computational imaging. Before joining Purdue, he was a Senior Researcher and Project Leader at NEC Laboratories America and held a visiting position at Princeton University. Prof. Raghunathan has co-authored a book, eight book chapters, and over 300 refereed journal and conference papers, and holds 28 U.S. patents and 16 international patents. His work has received nine best paper awards, a retrospective ten-year most impactful paper award, one design contest award, and seven best paper nominations at premier IEEE and ACM conferences. He received a Patent of the Year Award (recognizing the invention that has achieved the highest impact) and two Technology Commercialization Awards from NEC. He also received the IBM Faculty Award and the Qualcomm Faculty Award. He was chosen by MIT's Technology Review for the TR35 (top 35 innovators under 35 years of age, across various disciplines of science and technology) in 2006, for his work on "making mobile secure". He also received the Distinguished Alumnus Award from IIT Madras. Prof. Raghunathan has chaired four premier IEEE/ACM conferences, and served on the editorial boards of various IEEE and ACM journals in his areas of interest. He received the IEEE Meritorious Service Award and Outstanding Service Award. He is a Fellow of the IEEE and a Golden Core Member of the IEEE Computer Society.

Yun Eric Liang

Peking University

AHS: An Agile Framework for Hardware Specialization and Software Mapping

Abstract: As Moore’s law approaches its end, designing specialized hardware accelerators along with the software that maps applications onto them is a promising solution. The hardware design determines the peak performance, while the software is equally important because it determines the actual performance. Hardware/software (HW/SW) co-design can optimize the hardware acceleration and software mapping in concert and improve overall performance. However, the current flow designs hardware and software in isolation. More importantly, both hardware and software are difficult to design and optimize due to the low-level programming and the huge design space. In this talk, we will introduce AHS, an Agile framework for Hardware specialization and Software mapping for AI applications. AHS can automatically partition the hardware and software space, generate the hardware accelerator, and map the software onto it synergistically.
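
As a rough intuition for this kind of co-design loop, the toy sketch below enumerates candidate accelerator configurations, "maps" a workload onto each with a simple analytical cost model, and keeps the best pair. The parameters, workload, and cost model are all invented for illustration; AHS itself operates on real hardware generators and mappers, not this simplification.

```python
# Toy hardware/software co-design exploration loop (not AHS itself).
from itertools import product

WORKLOAD = {"macs": 2e9, "bytes": 8e6}       # hypothetical AI kernel

def cost(pe_count: int, buffer_kb: int, tile: int) -> float:
    """Crude latency estimate: compute time plus memory time with tile reuse."""
    compute_time = WORKLOAD["macs"] / (pe_count * 1e9)   # seconds
    reuse = min(tile, buffer_kb)                         # larger tiles reuse more data
    memory_time = WORKLOAD["bytes"] / (reuse * 1e8)
    return compute_time + memory_time

def explore():
    best = None
    for pe, buf, tile in product([64, 256, 1024], [32, 128, 512], [8, 16, 32]):
        c = cost(pe, buf, tile)
        if best is None or c < best[0]:
            best = (c, {"pe_count": pe, "buffer_kb": buf, "tile": tile})
    return best

if __name__ == "__main__":
    latency, config = explore()
    print(f"best config {config} -> estimated latency {latency:.4f} s")
```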

Short Bio: Prof. Yun Eric Liang is currently an Associate Professor (with tenure) in the School of Integrated Circuits at Peking University and a member of the Center for Energy-efficient Computing and Applications (CECA). His research interest is at the hardware/software interface, with work spanning electronic design automation (EDA), hardware/software co-design, and computer architecture. He has authored over 100 scientific publications in leading international journals and conferences. His research has been recognized with two Best Paper Awards (FCCM 2011 and ICCAD 2017) and six Best Paper Award Nominations (PPoPP 2019, DAC 2017, ASPDAC 2016, DAC 2012, FPT 2011, CODES+ISSS 2008). He currently serves as Associate Editor of the ACM Transactions on Embedded Computing Systems (TECS) and the ACM Transactions on Reconfigurable Technology and Systems (TRETS). He is the program chair of the International Conference on Field Programmable Technology (FPT) 2022 and the International Conference on Application-specific Systems, Architectures and Processors (ASAP) 2019, and was a subcommittee chair of the Asia and South Pacific Design Automation Conference (ASPDAC) 2014. He also serves on the program committees of premier conferences including DAC, ICCAD, DATE, ASPDAC, FPGA, FCCM, HPCA, MICRO, PACT, CGO, ICS, CC, CASES, LCTES, ASAP, and ICCD.

Lilas Alrahis

New York University

Graph Neural Networks for Hardware Design, Security and Reliability

Abstract: Graph Neural Networks (GNNs) successfully facilitate learning on graph-structured data, such as social networks, recommendation systems, and protein-protein interactions. Since electronic circuits can be represented naturally as graphs, GNNs offer great potential to advance Machine Learning (ML)-based methods for all aspects of electronic system design and Computer-Aided Design (CAD). This talk provides an overview of how GNNs are designed and employed to learn the properties of circuits. Taking hardware security and circuit reliability assessments as target applications, this talk first illustrates how GNNs aid in analyzing flattened/unstructured gate-level netlists, then demonstrates how to employ GNNs to accurately estimate the impact of process variations and device aging on the delay of any path within a circuit.
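
The core representation step can be sketched very compactly: treat gates and primary inputs as nodes, wires as edges, and run message passing over the resulting graph. The tiny netlist, one-hot features, and random untrained weights below are invented for this sketch; the methods in the talk use trained GNNs on much richer netlist features.

```python
# Hedged illustration: one round of mean-aggregation message passing over a
# tiny gate-level netlist represented as a graph, in plain NumPy.
import numpy as np

# Nodes 0..3: two primary inputs feeding an AND gate that feeds an inverter.
# One-hot node features: [is_input, is_and, is_inv]
X = np.array([
    [1, 0, 0],   # primary input a
    [1, 0, 0],   # primary input b
    [0, 1, 0],   # AND gate
    [0, 0, 1],   # inverter
], dtype=float)

edges = [(0, 2), (1, 2), (2, 3)]          # directed wires: driver -> sink

# Dense adjacency with self-loops, then row-normalize (mean aggregation).
A = np.eye(len(X))
for src, dst in edges:
    A[dst, src] = 1.0
A = A / A.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
W = rng.standard_normal((3, 8))           # untrained layer weights (assumption)

# One message-passing layer: aggregate neighbor features, transform, apply ReLU.
H = np.maximum(A @ X @ W, 0.0)
print("per-gate embeddings:\n", H)
```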

Short Bio: Lilas Alrahis received the M.Sc. degree and the Ph.D. degree in Electrical and Computer Engineering from Khalifa University, UAE, in 2016 and 2021, respectively. She is currently a Postdoctoral Associate with the Design for Excellence Lab, headed by Prof. Ozgur Sinanoglu, in the Division of Engineering at New York University Abu Dhabi (NYUAD), UAE. Her current research interests include Hardware Security, Design-for-Trust, Logic Locking, Applied Machine Learning, and Digital Logic Design. Dr. Alrahis currently serves as an Associate Editor of Integration, the VLSI Journal.

Rasit O. Topaloglu

IBM

Introduction to Quantum Computing and Machine Learning for Drug Discovery

Abstract: This talk first provides an introduction to quantum computing, then proceeds to give an application of quantum machine learning towards drug discovery.

Short Bio: Rasit Topaloglu obtained his B.S. in EE from Bogazici University and his Ph.D. in Computer Science and Engineering from the University of California at San Diego. He has worked for companies such as Qualcomm, AMD, and GLOBALFOUNDRIES, and is currently with IBM, where he works on next-generation computer design as a Senior Hardware Developer. He was involved with qubit characterization laboratory work at IBM. He has over seventy peer-reviewed publications and over seventy issued patents, more than thirty of which are on quantum technologies. His book on Design Automation for Quantum was published by Springer in 2022. He serves on the IEEE/ACM Design Automation Conference (DAC) and IEEE International Symposium on Quality Electronic Design (ISQED) Technical Program Committees that cover quantum topics. He serves as the Chair of IEEE Mid-Hudson and the Secretary of ACM Poughkeepsie. He is an IEEE/ACM Design Automation Conference Outstanding Innovator and an IBM Master Inventor.

Cong "Callie" Hao

GeorgiaTech

The DAC System Design Contest 2018 - 2022: Lessons Learned in Edge Computing

Abstract: The Design Automation Conference System Design Contest (DAC-SDC) is a low-power object detection contest running on edge FPGAs and GPUs. The goal is to achieve both high detection accuracy and low energy. The contest has been run for five years, from 2018 to 2022, and the performance of the winning designs has improved drastically, by more than 100x. In this talk, we will review the past five years’ achievements and the technologies that have led to such drastic performance improvements. We will also discuss the lessons we have learned from this prestigious contest.

Short Bio: Cong (Callie) Hao is an assistant professor in ECE at Georgia Tech, where she currently holds the Sutterfield Family Early Career Professorship. She was a postdoctoral fellow in the same school from 2020 to 2021 and also worked as a postdoctoral researcher in ECE at the University of Illinois at Urbana-Champaign from 2018 to 2020. She received the Ph.D. degree in Electrical Engineering from Waseda University in 2017. Her primary research interests lie in the joint area of efficient hardware design and machine learning algorithms, especially reconfigurable and high-efficiency computing and building useful electronic design automation tools.

Jinwook Jung

IBM Research

AI/ML-Infused Digital IC Design Workflows on the Hybrid Cloud

Abstract: As the complexity of modern hardware systems explodes, fast and effective design space exploration for better IC (integrated circuit) implementations becomes more and more difficult to achieve due to the higher demands on computational resources. Recent years have seen increasing use of decision intelligence in IC design flows to navigate the design solution space in a more systematic and intelligent manner. To address these problems, IBM Research has been working on an AI/ML-infused IC design orchestration project in order to 1) enable the IC design environment on a hybrid cloud platform, so that workloads can easily be scaled up or down according to computation demands, and 2) produce higher Quality-of-Results (QoR) in shorter total turn-around time (TAT). In this talk, I will describe how we provide scalable IC design workload execution that produces higher-performance designs by utilizing an AI/ML-driven automatic parameter tuning capability. We will first see how to build a cloud-based IC design environment, including a containerized digital design flow on Kubernetes clusters. Then, we extend the containerized design flow with automatic parameter tuning using AI/ML techniques. Finally, we demonstrate that the automatic parameter tuning can be executed in a more scalable and distributed manner using the Ray platform. I will use actual design environment setups, code snippets, and results from production IC designs as evidence that the proposed method can produce higher-quality IC designs using the automatic parameter tuning methodologies.
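
To hint at what distributed parameter tuning on Ray looks like, here is a minimal sketch using Ray Tune's classic tune.run / tune.report API. The flow parameters, the stand-in score_flow() function, and the QoR metric are invented placeholders; a real workflow would launch a containerized synthesis/place-and-route run and parse its reports instead.

```python
# Hedged sketch of automatic flow-parameter tuning with Ray Tune.
from ray import tune

def score_flow(clock_margin: float, effort: str, max_fanout: int) -> float:
    """Placeholder for running the digital design flow and extracting a QoR."""
    effort_bonus = {"low": 0.0, "medium": 0.5, "high": 0.8}[effort]
    return effort_bonus - abs(clock_margin - 0.1) - 0.001 * max_fanout

def trainable(config):
    qor = score_flow(config["clock_margin"], config["effort"], config["max_fanout"])
    tune.report(qor=qor)                      # report the metric back to Tune

if __name__ == "__main__":
    analysis = tune.run(
        trainable,
        num_samples=20,                       # 20 flow runs, scheduled by Ray
        config={
            "clock_margin": tune.uniform(0.0, 0.3),
            "effort": tune.choice(["low", "medium", "high"]),
            "max_fanout": tune.choice([16, 32, 64]),
        },
    )
    print("best parameters:", analysis.get_best_config(metric="qor", mode="max"))
```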

Short Bio: Jinwook Jung is a Research Staff Member at the IBM T. J. Watson Research Center, Yorktown Heights, NY. At IBM, he works to advance design methodologies for AI accelerators and high-performance microprocessors, leveraging machine learning and cloud computing. He received his PhD from the Korea Advanced Institute of Science and Technology.