10 minutes for each oral presentation
(~8 mins for presentation and 2 mins for Q&A)
Topic
Breaking Barriers in Generative AI Deployment: Phison aiDAPTIV+
林緯 博士 (Dr. Wei Lin)
Chief Technology Officer, Phison Electronics
Abstract
The emergence of generative artificial intelligence (AI) at the end of 2022 marked a significant evolution, as these technologies have become increasingly integrated into daily human activities. The scale of generative models has been growing exponentially year over year, with architectures evolving from single-model frameworks to complex multi-expert systems. As industries across the board incorporate generative AI into their operational processes, concerns regarding information security and controllability have come to the forefront. At the same time, the high deployment costs associated with generative AI technologies pose a significant barrier to widespread adoption. In response, Phison has introduced the aiDAPTIV+ initiative, a revolutionary approach that leverages NAND Flash to expand High Bandwidth Memory (HBM), thereby substantially reducing costs. This presentation will cover the technological innovations and ecosystem developments surrounding the aiDAPTIV+ initiative, highlighting its potential to democratize access to generative AI technologies by addressing cost and scalability challenges.
Topic
From YOLO_v4 to YOLO_v7
廖弘源所長 (Dr. Mark Liao)
Institute of Information Science, Academia Sinica
Abstract
YOLOv4 and YOLOv7 were developed by Professor Mark Liao's team at the Institute of Information Science, Academia Sinica, in 2020 and 2022, respectively, and released as open-source code for free use by industry. Research units, companies, and academic institutions worldwide have adopted YOLOv4 and YOLOv7 as the first step in their development work. The popularity of these two object detection systems is evident from the widespread citation of their papers. Since the release of its source code in April 2020, YOLOv4 has been cited over 15,000 times. YOLOv7, whose source code was released in July 2022 and whose paper was published at the top computer vision conference CVPR in June 2023, has been cited over 4,600 times in just over a year. Many medical institutions have integrated YOLOv4 and YOLOv7 with endoscopes for applications such as detecting organ abnormalities, gastrointestinal inspections, and cardiac ultrasound examinations.
Topic
Exploring ML Computing Acceleration Using RISC-V Custom Extensions and IREE Compiler
柯俊男 (Chun-Nan Ke), Senior Technical Manager, Compute Acceleration Division
李岳峰 (Dr. Yueh-Feng Lee), Manager, Compute Acceleration Division
Andes Technology
Abstract
This presentation focuses on highly efficient matrix multiplication instructions built as custom extensions for RISC-V CPUs, tailored specifically for artificial intelligence (AI) applications. The proposed Integrated Matrix Extension aims to achieve scalability and portability, higher computing power, optimized compute-to-memory interaction, and reduced memory access bandwidth requirements. This work also proposes an integrated Smart-LSU for matrix tiling load/store enhancements and a Zero-Overhead Boundary Handler for reducing user configuration cycles. By combining these state-of-the-art techniques, the architecture demonstrates significant performance enhancements for AI applications. Preliminary performance data will be shared to highlight the benefits and the potential acceleration of GeMM kernels and MobileNetV1 model deployment. These metrics include a kernel-loop MAC utilization rate of up to 80% and a compute-to-memory ratio exceeding 9.6, achieved through unrolling techniques.
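As background for why tiling improves the compute-to-memory ratio, the following sketch counts multiply-accumulate operations (MACs) against tile-element loads in a blocked matrix multiply. This is a plain-Python model for intuition only; the `tiled_gemm` name, the tile size, and the counting scheme are illustrative assumptions, not the Andes hardware design.

```python
def tiled_gemm(A, B, tile=4):
    """Blocked (tiled) matrix multiply over Python lists.

    Counts MACs and tile-element loads to show how tiling improves
    the compute-to-memory ratio: each pair of tile loads (2*tile^2
    elements) feeds tile^3 MACs, so the ratio grows as tile/2.
    Assumes the matrix dimensions are divisible by the tile size.
    """
    n, k, m = len(A), len(A[0]), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    macs = loads = 0
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                loads += 2 * tile * tile  # load one tile of A, one of B
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        for p in range(p0, p0 + tile):
                            C[i][j] += A[i][p] * B[p][j]
                            macs += 1
    return C, macs / loads
```

In this simplified model, 8x8 inputs with tile=4 yield a ratio of 2.0, and doubling the tile size doubles the ratio; keeping more work per loaded element is the same effect that the extension's unrolling exploits to exceed a ratio of 9.6.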
This presentation also describes our experience integrating custom extension instructions into the IREE compiler framework. Specifically, a custom exponential-function instruction, exp, is integrated into the IREE compiler using the ukernel approach. With an IREE ukernel, a hand-crafted softmax achieves a speedup of up to 5x compared to direct RVV code generation; integrating the exp instruction into the ukernel further speeds up softmax by up to 10x. The ukernel integration is also used to evaluate MobileBERT model inference.
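To show where a custom exp instruction helps, here is a minimal numerically stable softmax; the per-element exponential in the hot loop is exactly the work such an instruction, reached through an IREE ukernel, would accelerate. The Python function below is a behavioral sketch for intuition, not the presented RVV/ukernel implementation.

```python
import math

def softmax(x):
    """Numerically stable softmax over a list of floats.

    The exp() call in the comprehension dominates the runtime; this
    is the operation a custom exp instruction (exposed via an IREE
    ukernel) would replace with a single hardware instruction.
    """
    m = max(x)                             # subtract the max for stability
    exps = [math.exp(v - m) for v in x]    # hot loop: one exp per element
    total = sum(exps)
    return [e / total for e in exps]
```

For example, softmax([1.0, 2.0, 3.0]) returns probabilities that sum to 1, with the largest input mapped to the largest probability.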
Topic
Google and Google Taiwan Introduction
張傳華博士 (Dr. Chuan-Hua Chang)
Senior Technical Manager, Advanced CPU Division, Google
Abstract
This presentation will offer an overview of Google and Google Taiwan, highlighting our thriving hardware R&D sector. We'll discuss our substantial growth in this area, as evidenced by the opening of our second hardware office building at the TPark campus in New Taipei City in April.
Topic
Computation in Large Language Models
Luba Tang
Founder and Chief Executive Officer, Skymizer Inc.
Abstract
Large language models (LLMs) have become increasingly prominent on various computing platforms, such as data centers, smartphones, and microcontrollers. As such, their potential applications in both human-to-machine interfaces (HMIs) and machine-to-machine interfaces (M2MIs) have been extensively studied and discussed. However, implementing hyper-scale LLMs on heterogeneous multicore System-on-Chips (SoCs) poses significant challenges. In this talk, we will discuss the key challenges, including issues related to hardware/software interfaces, verification, and SoC architecture exploration.