DIVE: A Plug-and-Play Simulator Library with Varying Degrees of Fidelity.
Tutorial @ MICRO 2023 (Toronto, Canada)
Date: 28th November, 2023 (8:00 am - 12:00 pm)
Introduction
DIVE is a comprehensive framework for simulating Deep Neural Network (DNN) workloads across a variety of accelerators and systems, offering a range of simulation fidelity levels.
The framework takes PyTorch models as input, parses their operators, and profiles their performance on user-defined simulated accelerators. Users can model the accelerator with anything from roofline models to analytical models to cycle-level simulators. DIVE is built on top of well-known simulators such as MAESTRO and SCALE-SIM, leveraging their capabilities.
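For intuition, here is a minimal sketch of the underlying idea, not DIVE's actual API: it uses PyTorch forward hooks to walk a ResNet-50's convolution operators and bounds each layer's runtime with a simple roofline model. The peak-compute and bandwidth numbers are assumed, roughly V100-class placeholders.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Assumed accelerator spec (roughly V100-class placeholders, not DIVE defaults).
PEAK_FLOPS = 125e12   # peak FLOP/s
PEAK_BW = 900e9       # peak DRAM bandwidth, bytes/s
BYTES_PER_ELEM = 2    # assume fp16 tensors

def conv_flops(mod, out):
    # FLOPs ~ 2 * MACs = 2 * out_elems * (in_channels / groups) * kh * kw
    kh, kw = mod.kernel_size
    return 2 * out.numel() * (mod.in_channels // mod.groups) * kh * kw

def conv_bytes(mod, inp, out):
    # Naive traffic estimate: read input + weights, write output (no reuse modeled).
    return BYTES_PER_ELEM * (inp.numel() + mod.weight.numel() + out.numel())

records = []

def hook(mod, inputs, output):
    flops = conv_flops(mod, output)
    traffic = conv_bytes(mod, inputs[0], output)
    # Roofline bound: runtime is limited by the slower of compute and memory.
    t = max(flops / PEAK_FLOPS, traffic / PEAK_BW)
    records.append(t)

model = models.resnet50().eval()
handles = [m.register_forward_hook(hook) for m in model.modules()
           if isinstance(m, nn.Conv2d)]

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))

for h in handles:
    h.remove()

print(f"Conv layers: {len(records)}, "
      f"estimated roofline time: {sum(records) * 1e3:.2f} ms")
```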
DIVE automates the generation of performance visualizations, presenting the results atop a roofline plot, and reports metrics that give users insight into the characteristics of the workload.
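The roofline view itself takes only a few lines to reproduce; the sketch below plots a handful of hypothetical operator points against an assumed V100-class roof using matplotlib. The operator names and numbers are placeholders, not DIVE output.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed accelerator roof (placeholder V100-class numbers).
PEAK_FLOPS = 125e12   # FLOP/s
PEAK_BW = 900e9       # bytes/s

# Hypothetical per-operator points: operational intensity (FLOP/byte) vs. achieved FLOP/s.
ops = {"conv1": (25, 40e12), "fc": (2, 1.5e12), "attention": (8, 6e12)}

oi = np.logspace(-1, 3, 200)                  # operational intensity axis
roof = np.minimum(PEAK_FLOPS, PEAK_BW * oi)   # roofline: min(compute peak, BW * OI)

plt.figure()
plt.loglog(oi, roof, label="roofline")
for name, (x, y) in ops.items():
    plt.scatter(x, y)
    plt.annotate(name, (x, y))
plt.xlabel("Operational intensity (FLOP/byte)")
plt.ylabel("Performance (FLOP/s)")
plt.legend()
plt.savefig("roofline.png")
```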
Using DIVE, users can simulate the runtime, energy, and memory accesses of a workload on any accelerator configuration.
Users can study the effects of dataflows, mappings, data reuse patterns, buffer hierarchies, on-chip networks, and multi-core systems.
Users can also plug in their custom DNN accelerator through a simple API and benefit from the rest of the supported tooling for parsing and plotting results.
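The plug-in API is covered during the hands-on session. Purely as an illustration of what such an interface could look like (the names and signatures below are hypothetical, not DIVE's actual API), a custom backend might implement a single simulate() method that maps an operator descriptor to cycles, energy, and memory accesses:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class OpDescriptor:
    name: str           # e.g., "conv2d"
    flops: int          # total floating-point operations
    bytes_moved: int    # estimated off-chip traffic

@dataclass
class SimResult:
    cycles: int
    energy_pj: float
    dram_accesses: int

class AcceleratorModel(ABC):
    """Interface a custom accelerator backend would implement (hypothetical)."""
    @abstractmethod
    def simulate(self, op: OpDescriptor) -> SimResult:
        ...

class MyRooflineAccelerator(AcceleratorModel):
    def __init__(self, peak_flops_per_cycle: float, bytes_per_cycle: float):
        self.peak_flops_per_cycle = peak_flops_per_cycle
        self.bytes_per_cycle = bytes_per_cycle

    def simulate(self, op: OpDescriptor) -> SimResult:
        # Roofline-style bound: cycles limited by compute or memory, whichever is slower.
        cycles = int(max(op.flops / self.peak_flops_per_cycle,
                         op.bytes_moved / self.bytes_per_cycle))
        # Toy energy model with assumed per-FLOP and per-byte costs, for illustration only.
        energy = 0.5 * op.flops + 10.0 * op.bytes_moved
        return SimResult(cycles, energy, op.bytes_moved // 64)

# Example: a conv with 1.8 GFLOPs and 25 MB of traffic on a 512-FLOP/cycle, 64-B/cycle design.
acc = MyRooflineAccelerator(peak_flops_per_cycle=512, bytes_per_cycle=64)
print(acc.simulate(OpDescriptor("conv2d", flops=1_800_000_000, bytes_moved=25_000_000)))
```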
Target Audience
The tutorial targets students, faculty, and researchers who want to:
Understand various components of designing and modeling DNN accelerators.
Study performance implications of microarchitectural choices and modeling strategies.
A basic familiarity with DNNs and fundamental concepts in computer architecture will be adequate for participants to benefit from the tutorial.
Outcomes
After the tutorial, users will be able to quickly simulate various configurations of workloads and systems. A few illustrative examples of these configurations include:
Executing ResNet-50/BERT on NVIDIA V100 hardware.
Running a custom model on established hardware platforms such as GPUs, TPUs, MAERI, or Eyeriss.
Simulating standard DNN workloads (e.g., ResNet-50, AlexNet, BERT, LLMs) on user-defined hardware setups.
Simulating custom workloads on user-defined hardware configurations.
Additionally, users can explore the trade-off between simulation speed and accuracy: a chosen hardware configuration can be simulated in anywhere from a few seconds to a few minutes, depending on the accuracy required.
Tutorial Schedule
8:00 - 8:25: Overview of ML systems and workloads.
8:30 - 9:30: Navigating the accuracy-fidelity trade-off for various models.
9:35 - 10:00: Ramping up and hands-on experience with DIVE.
10:00 - 10:30: Coffee Break ☕️
10:35 - 11:10: Benchmarking systems and workloads in real time at different fidelities.
11:15 - 11:45: Deep dive into opportunities for hardware-software co-design.
11:45 - 12:00: Roadmap for future research/development.