Workshop 5
A Hands-On Workshop for Benchmarking AI in Engineering Design
Tuesday 7 July 2026, 2:00 pm to 5:30 pm
Workshop Chairs
Matthew Keeler, D-MAVT, ETH Zürich <mkeeler@ethz.ch>
Soheyl Massoudi, D-MAVT, ETH Zürich <smassoudi@ethz.ch>
Mark Fuge, D-MAVT, ETH Zürich <mafuge@ethz.ch>
Faez Ahmed (MIT, USA)
Wei (Wayne) Chen (Texas A&M University, USA)
Zhenghui Sha (UT Austin, USA)
Xingang Li (U. Melbourne, Australia)
Haluk Akay (TU Delft, Netherlands)
Generative AI (GenAI) methods—including large language models, diffusion models, and variational autoencoders—are increasingly being applied to engineering design problems such as topology optimization, shape synthesis, and inverse design. Yet the field lacks standardized, reproducible infrastructure for evaluating these methods across diverse engineering domains. Without common benchmarks, it is difficult to compare approaches, identify failure modes, or measure real progress.
This workshop explores the challenges of rigorous benchmarking in GenAI for engineering design through a combined hands-on and discussion-based format. First, it introduces participants to EngiBench, an open-source framework published at NeurIPS 2025 that provides a standardized API, curated datasets, and physics-based simulators for engineering design problems spanning structural, thermal, aerodynamic, photonic, and electronic domains (see Figure 1).
EngiBench is designed to drastically reduce the cost and complexity of benchmarking in Engineering Design by providing the fundamental tools a researcher needs to run a benchmark for a paper: each problem bundles a simulator, dataset, evaluation pipeline, and standardized interface, so that researchers can go from idea to reproducible results with minimal overhead. In the second part of the workshop, we engage in more open-ended discussion around future benchmarking challenges in the field and how to address them.
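To make the bundling idea concrete, the sketch below shows a toy problem object that exposes a dataset, a simulator, and an evaluation entry point in one place. All class and method names here are illustrative stand-ins, not the actual EngiBench API, and the "physics" is a made-up objective function.

```python
import numpy as np

# Hypothetical illustration of the simulator/dataset/evaluation bundle.
# Names and objective are invented for this sketch, not taken from EngiBench.
class ToyBeamProblem:
    """A toy 1D 'beam sizing' problem bundling dataset, simulator, and eval."""

    def __init__(self, n_elements: int = 8, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Curated "dataset": precomputed (design, objective) pairs.
        designs = rng.uniform(0.1, 1.0, size=(32, n_elements))
        self.dataset = [(d, self.simulate(d)) for d in designs]

    def simulate(self, design: np.ndarray) -> float:
        # Stand-in "physics": compliance-like term (lower is better)
        # plus a material-use penalty.
        return float(np.sum(1.0 / design) + 0.5 * np.sum(design))

    def evaluate(self, candidates: list) -> dict:
        # Shared evaluation pipeline: score every candidate the same way.
        scores = [self.simulate(c) for c in candidates]
        return {"best": min(scores), "mean": float(np.mean(scores))}

problem = ToyBeamProblem()
# A trivial "generative" baseline: perturb the dataset's mean design.
mean_design = np.mean([d for d, _ in problem.dataset], axis=0)
rng = np.random.default_rng(1)
candidates = [np.clip(mean_design + 0.05 * rng.standard_normal(8), 0.1, 1.0)
              for _ in range(16)]
report = problem.evaluate(candidates)
print(f"best: {report['best']:.3f}, mean: {report['mean']:.3f}")
```

Because every problem exposes the same three pieces, swapping in a different generative method or a different engineering domain changes only one line of a benchmark script, which is the overhead reduction the paragraph above describes.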
The specific goals of this workshop are:
1. Hands-on experience with a full benchmarking pipeline. Participants will work through the complete research workflow using EngiBench: exploring a curated dataset, implementing a design problem, integrating a GenAI method, running physics-based evaluation, and interpreting results. By the end of the session, participants will have executed a self-contained benchmark study.
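The five steps above can be compressed into a short script on a stand-in problem. Everything in this sketch is illustrative, including the quadratic objective and the deliberately simple "generative model" (a Gaussian fitted to the dataset); it is not EngiBench code, only the shape of the workflow participants will run.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(design):
    # Step 2 stand-in "design problem": quadratic bowl with optimum at 0.5.
    return float(np.sum((design - 0.5) ** 2))

# Step 1: explore a curated dataset of known-good designs.
dataset = rng.normal(0.5, 0.05, size=(64, 4))

# Step 3: integrate a (trivially simple) generative model: fit a Gaussian
# to the dataset and sample new candidate designs from it.
mu, sigma = dataset.mean(axis=0), dataset.std(axis=0)
samples = rng.normal(mu, sigma, size=(16, 4))

# Step 4: run the physics-based evaluation over the generated designs.
scores = np.array([simulate(s) for s in samples])

# Step 5: interpret results against the dataset baseline.
baseline = np.array([simulate(d) for d in dataset])
print(f"generated mean objective: {scores.mean():.4f}")
print(f"dataset   mean objective: {baseline.mean():.4f}")
```

In the workshop, the toy pieces here (the objective, the dataset, the sampler) are replaced by a real simulator, a curated dataset, and a GenAI model, but the benchmark loop keeps this same five-step shape.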
2. Lower the barrier to reproducible AI-for-design research. Many researchers in design computing face significant setup costs when applying ML methods to engineering problems (custom simulators, bespoke datasets, ad-hoc evaluation). We will demonstrate how EngiBench’s standardized API eliminates this overhead, enabling researchers to rapidly prototype and compare approaches.
3. Discuss the broader role of benchmarking infrastructure for GenAI in design. Beyond the hands-on component, we will facilitate a structured discussion on where benchmark frameworks like EngiBench should be deployed next, what new problem domains or evaluation criteria are needed, and how the design computing community can collectively build shared infrastructure for measuring progress in AI-driven design. We aim to identify gaps in current benchmarking practice and opportunities for community-driven contributions.
Figure 1: Overview of the supported engineering domains in EngiBench.
Workshop Format
The workshop is structured in two phases: a guided hands-on tutorial followed by a facilitated discussion. Participants will need a laptop with internet access; all code runs in Google Colab notebooks that we will prepare in advance, so participants can execute them directly with no local simulator installation for the tutorial problems. Alternatively, EngiBench can be installed locally via pip install engibench.
This workshop is intended for graduate students, researchers, and industry practitioners interested in the intersection of artificial intelligence and engineering design. This includes individuals focused on design optimization, core AI/ML methods for design, and applied AI.
We welcome participants from all design backgrounds. While no prior experience with EngiBench is necessary, attendees participating in the hands-on tutorial will find it helpful to have a basic familiarity with foundational machine learning concepts. The hands-on coding exercises will be conducted in Python; however, the EngiBench API is designed to be highly intuitive, so researchers accustomed to MATLAB or other scripting languages can easily follow along. Participants without this coding background are still highly encouraged to join, as they can pair up during the tutorial and contribute valuable domain expertise to the broader discussions on benchmarking methodology.
For the lightning talk slots, we will invite short abstracts (up to 300 words) describing the applicant’s work at the intersection of generative AI and engineering design, with emphasis on benchmarking or evaluation challenges encountered.
All workshop attendees need to register, either as an addition to their DCC'26 conference registration at a cost of €27.50 (€25 + VAT) or, if not registered for the conference, at a cost of €55 (€50 + VAT). Please go to the DCC'26 Registration page to add this workshop to your registration.