3rd AIPerf and Optimization in the LLM World

ICPE 2025 Workshop

OVERVIEW

Artificial Intelligence (AI) has been widely adopted across mainstream domains (e.g., computer vision, natural language processing, and even reliability analysis). However, its use for performance modeling and evaluation remains limited, and its benefits to the performance engineering field are still unclear. AI tools are often employed as black-box models that are not specifically designed for performance evaluation or control, leading to models that require substantial time and data to develop and are not easily understood by domain experts. Researchers and practitioners have recently begun exploring methods such as explainable, white-box AI-based solutions in performance engineering to address these challenges. Unfortunately, the tools, methodologies, and datasets that would enable wider adoption are still lacking.

Moreover, the rapid rise in popularity and adoption of large language models (LLMs) such as ChatGPT has further complicated the problem of performance modeling, optimization, and control. LLM training and serving are expensive: operating ChatGPT is estimated to cost over $700,000 per day, and using GPT-4 for customer service can cost a small business over $21,000 a month. High infrastructure and financial costs, together with the need for specialized talent, make LLM technology inaccessible to most organizations. In addition, the up-front costs include the emissions generated to manufacture the necessary hardware and the cost of running that hardware during training, both when the machines operate at full capacity and when they sit idle. The best estimate of the dynamic computing cost for GPT-3, the model behind the original ChatGPT, is approximately 1,287,000 kWh, or 552 tons of carbon dioxide.

The goal of this workshop is to bridge this gap by promoting the dissemination of research that uses or studies AI techniques for the quantitative analysis of modern ICT systems, such as LLM applications, to optimize performance while reducing energy consumption and cost. To address this urgent need, the workshop brings together researchers from academia and industry to share their experiences and insights in performance engineering within the LLM domain and in AI-based applications more generally.

GOALS

The workshop will consist of invited talks, work-in-progress and fully refereed papers, and a panel. Topics of interest include:

1. Optimizing LLM workloads on traditional and new architectures

2. Hardware-assisted LLM systems

3. LLM optimization at scale

4. Code generation optimization for modern hardware

5. Data-driven model identification for performance evaluation of ICT systems

6. White-box performance modeling

7. Datasets and benchmarks for training and validating AI performance models

8. Explainability and robustness assessment of AI systems in performance engineering

9. AI models for performance anomaly detection, classification, root-cause analysis, and remediation

10. AI models for the automation of performance tasks, including auto-scaling and self-optimization

The target audience:

CALL FOR PAPERS

Submission Guidelines:

A variety of contribution styles for papers is solicited, including:


Please submit papers through HotCRP.

IMPORTANT DATES

PROGRAM COMMITTEE

Kingsum Chow 

Professor, Zhejiang University

kingsum.chow@gmail.com

Emilio Incerto

Assistant Professor, IMT School for Advanced Studies Lucca

emilio.incerto@imtlucca.it

Marin Litoiu

Professor, York University

mlitoiu@yorku.ca

Zhihao Chang

Assistant Professor, Zhejiang University

changzhihao@zju.edu.cn

Anil Rajput

AMD Fellow

Anil_Rajput@yahoo.com

Khun Ban

Cloud Performance Architect, Intel

khunban@gmail.com

Daniele Masti

Postdoctoral Researcher, Gran Sasso Science Institute

daniele.masti@gssi.it

Zhiheng Lyu

University of Waterloo

z63lyu@uwaterloo.ca