3rd AIPerf and Optimization in the LLM World
ICPE 2025 Workshop
OVERVIEW
Artificial Intelligence (AI) has been widely adopted across mainstream domains such as computer vision, natural language processing, and even reliability analysis. However, its use for performance modeling and evaluation remains limited, and its benefits to the performance engineering field are still unclear. AI tools are often employed as black-box models that are not specifically designed for performance evaluation or control, producing models that require substantial time and data to develop and are not easily understood by domain experts. Researchers and practitioners have recently begun exploring explainable/white-box AI-based solutions in performance engineering to address these challenges. Unfortunately, the tools, methodologies, and datasets that would enable wider adoption are still lacking.
Moreover, the rapid rise in popularity and adoption of large language models (LLMs) like ChatGPT has further complicated performance modeling, optimization, and control. Training and operating LLMs is expensive: running ChatGPT is estimated to cost over $700,000 per day, and using GPT-4 for customer service can cost a small business over $21,000 a month. High infrastructure and financial costs, together with the need for specialized talent, make LLM technology inaccessible to most organizations. The up-front costs also include the emissions generated to manufacture the necessary hardware and the cost of running that hardware during training, both when the machines are operating at full capacity and when they are idle. The best estimate of the dynamic computing cost of training GPT-3, the model behind the original ChatGPT, is approximately 1,287,000 kWh, corresponding to roughly 552 tons of carbon dioxide.
The goal of this workshop is to bridge this gap by promoting the dissemination of research that utilizes or studies AI techniques for the quantitative analysis of modern ICT systems, such as LLM applications, to optimize performance while reducing energy consumption and cost. To address this urgent need, the workshop brings together researchers from academia and industry to share their experiences and insights in performance engineering within the LLM domain and AI-based applications in general.
GOALS
The workshop will consist of invited talks, work-in-progress and fully refereed papers, and a panel.
Topics of interest include, but are not limited to:
1. Optimizing LLM workloads on traditional and new architectures
2. Hardware-assisted LLM systems
3. LLM optimization at scale
4. Code generation optimization for modern hardware
5. Data-driven model identification for performance evaluation of ICT systems
6. White-box performance modeling
7. Datasets and benchmarks for training and validating AI performance models
8. Explainability and robustness assessment of AI systems in performance engineering
9. AI models for performance anomaly detection, classification, root cause analysis, and remediation
10. AI models for performance tasks automation, including auto-scaling and self-optimization
Panel Discussion (speakers from the industry and academia)
The target audience includes:
Researchers advocating new software- or hardware-based approaches to optimizing LLM applications
Practitioners who need to solve runtime performance problems in their LLM deployments
Researchers and practitioners interested in performance optimization, modeling, and control of modern ICT applications
CALL FOR PAPERS
Submission Guidelines:
A variety of contribution styles are solicited, including:
Regular research papers (max 8 pages): novel contributions that are fully validated and well positioned in the state of the art
Empirical, experience, reproduction, or case study papers (max 6 pages): work-in-progress, vision papers, new ideas that still need validation, industrial case studies, and experiences (positive or negative) of using AI for performance engineering
Please submit the paper through HotCRP.
IMPORTANT DATES
PROGRAM COMMITTEE
Kingsum Chow
Professor, Zhejiang University
kingsum.chow@gmail.com
Emilio Incerto
Assistant Professor, IMT School for Advanced Studies Lucca
emilio.incerto@imtlucca.it
Marin Litoiu
Professor, York University
mlitoiu@yorku.ca
Zhihao Chang
Assistant Professor, Zhejiang University
changzhihao@zju.edu.cn
Anil Rajput
AMD Fellow
Anil_Rajput@yahoo.com
Khun Ban
Cloud Performance Architect, Intel
khunban@gmail.com
Daniele Masti
PostDoc, Gran Sasso Science Institute
daniele.masti@gssi.it
Zhiheng Lyu
University of Waterloo
z63lyu@uwaterloo.ca