Performance Optimization in the LLM World
ICPE 2024 Workshop
* PerfLLM and AIPerf have been combined and are scheduled for the afternoon of May 7, 2024.
Workshop Agenda
OVERVIEW
The popularity and adoption of large language models (LLMs) like ChatGPT have grown rapidly. LLM pre-training is expensive, and so is serving: ChatGPT is estimated to cost over $700,000 per day to operate, and using GPT-4 to support customer service can cost over $21,000 a month. The high infrastructure and financial costs, coupled with the specialized talent required, make LLM technology inaccessible to most organizations.
The goal of this workshop is to address the urgent need to reduce the energy consumption of LLM applications by bringing together researchers from academia and industry to share their experience and insights in performance engineering in the LLM world.
GOALS
The half-day workshop will consist of invited talks, work-in-progress and fully refereed papers, and a panel.
Presentation topics include, but are not limited to:
1. Optimizing LLM Workloads on Traditional and New Architectures
2. Hardware Assisted LLM Systems
3. LLM Optimization at Scale
4. Code Generation Optimization for Modern Hardware
Panel Discussion (speakers from industry and academia)
The target audience:
Researchers who are advocating new ways of optimizing LLM applications in software or hardware.
Practitioners who need to solve runtime performance problems in their LLM deployments.
CALL FOR PAPERS
Submission Guidelines:
A variety of contribution styles are solicited, including: two-page abstracts; presentations; basic and applied research papers offering novel scientific insights; industrial and experience papers reporting on the teaching or practical application of performance engineering or benchmarks; and work-in-progress/vision papers on ongoing and interesting work.
Please submit your papers through EasyChair at the link below. A guide is also provided to help you navigate to the workshop submission page.
https://easychair.org/conferences/?conf=icpe2024 (A Quick Guide to Easychair Paper Submission)
IMPORTANT DATES
PROGRAM COMMITTEE
Kingsum Chow
Professor, Zhejiang University
kingsum.chow@gmail.com
Anil Rajput
Fellow, AMD
Anil_Rajput@yahoo.com
Khun Ban
Cloud Performance Architect, Intel
khunban@gmail.com
Chuansheng Lu
ByteDance
chuanshenglu@gmail.com
Pranita Maldikar
Performance Engineer, Virtualization, NVIDIA
pranitapmaldikar@gmail.com
Zhiheng Lyu
University of Hong Kong
cogito@connect.hku.hk
Yu Tang
Zhejiang University
y.tang@zju.edu.cn