Performance Optimization in the LLM World

ICPE 2024 Workshop

* PerfLLM and AIPerf have been combined and are scheduled for the afternoon of May 7, 2024.

Workshop Agenda

OVERVIEW

The popularity and adoption of large language models (LLMs) such as ChatGPT have grown rapidly, but the technology is expensive. LLM pre-training is costly, ChatGPT is estimated to cost over $700,000 per day to operate, and using GPT-4 to support customer service can cost over $21,000 a month. These infrastructure and financial costs, coupled with the specialized talent required, make LLM technology inaccessible to most organizations.

The goal of this workshop is to address the urgent need to reduce the energy consumption of LLM applications by bringing together researchers from academia and industry to share their experience and insights in performance engineering in the LLM world.

GOALS

The half-day workshop will consist of invited talks, work-in-progress and fully refereed papers, and a panel discussion.

Presentation topics include, but are not limited to:

1. Optimizing LLM Workloads on Traditional and New Architectures

2. Hardware Assisted LLM Systems

3. LLM Optimization at Scale

4. Code Generation Optimization for Modern Hardware

Panel Discussion (speakers from industry and academia)

Target audience:

Researchers who advocate new ways of optimizing LLM applications in software or hardware.
Practitioners who need to solve runtime performance problems in their LLM deployments.

CALL FOR PAPERS

Submission Guidelines:

A variety of contribution styles are solicited, including: two-page abstracts; presentations; basic and applied research papers offering novel scientific insights; industrial and experience papers reporting on education in, or the practical application of, performance engineering or benchmarks; and work-in-progress or vision papers on ongoing and interesting work.

Please submit your papers through EasyChair. A quick guide is also provided to help you navigate to the workshop submission page.

https://easychair.org/conferences/?conf=icpe2024 (A Quick Guide to EasyChair Paper Submission)

IMPORTANT DATES

PROGRAM COMMITTEE

Kingsum Chow

Professor, Zhejiang University

kingsum.chow@gmail.com

Anil Rajput

Fellow, AMD

Anil_Rajput@yahoo.com

Khun Ban

Cloud Performance Architect, Intel

khunban@gmail.com

Chuansheng Lu

ByteDance

chuanshenglu@gmail.com

Pranita Maldikar

Performance Engineer, Virtualization, NVIDIA

pranitapmaldikar@gmail.com

Zhiheng Lyu

University of Hong Kong

cogito@connect.hku.hk

Yu Tang

Zhejiang University

y.tang@zju.edu.cn
