Basic LDMS Configuration
Deployment of LDMS Monitoring System Using Containers
LDMS Monitoring Infrastructure Load Balancing and Configuration with Maestro
Analysis and Visualization for System and Application Improvement
LDMS Metric Set Decomposition for Storage
Provisioning the LDMS Monitoring Infrastructure on Large-scale Systems
Distributing LDMS Metric Data Using Avro on the Kafka Bus
7:30 - 8:00 am
Light breakfast – Baked goods, fruits, and coffee
8:00 - 8:30 am
Check-In (please have your e-ticket with you for check-in)
8:30 - 9:30 am
Welcome/Keynote
9:30 - 10:00 am
Lightning Talk: ANL Site Status -- Ben Lenard
Lightning Talk: NCSA Site Status -- Mike Showerman
10:00 - 10:30 am
Break
10:30 am - 12:00 pm
User Presentation: Kokkos Tools Sampler for Data Order Reduction in HPC Systems Monitoring -- Vivek Kale, Vanessa Surjadidjaja, Christian Trott and Jim Brandt
User Presentation: Using Darshan-LDMS for Analyzing and Diagnosing I/O Performance in HPC Systems -- Ana Solorzano, Sara Walton, Ben Schwaller, Jim Brandt, Devesh Tiwari and Rohan Basu Roy
Lightning Talk: Process Event Monitoring and Application Identification with LDMS -- Ben Allan
12:00 - 1:00 pm
Lunch
1:00 - 2:30 pm
Tutorial: Basic LDMS Configuration -- Sara Walton
2:30 - 3:00 pm
Break
3:00 - 4:30 pm
Tutorial: LDMS Monitoring Infrastructure Load Balancing and Configuration with Maestro -- Nick Tucker
4:30 - 5:00 pm
End-of-Day Group Discussions
7:30 - 8:30 am
Light breakfast – Baked goods, fruits, and coffee
8:30 - 10:00 am
Demo: Analysis and Visualization for System and Application Improvement -- Ben Schwaller
10:00 - 10:30 am
Break
10:30 am - 12:00 pm
Tutorial: Deployment of LDMS Monitoring System Using Containers -- Narate Taerat
12:00 - 1:00 pm
Lunch
1:00 - 2:30 pm
User Presentation: Towards Performance Anomaly Diagnosis in Production HPC Systems using Machine Learning -- Burak Aksar, Efe Sencan, Ben Schwaller, Vitus Leung, Jim Brandt, Brian Kulis, Manuel Egele and Ayse Coskun
User Presentation: Lessons from Monitoring AI-workload Driven Supercomputers -- Devesh Tiwari
User Presentation: Dynamic Time Warping (DTW) and HPC Monitoring Data Clustering and Anomaly Detection -- Ben Schwaller, Solji Shin and Abdullah Mueen
2:30 - 3:00 pm
Break
3:00 - 4:00 pm
Tutorial: Provisioning the LDMS Monitoring Infrastructure on Large-scale Systems -- Jim Brandt
4:00 - 4:30 pm
LDMS-UG Introduction -- Jim Brandt
4:30 - 5:00 pm
End-of-Day Group Discussions
7:30 - 8:00 am
Light breakfast – Baked goods, fruits, and coffee
8:00 - 8:30 am
Lightning Talk: LRZ's HPC Monitoring -- Michael Ott
8:30 - 10:00 am
Tutorial: LDMS Metric Set Decomposition for Storage -- Nichamon Naksinehaboon
10:00 - 10:30 am
Break
10:30 - 11:00 am
User Presentation: LLNL Site Update -- Chris Morrone
11:00 am - 12:00 pm
Demo: Distributing LDMS Metric Data Using Avro on the Kafka Bus -- Tom Tucker
12:00 - 1:00 pm
Lunch
1:00 - 2:30 pm
Round-Table Development Discussions
2:30 - 3:00 pm
Break
3:00 - 5:00 pm
Open Interaction & Troubleshooting/Discussion Sessions/Breakouts