Ansible Configuration with LDMS
LDMS Basic Configuration
LDMS Monitoring Infrastructure Load Balancing and Configuration with Maestro
LDMS with Containers
Proposed Interoperable Analysis Framework for LDMS Data
LDMS: A Year in Review and Unveiling Confirmed Upcoming Features
Data Analysis of Omni Path (OPA) HPC Fabric Manager Output
PAPI Profiling for Performant Kokkos Applications on DOE HPC Systems
Updates on the Caliper, Darshan, and Kokkos Connectors
Monitoring for Performance in the Cloud
Analyzing and Classifying HPC Application Performance Using Application Heartbeats
Towards Performance Anomaly Detection in Production HPC Systems using Machine Learning
Darshan-LDMS Integrator: Low-Latency Monitoring of I/O Events During Runtime
Community Insights on Automated, Data-Driven Operations (from Dagstuhl 23171)
GPU Performance Bottleneck Diagnosis Using Machine Learning
This panel discussion examines how system and application monitoring data can be used together to troubleshoot HPC system and application performance issues. Experts will share insights on:
Data-Driven Troubleshooting: Identification and resolution of performance bottlenecks using various approaches and data sources
Analytic Techniques: Analytic methods for extracting actionable insights from monitoring data
Data Fidelity for Optimization: Data fidelity requirements for different data types and identification of potential gaps
Knowledge Dissemination: Effective communication of analysis results to system administrators, users, and software developers
The Role of AI: Potential uses of artificial intelligence in streamlining data processing and automating troubleshooting tasks
Focus groups offer a collaborative environment where you can:
Explore LDMS topics: Delve into a variety of engaging topics relevant to your needs (details available in the sign-up form)
Gain practical insights: Learn from experts and peers, and share your own experiences to identify solutions for real-world LDMS use cases
Don't Miss Out!
Sign Up Today: Select up to 4 focus groups (ranked by preference) and briefly explain your interest and desired outcomes using the online form. This will help us schedule the focus groups.
Deadline: Sign up by Monday, June 3rd, 2024 (spots are limited!)
Link to Sign-Up Form: https://forms.gle/gtD 3J2 GUnkZ 9n8
Focus Group Descriptions:
Provisioning LDMS (Discussion) -- Transformed to an open discussion focusing on the topic
Target Audience: This group is ideal for individuals looking to understand the resources needed to set up LDMS monitoring.
Desired Outcome: Through facilitated discussion and knowledge sharing, participants will gain an understanding of the factors that influence resource requirements for LDMS deployment. The session will equip participants with the knowledge and tools to estimate resource needs for their specific LDMS setup, even if precise requirements are initially unclear.
Developing Programs that Utilize LDMS Stream APIs (Discussion/Hands-on Implementation) -- Merged with Sampler Plugin Implementation
Target Audience: This group is suitable for those planning to implement executables or LDMSD plugins that publish or receive LDMS stream data.
Desired Outcome: Participants gain hands-on experience by building or troubleshooting their own software components that interact with LDMS Stream APIs (publish/receive data). Facilitators will be readily available to provide guidance and troubleshoot any coding challenges participants encounter.
Creating Maestro Configuration Files (Discussion/Hands-on Implementation) -- Cancelled
Target Audience: This group is for individuals who are interested in using Maestro for LDMS configuration.
Desired Outcome: Participants create Maestro configuration files for their specific LDMS setups.
Sampler Plugin Implementation (Discussion/Hands-on Implementation) -- Merged with Utilizing LDMS Stream APIs
Target Audience: This group is ideal for anyone interested in developing new sampler plugins to collect data not covered by existing sampler plugins.
Desired Outcome: Participants develop custom sampler plugins to collect data for LDMS. Facilitators will provide guidance and troubleshooting.
Decomposition Configuration and Deployment (Discussion/Hands-on Writing Configuration Files) -- Cancelled
Target Audience: This group is suitable for those wanting to use LDMS’s data decomposition to customize how LDMS sets are stored.
Desired Outcome: Participants explore data decomposition strategies and draft configurations for their LDMS environment.
Distributed SOS Database Querying (Discussion) -- Cancelled
Target Audience: This group is suitable for individuals who want to explore querying data from SOS or dSOS databases for analysis and visualization.
Desired Outcome: Participants will learn effective approaches for querying distributed SOS/dSOS databases to retrieve LDMS data for further analysis and visualization.
Using LDMS Data to Troubleshoot Applications or System Performance (Discussion) -- Transformed to an open discussion focusing on the topic
Target Audience: This group is for those who want to use monitoring data to establish a baseline, identify anomalies, and find performance bottlenecks.
Desired Outcome: Participants will explore data analysis techniques for establishing baselines and identifying anomalies. This discussion will equip them to approach their individual troubleshooting challenges with a refined understanding of how to use LDMS data more effectively.