1st EuroHPC malleability hackathon
Research on dynamic resource utilization for "traditional HPC workload" is one of the common research topics across several EuroHPC research projects (ADMIRE, DEEP-SEA, REGALE, TIME-X).
This hackathon will bring EuroHPC developers (and special invited guests) together for a joint effort around this topic.
Where? Université Grenoble Alpes (a.k.a. the green capital of the Alps)
Registration deadline: December 20th (extended deadline: January 6th 2023)
When? 23rd - 27th January 2023
700 Av. Centrale, 38400 Saint-Martin-d'Hères, France
23rd January: Room 106
24th January: Room 106
25th January: Room 106
26th January: Room 406 (Attention! Different room!)
27th January: Room 106
Tuesday 24th January
Place: l'Épicurian, 1 Pl. aux Herbes, 38000 Grenoble
Deadline is December 20th.
We would like to make this a strongly collaborative hacking event. Participation is therefore only possible if you also contribute by hacking together with others. To make the best use of the limited time, we would like to gather hacking modules (see below) and their collaborators in advance.
There are basically two possibilities to participate:
Joining an existing module: Please describe in your email which module you would like to contribute to and what this contribution would look like (what you can do for others and what others can do for you).
Suggesting a new module: If you would like to work on a new module, please provide a brief description of it.
For participation, please use the registration website.
Main hacking modules
These modules will be updated on a rolling basis
Module #PMIx-SLURM: PMIx <=> SLURM dynamic resources with high priority queue
Brief description: Supporting dynamic resource utilization with PMIx using the SLURM high priority queue.
Hackers: Sergio Iserte (SLURM), Dominik Huber (PMIx), Isaías Comprés (SLURM), Martin Schreiber (Numerics)
Module #DMRLib-MPISessions: DMRLib <=> MPI Sessions
Brief description: Make DMRLib also use MPI Sessions.
Hackers: Sergio Iserte (DMRLib developer), Dominik Huber (PMIx), Martin Schreiber (Numerics)
Develop a first prototype of a software support layer/library for applications
Supporting handling of dynamic resources with MPI Sessions, SLURM, Flux, etc.
Hackers: Sergio Iserte (DMRLib developer), Martin Schreiber (Numerics)
Module #JobAllocationGrammar [DONE]
Experiment with a scale-invariant mapping grammar: usage and applications (see paper **draft** here https://www.dropbox.com/s/ripxypicbpkdg3u/RCOMP.pdf?dl=0 ). The goal is to discuss potential applications and integrations of such a syntax while implementing them.
Hackers: Jean-Baptiste Besnard (ParaTools SAS), Martin Schreiber (Numerics), Isaías Comprés (Slurm expert)
Job tracking (resource blaming): store the job-to-resource correspondence in traces/logs
Job clustering: how to match multiple instances of jobs, including at various scales (go beyond the argv array)
Look at how to leverage the rich MPI tools interface (MPI-T) to expose portability and collect some application info
Hackers: Jean-Baptiste Besnard (ParaTools SAS), Isaías Comprés (Slurm expert), Pierre-François Dutot (REGALE)
Module: #DMRlib-reconf: Extend DMRlib with novel reconfiguration techniques
Implement new spawning methods in DMRlib
Adopt new reconfiguration policies for DMRlib
Hackers: Sergio Iserte (DMRLib developer), Iker Martín (MPI Developer), Martin Schreiber (Numerics)
Module: #Collocation: Collocation of HPC and Big Data workloads
Implement or improve Bebida collocation tool support for OAR and Slurm
Add support for execution of Big Data workloads with deadline guarantees
Hackers: Michael Mercier (Bebida developer), Adrien Faure (OAR developer), Olivier Richard (OAR developer), Pierre-François Dutot (REGALE)
Module: #BatSim/Sched: Simulation of jobs with dynamic resources
Explore ways to use BatSim for running jobs with dynamic resources
Explore ways to incorporate experimental schedulers
Hackers: Adrien FAURE (REGALE), Martin SCHREIBER (TIME-X)
Module #[Please provide a module identifier during the registration] Brief description: [Please provide a description of your module during the registration] Hackers: [Please provide a potential collaborator for this topic]
Preliminary; the schedule will be updated continuously.
23rd January 2023 (Monday):
9:30 - 12: Short presentations
Presentation by Dominik Huber
Presentation by Isaías Comprés
Presentation by Jean-Baptiste Besnard
13-17: Hacking sessions
24th January 2023 (Tuesday):
Presentation by Sergio Iserte
13-17: Hacking sessions
25th January 2023 (Wednesday):
9-12: Hacking sessions & Talks
Presentation by Michael Mercier
13-17: Hacking sessions & Talks
Presentation by Adrien FAURE
26th January 2023 (Thursday):
9-12: Hacking sessions
13-17: Hacking sessions & Talk
Tutorial by Adrien FAURE about NIX
27th January 2023 (Friday):
9-12: Final presentations and discussions
We plan to have mini-talks given by the participants on the first day as well as mini-presentations to summarize the hacking efforts for each module at the end of this meeting.
Jean-Baptiste Besnard (ParaTools SAS, ADMIRE)
Alberto Cascajo (Universidad Carlos III de Madrid, ADMIRE)
Isaías Comprés (Technical University of Munich, DEEP-SEA)
Pierre-François Dutot (LIG, REGALE) [missing hacking project]
Adrien FAURE (LIG/UGA, REGALE)
Dominik Huber (Technical University of Munich, TIME-X)
Sergio Iserte (BSC, DEEP-SEA)
Iker Martín (BSC, DEEP-SEA)
Michael Mercier (RYAX Technologies, REGALE)
Daniel Milroy (LLNL, external expert) [Keynote speaker, remote talk on Flux, participation via Zoom in selected hacking groups]
Olivier Richard (LIG/UGA, REGALE)
Martin Schreiber (Université Grenoble Alpes / Technical Univ. Munich, TIME-X)
Martin Schreiber: martin.schreiber AT univ-grenoble-alpes.fr
Pierre-François Dutot: pierre-francois.dutot AT univ-grenoble-alpes.fr