Title: Parallelized Harmony Search Algorithm for High-Dimensional Optimization
Tech Stack: C++, MPI, OpenMP, PBS Professional (HPC job scheduler)
Platform: University of Trento HPC Cluster — 126 nodes, 6,092 CPU cores, 53 TB RAM
Objective: Accelerate a nature-inspired optimization algorithm using distributed and shared-memory parallel computing
Benchmark Functions: Rosenbrock (unimodal) and Michalewicz (multimodal)
Execution Modes: Fixed-iteration (task-based) and target-fitness (threshold-based)
A metaheuristic optimization algorithm inspired by how musicians improvise to find a perfect harmony
Maintains a Harmony Memory (HM) — a pool of candidate solutions that evolves over iterations
Three core operators drive the search:
Memory Consideration (HMCR): with probability HMCR (the harmony memory considering rate), draw a value from an existing solution in memory
Pitch Adjustment (PAR): with probability PAR (the pitch adjusting rate), fine-tune a memory-drawn value by a small random amount
Randomization: otherwise (probability 1 − HMCR), inject a fresh random value to explore new regions of the search space
Iteratively replaces the worst solution in memory when a better candidate is found
Widely applied to structural design, scheduling, clustering, feature selection, and more
Phase 1 — Sequential Baseline: Built a clean, modular C++ implementation of the Harmony Search algorithm to establish reference performance
Phase 2 — MPI Parallelization (Distributed Memory): Implemented three strategies (PHS1, PHS2, PHS3) using inter-process communication across cluster nodes
Phase 3 — OpenMP Parallelization (Shared Memory): Implemented corresponding strategies using multithreading within a single node
Inspired by Hong et al.'s three parallelization paradigms, originally developed for the Asymmetric Traveling Salesman Problem and adapted here to continuous optimization over real-valued vectors.
Speedup: Both MPI and OpenMP implementations achieved significant reductions in execution time relative to the sequential baseline
MPI-PHS3 (no communication) consistently delivered strong performance: minimal overhead, maximum parallel exploration
MPI-PHS2 showed advantages on larger, higher-dimensional problems where aggressive exploitation of elite solutions pays off
OpenMP-PHS3-Batched offered a practical compromise: nearly independent exploration with lightweight periodic knowledge sharing
Scalability Limits: Communication overhead and synchronization costs became dominant at higher process/thread counts, flattening the speedup curve
Solution Quality: Parallel variants matched, and sometimes exceeded, the sequential baseline's solution quality, thanks to the added diversity of multiple independent or semi-independent search trajectories