ML-Based Procedural Sound Design

This thesis project explores how machine learning can support procedural sound design for explosion sound effects. I built a system that uses CLAP audio embeddings to analyze explosion recordings, organize them into functional components, and recombine ground, shock, and roar segments in Max/MSP.

The final Max/MSP patch lets users generate new explosion sounds by adjusting controls for intensity, distance, duration, and cohesion. Instead of playing back a fixed recording, the system selects and schedules individual sound components from a metadata index, allowing each generated explosion to respond to user-defined parameters.
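As a rough illustration of selecting and scheduling components from a metadata index, here is a minimal Python sketch. It is not the patch's actual logic, which runs in JS inside Max/MSP. The field names (`label`, `path`, `pca`), the file names, the mapping from intensity to a target point in PCA space, and the even onset spacing are all assumptions made for this example.

```python
import json
import math

# Hypothetical index entries; the real index's field names and values may differ.
INDEX_JSON = """
[
  {"label": "ground", "path": "ground_a.wav", "pca": [0.1, 0.0]},
  {"label": "ground", "path": "ground_b.wav", "pca": [0.9, 0.8]},
  {"label": "shock",  "path": "shock_a.wav",  "pca": [0.2, 0.1]},
  {"label": "shock",  "path": "shock_b.wav",  "pca": [0.8, 0.9]},
  {"label": "roar",   "path": "roar_a.wav",   "pca": [0.0, 0.2]},
  {"label": "roar",   "path": "roar_b.wav",   "pca": [1.0, 0.7]}
]
"""

LABELS = ("ground", "shock", "roar")


def select_components(index, intensity):
    """Pick, per label, the segment nearest to an intensity-derived target."""
    # Assumed mapping: intensity in [0, 1] moves the target along the diagonal.
    target = [intensity, intensity]
    chosen = {}
    for label in LABELS:
        candidates = [e for e in index if e["label"] == label]
        chosen[label] = min(candidates, key=lambda e: math.dist(e["pca"], target))
    return chosen


def schedule(chosen, duration):
    """Spread component onsets evenly across the first half of the duration."""
    # Assumed layout: ground, then shock, then roar, with equal gaps.
    step = duration / (2 * len(LABELS))
    return [(round(i * step, 3), chosen[label]["path"]) for i, label in enumerate(LABELS)]


index = json.loads(INDEX_JSON)
events = schedule(select_components(index, intensity=0.9), duration=3.0)
for onset, path in events:
    print(f"{onset:.1f}s  {path}")
```

Each call returns a list of (onset, file path) pairs, so changing `intensity` or `duration` produces a different set of scheduled components rather than a fixed recording.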

Tools: Max/MSP, Python, CLAP, PCA, JSON, audio embeddings, procedural audio

GitHub Repo

View Thesis PDF

Watch Demo

System Workflow

Project Highlights

Built a real-time Max/MSP synthesis system for parameter-controlled explosion generation
Used CLAP audio embeddings to represent timbral similarity between explosion components
Reduced embeddings with PCA for efficient storage and real-time retrieval
Created a JSON metadata index containing segment labels, file paths, timestamps, and PCA features
Designed JS-based selection logic for choosing ground, shock, and roar components based on user controls
Evaluated embedding behavior using nearest-neighbor accuracy, confusion matrices, cosine-distance distributions, and K-means clustering
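The nearest-neighbor accuracy check listed above can be sketched in a few lines of Python. This is an illustrative version only: it assumes a leave-one-out protocol with cosine distance, and the vectors are made-up 2-D stand-ins for PCA-reduced embeddings, not data from the thesis.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_neighbor_accuracy(vectors, labels):
    """Leave-one-out 1-NN accuracy: each item is classified by its closest other item."""
    correct = 0
    for i, query in enumerate(vectors):
        nearest = min(
            (j for j in range(len(vectors)) if j != i),
            key=lambda j: cosine_distance(query, vectors[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(vectors)


# Toy vectors standing in for PCA-reduced embeddings (made-up values).
vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [0.7, 0.7], [0.6, 0.8]]
labels = ["ground", "ground", "shock", "shock", "roar", "roar"]
accuracy = nearest_neighbor_accuracy(vectors, labels)
print(f"leave-one-out 1-NN accuracy: {accuracy:.2f}")
```

A high score suggests that segments with the same label sit close together in the embedding space. Tallying each (true label, neighbor label) pair from the same loop would give a confusion matrix.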
