Any auto-encoding deep learning model has a latent space of input representations at its heart. While this space is generally very high-dimensional and therefore difficult to interpret, projection methods are commonly used to reduce its dimensionality and facilitate a closer study of its structure. Inducing structure into the latent space during training is another active area of research that helps make the latent space more interpretable.
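As a rough illustration of the projection step, here is a minimal PCA sketch in plain NumPy. The latent dimensionality (512) and the use of PCA are assumptions for illustration; the actual project may have used a different reduction method.

```python
import numpy as np

def project_latents(z, n_components=2):
    """Project high-dimensional latent vectors down to n_components via PCA.

    Implemented with an SVD of the centered data matrix; the rows of vt
    are the principal axes, ordered by explained variance.
    """
    z_centered = z - z.mean(axis=0)
    _, _, vt = np.linalg.svd(z_centered, full_matrices=False)
    return z_centered @ vt[:n_components].T

# Stand-in for a batch of latent vectors (hypothetical 512-dim latents)
rng = np.random.default_rng(0)
latents = rng.normal(size=(100, 512))
coords = project_latents(latents)
print(coords.shape)  # (100, 2)
```

Each row of `coords` can then be used directly as a 2D (or, with `n_components=3`, a 3D) position in the virtual environment.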
This project, carried out as a hack at the AIxMusic hackathon at the Ars Electronica festival 2020, aimed to create a virtual spatial environment in which every point was mapped to a point in a dimension-reduced latent space of Google Magenta's MusicVAE model, trained on fixed-length note sequences.
As the user moved around the space and clicked at different points, the corresponding melodies were triggered; points closer together in the space triggered similar-sounding melodies. Audio playback was also spatialized, so the sound faded as the user moved farther from a chosen point.
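The two interaction mechanics above (picking the nearest mapped melody and fading its playback with distance) could be sketched as follows. The function names and the inverse-distance rolloff law are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def nearest_latent(click_pos, points):
    """Index of the mapped latent point closest to the user's click.

    points: (N, D) array of positions in the virtual space,
    one per latent vector / melody.
    """
    distances = np.linalg.norm(points - np.asarray(click_pos), axis=1)
    return int(np.argmin(distances))

def playback_gain(listener_pos, source_pos, rolloff=1.0):
    """Simple inverse-distance attenuation (hypothetical rolloff law):
    gain is 1.0 at the source and falls off as the listener moves away.
    """
    d = np.linalg.norm(np.asarray(listener_pos) - np.asarray(source_pos))
    return 1.0 / (1.0 + rolloff * d)

# Example: two mapped points; a click near the origin selects the first one
points = np.array([[0.0, 0.0], [5.0, 5.0]])
idx = nearest_latent([0.1, 0.1], points)  # -> 0
```

In the actual demo the gain would be recomputed continuously as the user moves, and the selected index would be decoded back into a melody by the MusicVAE decoder.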
Watch the final demo in the video below!
Collaborators: Damian T. Dziwis, Eric Thalhammer, Adrien Bitton
Point-cloud credits: "Krzesło Kantora - a tree" by auriea (auriea.art), CC Attribution (link)