This was a course project for CS 224D: Deep Learning for Natural Language Processing, taught by Richard Socher in Spring Quarter 2015. We compare the performance of two types of recurrent neural networks (RNNs) on the task of algorithmic music generation, using audio waveforms as input (rather than the standard MIDI representation). In particular, we focus on RNNs with sophisticated gating mechanisms: the Long Short-Term Memory (LSTM) network and the more recently introduced Gated Recurrent Unit (GRU). Our results indicate that the outputs generated by the LSTM network were significantly more musically plausible than those of the GRU.
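The two gated architectures compared here differ mainly in their per-step update equations. The sketch below is a minimal, hypothetical NumPy illustration (not the project's actual code): the LSTM keeps a separate cell state controlled by input, forget, and output gates, while the GRU folds its memory into the hidden state using only update and reset gates. All names (`lstm_step`, `gru_step`, the parameter keys) are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step. p maps hypothetical names (Wi, Ui, bi, ...) to the
    weight matrices and biases for each gate plus the cell candidate."""
    i = sigmoid(p["Wi"] @ x + p["Ui"] @ h_prev + p["bi"])   # input gate
    f = sigmoid(p["Wf"] @ x + p["Uf"] @ h_prev + p["bf"])   # forget gate
    o = sigmoid(p["Wo"] @ x + p["Uo"] @ h_prev + p["bo"])   # output gate
    g = np.tanh(p["Wc"] @ x + p["Uc"] @ h_prev + p["bc"])   # candidate cell
    c = f * c_prev + i * g      # cell state mixes old memory with new input
    h = o * np.tanh(c)          # hidden state is a gated view of the cell
    return h, c

def gru_step(x, h_prev, p):
    """One GRU step: two gates, no separate cell state."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])   # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])   # reset gate
    h_tilde = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_tilde  # interpolate old vs. candidate

# Toy sizes and randomly initialized parameters for demonstration only.
rng = np.random.default_rng(0)
D, H = 4, 3  # input and hidden dimensions

def make_params(gate_names):
    p = {}
    for g in gate_names:
        p["W" + g] = 0.1 * rng.standard_normal((H, D))
        p["U" + g] = 0.1 * rng.standard_normal((H, H))
        p["b" + g] = np.zeros(H)
    return p

x = rng.standard_normal(D)
h0, c0 = np.zeros(H), np.zeros(H)
h1, c1 = lstm_step(x, h0, c0, make_params("ifoc"))
h1_gru = gru_step(x, h0, make_params("zrh"))
```

In a waveform-generation setting, `x` would be a feature vector derived from the audio at one timestep, and the cell would be unrolled over the sequence; the key structural difference on display is that the GRU has no `c` state and one fewer gate.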
Here is a YouTube video demonstrating our results with the LSTM network (trained on Madeon's music): https://www.youtube.com/watch?v=0VTI1BBLydE
Project Members: Aran Nayebi, Matt Vitelli
Project Writeup (6-page paper)