Our first goal in this project was to denoise the song to be classified, so that the later classification algorithms would run faster and more accurately. We did this by implementing wavelet denoising.
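To illustrate the idea, here is a minimal wavelet-denoising sketch in pure NumPy: a one-level Haar transform with soft thresholding of the detail coefficients. This is a simplified stand-in, not the project's actual implementation; the wavelet family, decomposition depth, and threshold value are assumptions for the example.

```python
import numpy as np

def haar_denoise(x, threshold):
    """One-level Haar wavelet denoising via soft thresholding (sketch)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % 2                    # trim to even length
    a = (x[:n:2] + x[1:n:2]) / np.sqrt(2)      # approximation coefficients
    d = (x[:n:2] - x[1:n:2]) / np.sqrt(2)      # detail coefficients
    # soft threshold: shrink small (mostly-noise) detail coefficients to zero
    d = np.sign(d) * np.maximum(np.abs(d) - threshold, 0.0)
    y = np.empty(n)
    y[0::2] = (a + d) / np.sqrt(2)             # inverse Haar transform
    y[1::2] = (a - d) / np.sqrt(2)
    return y

# synthetic example: a clean signal plus Gaussian noise
rng = np.random.default_rng(0)
clean = np.ones(1024)
noisy = clean + 0.1 * rng.standard_normal(1024)
denoised = haar_denoise(noisy, threshold=0.2)
```

In practice a deeper decomposition with a smoother wavelet (e.g. Daubechies, via a library such as PyWavelets) would be used, but the shrink-the-details step shown here is the core of the method.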
Next, we wanted to produce code that could output the tempo of the song. We did this by modifying a beat detection algorithm that estimates tempo by filtering the signal, convolving it with time-domain comb filters at candidate tempos, and choosing the tempo that yields the highest sound energy.
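The comb-filter step can be sketched as follows, assuming the signal has already been reduced to an onset-energy envelope; the frame rate, BPM search range, and the synthetic test envelope are illustrative assumptions, not the project's parameters.

```python
import numpy as np

def estimate_tempo(envelope, frame_rate, bpm_min=60, bpm_max=180):
    """Return the BPM whose comb filter produces the highest output energy."""
    best_bpm, best_energy = bpm_min, -1.0
    for bpm in range(bpm_min, bpm_max + 1):
        period = int(round(frame_rate * 60.0 / bpm))  # frames per beat
        comb = np.zeros(len(envelope))
        comb[::period] = 1.0                          # impulse train at this tempo
        energy = float(np.sum(np.convolve(envelope, comb) ** 2))
        if energy > best_energy:
            best_bpm, best_energy = bpm, energy
    return best_bpm

# synthetic onset envelope at 100 frames/s: a pulse every 0.5 s -> 120 BPM
frame_rate = 100
env = np.zeros(800)
env[::50] = 1.0
tempo = estimate_tempo(env, frame_rate)
```

When the comb's impulse spacing matches the beat period, the convolution output is highly peaked, so its energy is maximized at the true tempo (up to the rounding of the period to whole frames).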
Another classification method we wanted to implement was key signature classification. We had two separate ideas for how to accomplish this: an FFT method and a cross-correlation method. Both methods aim to extract the prominent notes in a song and match them to a key signature.
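The FFT variant can be sketched as: fold spectral energy into a 12-bin pitch-class (chroma) vector, then score each candidate key by correlating the chroma against a rotated key template. The template used here is the standard Krumhansl-Schmuckler major-key profile, which is one common choice and not necessarily the one the project used; the frequency range and synthetic triad are also assumptions for the example.

```python
import numpy as np

# Krumhansl-Schmuckler major-key profile (tonic first) -- a standard template.
MAJOR_PROFILE = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                          2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]

def fft_key_estimate(signal, sample_rate):
    """FFT method: build a chroma vector from spectral magnitudes, then
    pick the key whose rotated profile correlates best with it."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    chroma = np.zeros(12)
    for f, mag in zip(freqs, spectrum):
        if 60.0 <= f <= 2000.0:                # melodic frequency range
            # pitch class relative to A440 (A = 9, so C = 0)
            pc = (int(round(12 * np.log2(f / 440.0))) + 9) % 12
            chroma[pc] += mag
    scores = [np.dot(chroma, np.roll(MAJOR_PROFILE, r)) for r in range(12)]
    return NOTE_NAMES[int(np.argmax(scores))] + " major"

# synthetic C major triad: C4, E4, G4 sine tones
sr = 8000
t = np.arange(0, 2.0, 1.0 / sr)
triad = sum(np.sin(2 * np.pi * f * t) for f in (261.63, 329.63, 392.0))
key = fft_key_estimate(triad, sr)
```

The cross-correlation method differs mainly in how the note content is extracted from the waveform; the template-matching step at the end is shared by both ideas.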
Finally, we used a "bag-of-tones" machine learning method based on Mel-frequency cepstral coefficients (MFCCs) to classify songs by genre. We trained and tested the algorithm on three genres: folk, electronic, and classical.
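The bag-of-tones stage can be sketched as follows: quantize each song's MFCC frames against a learned codebook (here, plain k-means) and describe the song by its histogram of codeword counts, then compare histograms to classify. Random Gaussian vectors stand in for real MFCC frames, and the codebook size and "genre" distributions are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(frames, k, iters=20):
    """Plain k-means; returns a codebook of k centroids."""
    centroids = frames[:: len(frames) // k][:k].copy()  # spread-out init
    for _ in range(iters):
        labels = np.argmin(((frames[:, None] - centroids) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = frames[labels == j].mean(axis=0)
    return centroids

def bag_of_tones(frames, codebook):
    """Normalized histogram of nearest-codeword counts for one song."""
    labels = np.argmin(((frames[:, None] - codebook) ** 2).sum(-1), axis=1)
    hist = np.bincount(labels, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

# stand-in 'MFCC' frames: two synthetic genres with different cluster means
genre_a = rng.normal(0.0, 1.0, size=(300, 13))
genre_b = rng.normal(3.0, 1.0, size=(300, 13))
codebook = kmeans(np.vstack([genre_a, genre_b]), k=8)
hist_a = bag_of_tones(genre_a, codebook)
hist_b = bag_of_tones(genre_b, codebook)

# an unlabeled clip drawn from genre A should match hist_a more closely
clip = rng.normal(0.0, 1.0, size=(100, 13))
hist_clip = bag_of_tones(clip, codebook)
dist_a = float(np.linalg.norm(hist_clip - hist_a))
dist_b = float(np.linalg.norm(hist_clip - hist_b))
```

In the real pipeline the frames would be MFCCs extracted from audio (e.g. with a library such as librosa), and the final comparison could be any classifier over the histograms rather than the nearest-histogram rule shown here.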
We tested each of the algorithms separately and then included key signature classification as an additional feature for the MFCC-based classifier. We considered adding beat detection as a feature as well, but decided that the tempo of a song would have little to no correlation with the genres we classified.
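One plausible way to add the key signature as a feature, shown purely as an illustration (the project's actual feature encoding is not specified here), is to append a one-hot key vector to the bag-of-tones histogram:

```python
import numpy as np

def combine_features(bot_histogram, key_index, n_keys=12):
    """Append a one-hot key-signature vector to a bag-of-tones histogram
    (hypothetical encoding, for illustration only)."""
    key_onehot = np.zeros(n_keys)
    key_onehot[key_index] = 1.0
    return np.concatenate([np.asarray(bot_histogram, dtype=float), key_onehot])

# e.g. a uniform 8-bin histogram for a song estimated to be in C (index 0)
features = combine_features(np.full(8, 1 / 8), key_index=0)
```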