FFT Method

Step 1: Load, Process, and Trim the Audio

Step 2: Generate the Cosine Signals & the Impulses

Step 3: Take the Inner Product to Generate the Similarity Coefficient

Overview of the Method:

In the first of our two methods on key signature identification, we decomposed whatever song we were analyzing over a 30 second interval into its frequency components using an FFT. We then generated a similar FFT for each of the 12 major scale. Last, we take the inner product of the two FFTs to obtain a "similarity coefficient" that would relate the frequency values of each major key to the frequency components of the song, and theoretically the highest of these coefficients would correspond to the most probable major key of the song.

Step 1: Load, Process, and Trim the Audio

The first step of the algorithm was to load in our file, and cut out a 30 second long clip of the song for processing, and took the FFT of the audio file. We also used a threshold cutoff in the algorithm to speed up processing and improve the results of the algorithm by reducing contributions from low-magnitude frequency components.

Step 2: Generate the Cosine Signals & the Impulses

The next step was to generate the impulses at each of the frequency components of each major scale. To do this, we hard-coded the frequencies of each note in the 0th octave, and used the principle of equal temperament tuning to generate the rest of the frequencies. This means that for each note in octave "i", the note in the octave "i + 1" is double the frequency of the note in octave "i". We used this principle to generate cosine waves at each of these frequencies, and then summed the appropriate cosines to generate mixed signals of all the frequency components of each major scale. We then took the FFT of each of these summed cosine signals to obtain a series of impulses in the frequency domain at the appropriate frequencies.

Step 3: Take the Inner Product to Generate the Similarity Coefficient

After we have both of the FFTs, then we take the inner product to generate the projection of the song onto the basis of the vectors containing the key signatures' frequency values to obtain how similar the song's frequency components are to each of the key signature. We then weigh each of these coefficients and return the maximum coefficient as the most likely key signature.

Drawbacks:

While the theory seems solid, this algorithm does not work very well when implemented. The drawbacks and results of this algorithm are discussed further in the Results page.

Link to Key Signature Results

Page updated

Report abuse