Updates

Week #1 (5/8/20)

This first week was spent on research and preliminary work. In particular, I learned that the frequency spacing between adjacent piano keys increases as one goes up in frequency. My original plan was to restrict the input frequencies to two octaves, since two octaves is about as much as fits on the GUI. Because higher keys are spaced further apart, I might get better performance in spectral analysis by choosing higher octaves. I will start the GUI at middle C since it is very commonly used, but I can move up in octaves if I need better accuracy. Another useful finding is that there is a formula for computing the frequency of a key from its key number, which removes the need to hard-code frequencies.
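The update does not say which formula is meant; the usual one for a standard 88-key piano in equal temperament is f(n) = 440 * 2^((n - 49) / 12), where key 49 is A4 tuned to 440 Hz. A minimal sketch under that assumption (the class and method names are made up for illustration):

```java
// Sketch: frequency of a piano key from its key number (1..88).
// Assumes equal temperament with A4 (key 49) tuned to 440 Hz.
class KeyFrequency {
    static final double A4_FREQUENCY_HZ = 440.0;
    static final int A4_KEY_NUMBER = 49;

    // Each semitone step multiplies the frequency by 2^(1/12).
    static double frequencyOfKey(int keyNumber) {
        return A4_FREQUENCY_HZ * Math.pow(2.0, (keyNumber - A4_KEY_NUMBER) / 12.0);
    }
}
```

With this numbering, middle C is key 40 and comes out at about 261.63 Hz. Because each step is a constant ratio rather than a constant difference, the gap in Hz between neighboring keys grows toward the top of the keyboard.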

Additionally, I have adapted a PianoView implemented by Sylvain Saurel <https://medium.com/@ssaurel/creating-a-virtual-piano-for-android-b6d3ac05d961>. I used the code from Saurel's tutorial to render the PianoView, but I need to add several features specific to my project. For instance, I must add functionality that receives the frequencies of the notes played and highlights the corresponding keys on the GUI.

Week #2 (5/15/20)

This week, I successfully implemented audio sampling with the AudioRecord API. I am able to configure the sampling frequency, the channel mode (mono/stereo), and the number of bits per sample, and I can read the samples into a buffer for FFT processing. For the FFT itself, I integrated the FFT library from Columbia University's MEAPsoft with my audio sampling. I am able to read my samples into a buffer as 16-bit integers, convert them to doubles, and perform the FFT on them.
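The conversion step can be sketched in plain Java. This assumes the samples arrive as a short[] (for example, filled by AudioRecord.read(short[], int, int), which returns the number of shorts read) and that they are scaled to the range [-1.0, 1.0); the scaling is my assumption here, and the class name is hypothetical.

```java
// Sketch: convert signed 16-bit PCM samples to doubles for FFT input.
class PcmConversion {
    // Converts the first `count` samples to doubles in [-1.0, 1.0).
    // Dividing by 32768 maps Short.MIN_VALUE to exactly -1.0.
    static double[] toDoubles(short[] pcm, int count) {
        double[] out = new double[count];
        for (int i = 0; i < count; i++) {
            out[i] = pcm[i] / 32768.0;
        }
        return out;
    }
}
```

Passing the count separately matters because a read may return fewer samples than the buffer holds.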

Looking forward, I will be experimenting with different sample rates and bin sizes for the FFT. A starting point for the sampling frequency is to follow the Nyquist sampling theorem and sample at no less than double the maximum frequency of interest (on a piano the highest key, C8, is about 4.2 kHz, so roughly 5 kHz is a safe upper bound). The number of samples needs to be a power of 2. By picking different sampling rates and sample counts, I am able to adjust the bin size of the FFT, giving me more or less frequency resolution.
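The relationship between these settings is simple: the bin width is the sampling rate divided by the number of samples, and the highest representable frequency is half the sampling rate. A small sketch, with hypothetical names and example values chosen only for illustration:

```java
// Sketch: how sample rate and FFT size determine frequency resolution.
class FftResolution {
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }

    // Width of one FFT bin in Hz.
    static double binWidthHz(double sampleRateHz, int fftSize) {
        if (!isPowerOfTwo(fftSize)) {
            throw new IllegalArgumentException("FFT size must be a power of 2");
        }
        return sampleRateHz / fftSize;
    }

    // Highest frequency that can be represented without aliasing.
    static double nyquistHz(double sampleRateHz) {
        return sampleRateHz / 2.0;
    }
}
```

For example, 8000 Hz with 1024 samples gives bins 7.8125 Hz wide; doubling the sample count to 2048 halves the bin width, at the cost of a longer capture window per FFT.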

Week #3 (5/22/20)

The past week was very productive. I am now able to find the frequency of a single played note to within a couple of Hz. This accuracy is achieved by sampling at 11.025 kHz and performing the FFT on 1024 samples at a time. It turns out that the sampling frequency cannot be chosen arbitrarily based on the Nyquist criterion alone. Each Android device supports only certain sampling frequencies, but from my research, 11.025 kHz is supported by most devices. With single-note accuracy, I am currently able to deliver on the promise of detecting at least one note. I have great confidence that I can deliver on multiple-note detection.
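To illustrate the idea of reading a note's frequency off the spectrum, here is a self-contained sketch that finds the strongest bin and converts it to Hz. It uses a direct O(N^2) DFT as a stand-in so that it runs without any library (the app itself uses the MEAPsoft FFT), and all names are hypothetical.

```java
// Sketch: locate the strongest spectral peak and convert its bin to Hz.
class PeakFrequency {
    // Test-tone generator: a pure sine at the given frequency.
    static double[] sine(double frequencyHz, double sampleRateHz, int n) {
        double[] x = new double[n];
        for (int t = 0; t < n; t++) {
            x[t] = Math.sin(2.0 * Math.PI * frequencyHz * t / sampleRateHz);
        }
        return x;
    }

    // Magnitudes of bins 0 .. N/2 - 1, computed with a direct DFT.
    static double[] magnitudeSpectrum(double[] samples) {
        int n = samples.length;
        double[] mags = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    // Index of the largest magnitude, skipping bin 0 (the DC component).
    static int peakBin(double[] mags) {
        int best = 1;
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }

    static double binToFrequencyHz(int bin, double sampleRateHz, int fftSize) {
        return bin * sampleRateHz / fftSize;
    }
}
```

At 11.025 kHz with 1024 samples, the bins are about 10.77 Hz apart, so a raw peak pick like this can only be accurate to within about half a bin; how close a given note lands depends on where it falls relative to the bin centers.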

I will now experiment with different spectrum magnitude thresholds to find the number of maxima needed for multiple-note recognition. I will then need to match the detected maxima to the key frequencies while throwing out the overtones and harmonics that are generated. One idea I had for this is to compute the cross-correlation between the key frequencies and the locations of the maxima. Peaks in the cross-correlation that occur within a reasonable region around the key frequencies will be taken as the keys that are played; anything else will be discarded.
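As a simpler point of comparison for the matching step, the sketch below maps a detected peak to the nearest key and rejects it if it falls outside a tolerance window around that key's frequency. This is a plain nearest-key lookup, not the cross-correlation idea described above, and it does not remove harmonics (an octave harmonic lands exactly on another key). It assumes equal temperament with key 49 at 440 Hz; the names and the tolerance value are hypothetical.

```java
// Sketch: match a peak frequency to the nearest piano key within a tolerance.
class KeyMatcher {
    static double keyFrequency(int key) {
        return 440.0 * Math.pow(2.0, (key - 49) / 12.0);
    }

    // Nearest key number on a semitone (logarithmic) scale.
    static int nearestKey(double frequencyHz) {
        double semitonesFromA4 = 12.0 * Math.log(frequencyHz / 440.0) / Math.log(2.0);
        return (int) Math.round(semitonesFromA4) + 49;
    }

    // Returns a key number in 1..88, or -1 if the peak is not close
    // enough to any key on the keyboard.
    static int matchKey(double frequencyHz, double toleranceHz) {
        if (frequencyHz <= 0.0) {
            return -1;
        }
        int key = nearestKey(frequencyHz);
        if (key < 1 || key > 88) {
            return -1;
        }
        double error = Math.abs(frequencyHz - keyFrequency(key));
        return error <= toleranceHz ? key : -1;
    }
}
```

A fixed tolerance in Hz is the simplest choice, but since low keys are only a few Hz apart, a tolerance that scales with the key's frequency may work better in practice.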

Week #4 (5/29/20)

My app can now successfully display played keys on the PianoView interface. I have implemented two different modes. The first is single note mode, which detects one note at a time and lights up the corresponding key on the PianoView with very high accuracy. To do this, I compute the correlation between the detected key magnitudes and magnitudes stored in a look-up table built from previous recordings I made. I found that simply taking the maximum of the magnitudes was still causing many harmonics to show up; the correlation method fixes that issue and gives very accurate real-time note recognition.
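The update does not spell out the exact correlation used, so the following is only a sketch of the general idea: compare the measured magnitudes against one stored template per key using a normalized correlation (cosine similarity) and pick the best-scoring template. The class name, method names, and template layout are assumptions for illustration.

```java
// Sketch: pick the stored template that best matches a measured spectrum.
class TemplateMatcher {
    // Normalized correlation (cosine similarity) of two equal-length vectors.
    static double normalizedCorrelation(double[] a, double[] b) {
        double dot = 0.0;
        double energyA = 0.0;
        double energyB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            energyA += a[i] * a[i];
            energyB += b[i] * b[i];
        }
        if (energyA == 0.0 || energyB == 0.0) {
            return 0.0; // an all-zero vector matches nothing
        }
        return dot / Math.sqrt(energyA * energyB);
    }

    // Index of the best-matching template, or -1 if there are none.
    static int bestTemplate(double[] spectrum, double[][] templates) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < templates.length; i++) {
            double score = normalizedCorrelation(spectrum, templates[i]);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }
}
```

Matching the whole pattern of fundamental plus harmonics is what makes this more robust than taking the single largest magnitude: a note whose second harmonic happens to be louder than its fundamental still correlates best with its own template.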

The other mode is multiple note mode. I compute the mean and standard deviation of the magnitudes and treat any magnitude more than half a standard deviation above the mean as a played note. Multiple notes can be detected and displayed this way, but accuracy with a real piano is lower than in single note mode because the piano produces so many harmonics. If I test multiple note mode with tone generators that produce single coherent frequencies, everything works as expected.
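The thresholding rule can be sketched as follows. It assumes the population standard deviation and a strict "greater than" comparison, neither of which is stated above, and the names are hypothetical.

```java
// Sketch: select magnitudes that exceed mean + k * standard deviation.
class MultiNoteThreshold {
    // Returns the indices whose magnitude is strictly above the threshold.
    static int[] indicesAboveThreshold(double[] mags, double numStdDevs) {
        double sum = 0.0;
        for (double m : mags) {
            sum += m;
        }
        double mean = sum / mags.length;

        double squaredDiffs = 0.0;
        for (double m : mags) {
            squaredDiffs += (m - mean) * (m - mean);
        }
        double stdDev = Math.sqrt(squaredDiffs / mags.length); // population std dev

        double threshold = mean + numStdDevs * stdDev;

        // First pass counts the hits, second pass collects their indices.
        int count = 0;
        for (double m : mags) {
            if (m > threshold) {
                count++;
            }
        }
        int[] result = new int[count];
        int next = 0;
        for (int i = 0; i < mags.length; i++) {
            if (mags[i] > threshold) {
                result[next++] = i;
            }
        }
        return result;
    }
}
```

Calling indicesAboveThreshold(mags, 0.5) corresponds to the half-standard-deviation rule. A threshold like this has no notion of harmonics, so any overtone that rises above it is reported as a note, which is consistent with the lower accuracy on a real piano.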

The next steps are the finishing touches to the app. I will first try to implement screen recording within the app using the MediaProjection API. Once that is complete, I will add some small touches and finalize the UI (outside of the PianoView).

Week #5 (6/5/20)

My app is now complete! It has all the functionality I wanted and discussed in the initial presentation, and all milestones have been met. Users can now start/stop audio sampling and perform screen recording at the same time. For screen recording, I opted not to use MediaProjection directly and instead used the HBRecorder library. The screen captures are saved to the Movies folder of the device. The UI is also finished, with a clean and functional interface for mode selection that does not interfere with the PianoView itself. I also extensively checked the app's behavior in different scenarios, such as rotating the device and selecting different options at different times, and everything works as expected. As discussed before, single note mode is extremely accurate when using a piano. Multiple note mode loses some accuracy with a piano due to the many harmonics; however, its correct functionality can be verified with tone generators.

Overall, I had a lot of fun with this project. I solidified my Android programming skills, and I was able to apply what I learned in my signal processing classes. I had never done a real-time FFT before, so it was very rewarding to see it work. The last step is to prepare for presentations!