We used the iPhone's built-in accelerometer as our sensor, and collected data using the MATLAB app.
The image above shows the local coordinate system for the accelerometer.
(Image via O'Reilly)
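For reference, this is roughly how accelerometer readings can be streamed from a phone into MATLAB with the mobile sensor support package (a simplified sketch, not necessarily our exact logging workflow):

```matlab
% Simplified sketch of logging phone accelerometer data into MATLAB
% (requires MATLAB Mobile and the mobile sensor support package).
m = mobiledev;                     % connect to the phone running MATLAB Mobile
m.AccelerationSensorEnabled = 1;   % turn on the accelerometer stream
m.SampleRate = 10;                 % 10 Hz, the rate we recorded at

m.Logging = 1;                     % start logging while the player dances
pause(220);                        % ...roughly the length of the song
m.Logging = 0;                     % stop logging

[accel, t] = accellog(m);          % accel has one column per axis (x, y, z)
```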
We had users play Just Dance with a mobile phone in their left hand and the Wii remote in their right. Assuming the dance moves are symmetrical, this allowed us to collect data for a whole song without straining the user's right arm with excess weight.
As much as we wanted to play Just Dance many, many, many times by ourselves, we knew we would be more efficient if we reached out to other students to record more diverse dancing data (say that three times fast!). We invited our classmates to record data for us while burning a few calories.
We used the MATLAB desktop application to conduct frequency analysis on the accelerometer data that we collected and create a prediction algorithm.
(Image via UW)
(Image via MIT)
We utilized the Fourier Transform, a mathematical operation that converts a signal from the time domain to the frequency domain. In doing so, a complex signal can be broken down and represented as a sum of superimposed sinusoidal functions. This allows us to identify common frequencies in our data, like the beat of a song; rather than comparing points over time, we can compare the relative magnitudes of different frequencies in the data.
The image on the left is a simple visualization of the Fourier Transform. Adding the three component signals together creates the yellow signal at the bottom. By decomposing a signal into its "ingredients," it can be interpreted in a useful way that isn't restricted by the time domain.
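As a quick illustration of this idea, the short sketch below (with arbitrary, made-up frequencies) sums three sinusoids and uses MATLAB's fft to recover their frequencies as peaks in the spectrum:

```matlab
% Illustrative only: decompose a made-up three-sinusoid signal with the FFT.
fs = 100;                      % sampling rate in Hz (arbitrary choice)
t  = 0:1/fs:10-1/fs;           % 10 seconds of samples
x  = sin(2*pi*1*t) + 0.5*sin(2*pi*3*t) + 0.25*sin(2*pi*7*t);

X  = fft(x);                   % transform to the frequency domain
n  = length(x);
f  = (0:n-1)*(fs/n);           % frequency corresponding to each FFT bin

% Plot the one-sided magnitude spectrum; peaks appear at 1, 3, and 7 Hz.
plot(f(1:n/2), abs(X(1:n/2))/n);
xlabel('Frequency (Hz)');
ylabel('Magnitude');
```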
$$W = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^{2} & \cdots & \omega^{N-1} \\ 1 & \omega^{2} & \omega^{4} & \cdots & \omega^{2(N-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega^{N-1} & \omega^{2(N-1)} & \cdots & \omega^{(N-1)(N-1)} \end{bmatrix}, \qquad \text{where } \omega = e^{-2\pi i / N}$$

Equation 1: The N × N Discrete Fourier Transform matrix and the value of ω that fills its entries.
(Image via Wikipedia)
Because we were dealing with a discrete signal (accelerometer data sampled over a finite time range), we transformed our data using the Discrete Fourier Transform (DFT). The equation above shows the DFT matrix, which, when applied to a signal through matrix multiplication, produces that signal's DFT.
We recorded our data at 10 Hz, which over the span of our 3-minute-40-second song results in three DFT matrices, one for each axis, each containing just under 5 million entries (roughly 2,200 × 2,200). Rather than computing the entire matrix product for each of the playthrough's signals, which would be computationally expensive, MATLAB takes advantage of a specialized algorithm called the Fast Fourier Transform (FFT). The FFT computes the DFT of a signal by splitting it into many smaller pieces, computing the DFT of each of those subsections, and then recursively recombining the results into the complete transform.
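To make the connection concrete, the tiny example below builds the DFT matrix from Equation 1 and checks that multiplying by it gives the same result as fft (illustrative values only, far smaller than our 2,200-sample recordings):

```matlab
% Illustrative comparison of the explicit DFT matrix against MATLAB's fft.
n = 8;                               % keep it tiny so the matrix is easy to inspect
x = randn(n, 1);                     % a made-up column-vector signal

omega = exp(-2*pi*1i/n);             % the root of unity from Equation 1
W = omega .^ ((0:n-1)' * (0:n-1));   % W(j,k) = omega^((j-1)(k-1))

X_matrix = W * x;                    % DFT by direct matrix multiplication (O(n^2))
X_fft    = fft(x);                   % the same transform via the FFT (O(n log n))

max(abs(X_matrix - X_fft))           % difference is down at machine precision
```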
In signal processing, the magnitude-squared coherence between two signals is a metric that indicates how similar those two signals are in the frequency domain. We used coherence to predict player scores.
The figure on the right is an example of coherence between the accelerometer data of two players. Higher peaks correspond to higher coherence between the two signals. The high coherence at a frequency of 0Hz is related to gravity, as both players experienced the same downwards acceleration due to gravity on their phones' y-axes.
Figure 1: Coherence between two players (Elias and Meg) dancing to Forget You by CeeLo Green
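The 0 Hz effect is easy to reproduce: in the sketch below (made-up signals, not our recordings), two otherwise unrelated signals that share the same constant gravity offset show near-perfect coherence at 0 Hz and much lower coherence everywhere else:

```matlab
% Illustrative demo of the 0 Hz coherence peak caused by a shared gravity offset.
fs = 10;                                  % 10 Hz, matching our sampling rate
t  = (0:2199)/fs;                         % about 3:40 of samples

g = -9.8;                                 % constant gravity offset on the y-axis
x = g + randn(size(t));                   % "player 1": gravity plus unrelated motion
y = g + randn(size(t));                   % "player 2": gravity plus unrelated motion

[cxy, f] = mscohere(x, y, hamming(256), 128, 512, fs);

plot(f, cxy);
xlabel('Frequency (Hz)');
ylabel('Magnitude-squared coherence');
% The bin at 0 Hz is close to 1 because both signals share the same DC offset;
% the remaining bins stay low because the random "dance moves" are uncorrelated.
```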
$$C_{xy}(f) = \frac{\left|G_{xy}(f)\right|^{2}}{G_{xx}(f)\,G_{yy}(f)}$$

Equation 2: The magnitude-squared coherence between two signals x(t) and y(t). This is the equation we used to determine the coherence between signals.
(Image via Wikipedia)
In the equation above, C_xy(f) is the magnitude-squared coherence at frequency f, G_xy(f) is the cross-spectral density of x(t) and y(t), and G_xx(f) and G_yy(f) are the auto-spectral densities of x(t) and y(t), respectively.
$$G_{xy}(f) = \int_{-\infty}^{\infty} R_{xy}(\tau)\, e^{-2\pi i f \tau}\, d\tau$$

Equation 3: The cross-spectral density between two signals x(t) and y(t), used in calculating the coherence between them.
(Image via Wikipedia)
In the equation above, G_xy(f) is the cross-spectral density and R_xy(τ) is the cross-correlation of x(t) and y(t); the auto-spectral densities G_xx(f) and G_yy(f) are defined the same way, with a signal paired with itself.
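To see how these pieces fit together in MATLAB, the snippet below (made-up signals and arbitrary Welch parameters, purely for illustration) estimates the spectral densities with cpsd and pwelch and combines them as in Equation 2; the result matches what mscohere returns:

```matlab
% Illustrative check: build magnitude-squared coherence from spectral densities.
fs = 10;                                   % 10 Hz, matching our sampling rate
t  = (0:999)/fs;
x  = sin(2*pi*0.5*t) + 0.3*randn(size(t)); % made-up "player 1" signal
y  = sin(2*pi*0.5*t) + 0.3*randn(size(t)); % made-up "player 2" signal

win = hamming(128); nov = 64; nfft = 256;  % arbitrary Welch parameters

Gxy = cpsd(x, y, win, nov, nfft, fs);      % cross-spectral density
Gxx = pwelch(x, win, nov, nfft, fs);       % auto-spectral densities
Gyy = pwelch(y, win, nov, nfft, fs);

Cxy_manual  = abs(Gxy).^2 ./ (Gxx .* Gyy); % Equation 2 applied bin by bin
Cxy_builtin = mscohere(x, y, win, nov, nfft, fs);

max(abs(Cxy_manual - Cxy_builtin))         % should be near zero (round-off only)
```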
Rather than write our own implementations of the Discrete Fourier Transform, an FFT algorithm, and the spectral-density and coherence calculations, we made use of the implementations packaged with MATLAB.
Code 1: Our custom function cohere, which accepts two raw signals, aligns them in the time domain, and then computes the average magnitude-squared coherence across all valid frequencies.
To organize our code, we defined a function called cohere that uses MATLAB's mscohere function to compute the average magnitude-squared coherence between the two input signals.
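In outline, cohere does something like the following (a simplified sketch; the alignment-by-truncation step and the Welch window settings shown here are illustrative):

```matlab
% Simplified sketch of the cohere helper (illustrative parameters).
function avgCxy = cohere(x, y, fs)
    % Align the two recordings in time by truncating to the shorter one.
    n = min(length(x), length(y));
    x = x(1:n);
    y = y(1:n);

    % Magnitude-squared coherence at each frequency via mscohere.
    win = hamming(64); nov = 32; nfft = 128;   % illustrative Welch settings
    cxy = mscohere(x, y, win, nov, nfft, fs);

    % Collapse the per-frequency coherence into a single similarity score.
    avgCxy = mean(cxy);
end
```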
By calculating the coherence between every play-through in our testing dataset and a well-defined best-score benchmark, we populated a pair of vectors relating coherence values to the scores those play-throughs earned.
To quantify that relationship and use it to predict the scores of arbitrary play-throughs, we used linear regression to produce a numerical mapping.
Making use of MATLAB's built-in polynomial regression and evaluation functions, polyfit and polyval, we generated a linear mapping between coherence and score. Dividing the predicted scores by 2,000 points, the number of points in a ⭐, gave us a rough algorithm that could predict our rankings reasonably well.
Code 2: Our linear regression, generated with MATLAB's polyfit and polyval functions.
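In outline, the mapping looks something like this (a simplified sketch; the numbers below are placeholders, not our measured data):

```matlab
% Simplified sketch of the coherence-to-score mapping (all numbers below are
% made-up placeholders, not our measured data).
coherences = [0.12 0.18 0.25 0.31 0.40];   % avg coherence vs. the benchmark
scores     = [4200 5100 6800 8200 9900];   % in-game score for each play-through

p = polyfit(coherences, scores, 1);        % linear fit: score ~ m*coherence + b

newCoherence   = 0.28;                     % a new play-through to predict
predictedScore = polyval(p, newCoherence);

% 2000 points per star, so divide to get a rough star ranking.
predictedStars = predictedScore / 2000;
```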