1. Introduction
The purpose of music has always been somewhat ambiguous, though at its heart one can argue that music serves as a means to move us and, by extension, change our physiology. Some styles of music have a soothing effect while others get our blood pumping, whether or not that is their intended effect. Such is the case for music typically chosen for more intense activities (such as a gym workout) and, by contrast, for calmer times (like trying to focus or fall asleep). With the advent of online music streaming services (and their now widespread use), not only has it become easier for people to find and listen to such tracks, but we now also have relatively large data sets of music tracks to which content providers and curators have specifically given tags like “workout”, “intense”, “sleep” and “focus”.
At face value, it is clear that tracks of this sort induce high or low intensity reactions in their listeners, making them suited to their associated tags (at least as far as the users tagging them are concerned). But what is it that makes these pieces so fitting for these situations? Focusing on the “workout” and “sleep” tags, this study analyses songs tagged this way in order to pick out any trends or traits linking them to their tag. A comparison between high and low intensity music is also run, looking for distinct differences between the two categories. Prior to the study, we expected “workout” music to have higher tempi and energy levels, inducing higher levels of arousal. Similarly, we expected lower tempi and energy (quieter pieces) for “sleep” songs, based on the idea that they would be more peaceful and less arousing.
2. Data Collection
A dataset of 10 tracks was compiled, consisting of 5 “workout” tracks and 5 “sleep” tracks. Initially, the songs were sourced from LastFM and Spotify (two popular online music services), with preference given to tracks which appeared in both services under the same tag. At a later stage in the project we learnt that LastFM used Spotify as a source for its own library of tracks, in effect meaning that our dataset was taken from a single source. To remedy this, we decided to source all tracks from Spotify’s “workout” and “sleep” genre sections. For increased reliability, we only considered playlists with a high number of followers, as well as choosing tracks based on popularity and the repeated appearance of tracks in similar playlists. A variety of styles was purposely chosen to avoid style-specific results. Dance, rap, rock, pop and Drum ‘n Bass styles were used in the case of our “workout” set, while ambient, soundtrack and melodic piano styles were used for “sleep”.
The 10 tracks in our data set are as follows:
3. Analysis & Results
As part of looking into the traits of songs for each tag, we ran a series of feature analyses on each file available. Based on our expectations, we looked into the tempo and loudness of each piece, opting to look into chroma features at a later stage.
3.1. Tempo Analysis
Listening to both “workout” and “sleep” songs, it is generally the case that “workout” tracks are more rhythmic than their low intensity counterparts. For this reason we looked into the tempo of each piece, using the Davies Beat Tracker Toolbox [2] to retrieve all beats per track. Using this data, we then measured the intervals between consecutive beats in order to calculate Instantaneous Beats Per Minute (BPM). The mean BPM and its standard deviation were calculated per track.
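As a minimal sketch of the interval-to-BPM step described above (written in Python rather than with the toolbox actually used), each beat interval of d seconds corresponds to 60 / d beats per minute. The function name and the beat times below are hypothetical, chosen for illustration only, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def instantaneous_bpm(beat_times):
    """Turn ascending beat timestamps (in seconds) into one BPM value per beat interval."""
    intervals = [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]
    # A beat interval of d seconds corresponds to 60 / d beats per minute.
    return [60.0 / interval for interval in intervals if interval > 0]


# Hypothetical beat times (not taken from any track in the study): one beat every 0.5 s.
beat_times = [0.0, 0.5, 1.0, 1.5, 2.0]

bpm = instantaneous_bpm(beat_times)
track_mean = mean(bpm)    # mean instantaneous BPM for the track
track_std = pstdev(bpm)   # population standard deviation (an assumption, see above)
```

A perfectly steady beat gives a standard deviation of zero, so the more a track's beat spacing drifts, the larger this value becomes.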
Instantaneous BPM Results
Figure 1 Instantaneous BPM for "Sleep" (Left) & "Workout" (Right) Tracks
As can be seen in the table and plots above, “sleep” tracks exhibit a slightly lower BPM than those with the “workout” tag. However, the mean for each suggests no significant difference in tempo between the two. This could be due to the percussive qualities of the melodic piano tracks being picked up as beats by the algorithm used.
Our tempo results get more interesting when looking at the standard deviation of our values. As seen above, there is a far greater deviation range (7-26) for “sleep” tracks when compared to that for “workout” (1-10). This is likely due to the more melodic qualities of “sleep” tracks, as opposed to the repetitive nature of “workout” tracks.
3.2. Loudness Analysis
Further to our primary expectations, studies have suggested that loud, high energy music results in high arousal in listeners, and is hence used in aerobics classes to this effect [1]. For our purposes, we considered perceived loudness rather than raw energy or signal level, since the latter can in some cases simply reflect a track's amplitude/gain. To retrieve these values, we used the Genesis Acoustics Loudness Toolbox [3] to calculate Perceived Instantaneous Loudness (in phons) per track. The Zwicker model for loudness was used.
Perceived Instantaneous Loudness Results
Figure 2 Perceived Instantaneous Loudness for "Sleep" (Left) and "Workout" (Right) Tracks
The above plots suggest that loudness levels are rather stable for both “workout” and “sleep” tracks, with instantaneous perceived loudness not straying far from the mean. This is backed by similar standard deviation results for each tag, although this does not aid in distinguishing between the two tags. It was noted, however, that loudness levels for “sleep” tracks (~79 phons) were demonstrably lower than those for “workout” tracks (~94 phons).
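To put the gap between the two tags into perspective, the short sketch below converts loudness levels in phons to loudness in sones using the standard relation N = 2^((L - 40) / 10), which holds from roughly 40 phons upward. This conversion is not part of the original analysis, and the function name is hypothetical. Under that relation, the 15 phon difference between ~79 and ~94 phons corresponds to “workout” tracks being perceived as roughly 2.8 times as loud.

```python
def phon_to_sone(phon):
    """Loudness in sones for a loudness level in phons (valid from roughly 40 phons upward)."""
    return 2.0 ** ((phon - 40.0) / 10.0)


sleep_phon = 79.0    # approximate "sleep" level reported above
workout_phon = 94.0  # approximate "workout" level reported above

# Ratio of perceived loudness: 2 ** (15 / 10), i.e. roughly 2.8
loudness_ratio = phon_to_sone(workout_phon) / phon_to_sone(sleep_phon)
```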
3.3. Chroma Analysis
Following our initial analysis of tempo and loudness, we decided to look into the melodicity of each tag. In order to do so, we looked at the chroma features for each track, that is, the number of times each of the standard 12 notes/chroma was hit throughout a piece. Using the LabROSA Chroma Analysis Toolbox [4], we generated chromagrams per track and calculated the mean number of occurrences per chroma as well as the standard deviation of occurrences.
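One way to turn a chromagram into per-chroma occurrence counts is sketched below in Python. It assumes a chroma is "hit" in a frame when it is the strongest of the 12 bins; the text does not spell out the exact counting rule, so this rule, the function names and the tiny example chromagram are all hypothetical.

```python
from statistics import mean, pstdev


def chroma_occurrences(chromagram):
    """Count, for each of the 12 chroma bins, the frames in which it is the strongest bin.

    `chromagram` is a list of frames, each frame a list of 12 energy values.
    """
    counts = [0] * 12
    for frame in chromagram:
        strongest = max(range(12), key=lambda i: frame[i])
        counts[strongest] += 1
    return counts


def one_hot(index):
    """Hypothetical helper: a frame with all its energy in a single chroma bin."""
    return [1.0 if i == index else 0.0 for i in range(12)]


# Hypothetical three-frame chromagram: bin 0 twice, then bin 7 once.
frames = [one_hot(0), one_hot(0), one_hot(7)]

counts = chroma_occurrences(frames)
mean_occurrences = mean(counts)   # average hits per chroma
std_occurrences = pstdev(counts)  # spread of hits across the 12 chroma
```

How the spread of these counts should be read depends on the counting rule chosen, so the sketch is only meant to make the quantities being compared concrete.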
Chroma Occurrence Results
Figure 3 Chroma Occurrences for "Sleep" (Left) and "Workout" (Right) Tracks
The bar charts above (the green line showing the mean number of occurrences) reveal little beyond the spikes seen for a handful of chroma in the “workout” case, likely due to repetitive segments in these tracks. Our analysis does, however, show a greater variance in chroma occurrences for “sleep” tracks, suggesting more varied melodic lines due to a wider spread of notes and further supporting the distinction between tags based on melodicity.
4. Conclusion
Through our analysis of Tempo, Loudness and Chroma features, we were able to extract the following tag characteristics for our dataset:
“Workout” Tag:
- Steady Tempo
- Higher & stable Loudness Levels
- Repetitive melody line
“Sleep” Tag:
- Varying Tempo
- Lower & stable Loudness
- More melodic & wider note range
While we acknowledge that the dataset used was limited to 10 tracks from a single data source, we did manage to identify both similarities between the tags and features distinct to each. Given these qualities, we would argue that the steady, loud, rhythmic qualities of “workout” music may be attention grabbing and possibly even reminiscent of a person’s heartbeat, leading to higher arousal levels and an incentive to match the beat. On the other hand, the softer, melodic and varying tempo of “sleep” tracks may be perceived as a more natural sound (particularly in contrast to electronic “workout” tracks), where the variety of notes played at softer loudness levels lets the listener’s attention fade as the notes do. This may be the reason a high number of users feel that tracks of this sort lead to sleep-related states, allowing for thoughts and attention to wander.
5. Future Work
An interesting extension to this work would be to implement a Classification algorithm for tags of this sort. Firstly, a larger data set would need to be sourced from a greater number of sources. This data could then be used as a training set in a Supervised Learning task.
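As a rough sketch of what such a supervised learning task could look like, the Python example below trains a nearest-centroid classifier on two of the features discussed earlier (BPM standard deviation and loudness in phons). The choice of classifier, the function names and every feature value are assumptions made purely for illustration. The numbers are invented placeholders, not measurements from our dataset.

```python
from statistics import mean


def centroid(rows):
    """Average feature vector of a list of equally long feature vectors."""
    return [mean(column) for column in zip(*rows)]


def train(examples):
    """Build a model mapping each tag to the centroid of its training vectors."""
    return {tag: centroid(rows) for tag, rows in examples.items()}


def classify(model, features):
    """Return the tag whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda tag: squared_distance(model[tag]))


# Invented placeholder vectors: [BPM standard deviation, loudness in phons].
training = {
    "workout": [[2.0, 95.0], [8.0, 93.0]],
    "sleep": [[10.0, 80.0], [24.0, 78.0]],
}

model = train(training)
predicted = classify(model, [5.0, 92.0])  # nearest centroid here is "workout"
```

A real classifier would need far more tracks, feature scaling so that no single feature dominates the distance, and a held-out test set to check how well the tags generalise.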
Beyond a music information analysis approach, it would also be interesting to run a physiological study on high and low intensity tracks. Arousal information could be analysed using Galvanic Skin Response (GSR), heartbeat monitoring and Electroencephalography (EEG), measuring both the physical and mental state of listeners.
6. References
[1] Baldwin, C. L. (2012). Auditory Cognition and Human Performance: Research and Applications. CRC Press.
[2] Davies Beat Tracker Toolbox: https://code.soundsoftware.ac.uk/projects/davies-beat-tracker/repository
[3] Genesis Acoustics Loudness Toolbox: http://genesis-acoustics.com/en/loudness_online-32.html
[4] LabROSA Chroma Analysis Toolbox: http://labrosa.ee.columbia.edu/matlab/chroma-ansyn/#3
[5] Karageorghis, C. I., & Priest, D.-L. (2012). Music in the exercise domain: a review and synthesis (Part I). International Review of Sport and Exercise Psychology, 5(1), 44–66.
[6] Karageorghis, C. I., Jones, L., & Low, D. C. (2006). Relationship between exercise heart rate and music tempo preference. Research Quarterly for Exercise and Sport, 77(2).