DH100 Group 8
  • Home
  • Project
  • Data Critique
  • Annotated Bibliography
  • About

Project

Code for Visuals:

github.com/21ShisodeParth/nbn

github.com/albertvasquez/DigHum100

Song Duration

How has song length changed over time?

Interesting Takeaways

  1. As the year increases, the variability of song duration appears to increase as well.

  2. The dataset contains more songs the more recent the year is.

Data Cleaned

  • Songs in the dataset with a year or duration of value '0'
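
A minimal sketch of this cleaning step, assuming each song is a dictionary with hypothetical "year" and "duration" fields (the real column names in the dataset may differ) and that a value of 0 marks missing data:

```python
# Hypothetical records; field names and values are made up for illustration.
songs = [
    {"title": "A", "year": 1999, "duration": 215.3},
    {"title": "B", "year": 0, "duration": 187.0},     # missing year
    {"title": "C", "year": 2005, "duration": 0.0},    # missing duration
    {"title": "D", "year": 1964, "duration": 142.8},
]


def drop_missing_year_or_duration(records):
    """Keep only songs whose year and duration are both nonzero."""
    return [r for r in records if r["year"] != 0 and r["duration"] != 0]


cleaned = drop_missing_year_or_duration(songs)
print([r["title"] for r in cleaned])
```

Only songs "A" and "D" survive here, because the other two have a zero in one of the two fields.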


It makes sense that the data is concentrated in recent years. Song metadata has only been digitized in the past few decades, and digitized data is much easier to find and include in a dataset. Even though there is more data on songs from the last few decades, the scatterplot still supports the conclusion that variability in song duration increases with year. If it did not, the earlier years would show the same wide spread of durations as the later years, only with fewer data points, and that is not what the plot shows.
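
One way to check this beyond reading the scatterplot is to compute the spread of durations per decade, since a standard deviation does not grow just because a decade has more songs. The sketch below is an illustration under assumptions, not the code behind the plot: the field names "year" and "duration" are hypothetical and the sample values are made up.

```python
from collections import defaultdict
from statistics import stdev


def duration_spread_by_decade(records):
    """Return {decade: sample standard deviation of duration in seconds}."""
    by_decade = defaultdict(list)
    for r in records:
        decade = (r["year"] // 10) * 10
        by_decade[decade].append(r["duration"])
    # stdev needs at least two values, so skip decades with a single song.
    return {d: stdev(v) for d, v in sorted(by_decade.items()) if len(v) >= 2}


# Made-up sample: two songs from the 1960s, three from the 2000s.
sample = [
    {"year": 1962, "duration": 150.0},
    {"year": 1968, "duration": 160.0},
    {"year": 2001, "duration": 120.0},
    {"year": 2004, "duration": 240.0},
    {"year": 2009, "duration": 360.0},
]

spread = duration_spread_by_decade(sample)
print(spread)
```

If the spread rises from decade to decade on the real data, that supports the takeaway independently of how many songs each decade contributes.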

Scatterplot of song length in seconds from 1926 to 2010. There are far more songs from the 1990s and 2000s. There is more variety in song length in the 90s and 00s than in earlier decades.

Artist Location

Where are artists whose songs are featured in the dataset primarily from?

Interesting Takeaways

  1. Western Europe and the Eastern United States seem to be the largest hubs for music; at least, these artist locations were the most included within the dataset.

  2. Very few artists from the non-English-speaking world were included.

Data Cleaned

  • Songs with artists' locations of latitude or longitude of exactly '0'
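
A minimal sketch of this filter, followed by a rough density count per 10-degree grid cell to mirror what the map shows. The field names "lat" and "lon" and the coordinates are assumptions for illustration, and the grid count is not necessarily how the map itself was drawn.

```python
import math
from collections import Counter

# Hypothetical artist locations; a latitude or longitude of exactly 0 is
# treated as missing, as described in the cleaning step above.
artists = [
    {"name": "Artist A", "lat": 51.5, "lon": -0.13},
    {"name": "Artist B", "lat": 53.4, "lon": -2.2},
    {"name": "Artist C", "lat": 40.7, "lon": -74.0},
    {"name": "Artist D", "lat": 0.0, "lon": 0.0},      # missing location
]


def drop_missing_location(records):
    """Drop records whose latitude or longitude is exactly 0."""
    return [r for r in records if r["lat"] != 0 and r["lon"] != 0]


def count_by_grid_cell(records, cell_size=10):
    """Count artists per grid cell, keyed by the cell's south-west corner."""
    return Counter(
        (
            math.floor(r["lat"] / cell_size) * cell_size,
            math.floor(r["lon"] / cell_size) * cell_size,
        )
        for r in records
    )


located = drop_missing_location(artists)
cell_counts = count_by_grid_cell(located)
print(cell_counts.most_common())
```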


It seems that the dataset is not representative of music around the globe. Especially for more recent years, more data should be available on the top songs within each country's music industry. However, if this dataset is based on what Western countries deem popular (e.g., via the Grammys or the Billboard charts), then it makes sense that it is so heavily centered on Europe and the USA. If the rest of the world truly looks to American measures such as the Grammys and the Billboard charts to assess what music is popular in general, then this map shows the USA and Western Europe as the hub for music.

A map of the world with dots representing artist location. The Eastern United States and the United Kingdom have by far the highest density of dots, followed by the Western United States and Western Europe. All other parts of the world have very low artist density.

St. Dev. and Mean of BPM by Genre

What is the relationship between a song's genre and its BPM?

Interesting Takeaways

  1. The average BPM by genre barely differs.

  2. The standard deviation of BPM per genre does not differ dramatically, but the differences are noticeable.

Data Cleaned

  • Only genres with over 130 songs in the dataset were included (the top 10 genres)
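
The filtering and the two statistics can be sketched as follows. This is an illustration under assumptions rather than the code behind the charts: the field names "genre" and "tempo" are hypothetical, the sample values are made up, and the threshold is lowered in the usage example so the tiny sample produces output.

```python
from collections import defaultdict
from statistics import mean, stdev


def tempo_stats_by_genre(records, min_songs=130):
    """Return {genre: (mean BPM, sample st. dev. of BPM)} for genres
    with more than min_songs songs."""
    by_genre = defaultdict(list)
    for r in records:
        by_genre[r["genre"]].append(r["tempo"])
    return {
        genre: (mean(tempos), stdev(tempos))
        for genre, tempos in by_genre.items()
        if len(tempos) > min_songs
    }


# Made-up sample: three hip hop songs and two latin jazz songs.
sample = [
    {"genre": "hip hop", "tempo": 90.0},
    {"genre": "hip hop", "tempo": 100.0},
    {"genre": "hip hop", "tempo": 110.0},
    {"genre": "latin jazz", "tempo": 120.0},
    {"genre": "latin jazz", "tempo": 130.0},
]

# With min_songs=2, only genres with more than two songs are kept.
stats = tempo_stats_by_genre(sample, min_songs=2)
print(stats)
```

The same function works for key by swapping the field it reads, which is why the key charts can reuse the same set of genres.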


Most of us have had a friend tune into a popular radio station and say something along the lines of "all of this sounds the exact same." Dance pop is typically played on popular music stations and strikes many people as formulaic. Meanwhile, genres such as jazz and hip hop have historically differed widely in their production. Although the average BPM by genre does not differ very much, the differences in BPM variability per genre do offer some insight.

Colored bar chart showing mean beats per minute for ten of the most popular subgenres. Seven of the genres are within 5 bpm of 120; only post-grunge, latin jazz, and roots reggae are at or above 130 bpm.
Colored bar chart showing the standard deviation in beats per minute for ten of the most popular subgenres. Four groups are clustered around a standard deviation of 40 bpm: hip hop, chanson, latin jazz, and gangster rap. Five groups are clustered around a standard deviation of 30 bpm: contemporary christian, post-grunge, dance pop, blues-rock, and roots reggae.

St. Dev. and Mean of Key by Genre

What is the relationship between a song's genre and its key?

Interesting Takeaways

  1. The average key of the songs by genre appears not to differ significantly.

  2. The standard deviation of key by genre also appears not to differ significantly.

Data Cleaned

  • Only genres with over 130 songs in the dataset were included (the top 10 genres)


To clarify, the key of each song in the dataset was estimated as an integer from 0 to 11 rather than as a conventional key named by a note. As with mean BPM, mean key does not differ very much by genre. However, the standard deviation of key varies less across genres than the standard deviation of BPM does. This means that the spread of keys within a genre is similar for every genre. For the average music listener, this means that a genre typically cannot be recognized by the key a song is in, and vice versa.
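
For readers who want note names, the 0 to 11 scale can be translated with a small lookup. The sketch below assumes the dataset follows the standard pitch-class convention (0 = C, 1 = C#, and so on up to 11 = B); that mapping is an assumption and should be checked against the dataset's documentation.

```python
# Standard pitch-class names, indexed 0 through 11.
# Assumption: the dataset's key values follow this convention.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key_index):
    """Translate a key value from 0 to 11 into a note name."""
    if not 0 <= key_index <= 11:
        raise ValueError(f"key must be between 0 and 11, got {key_index}")
    return PITCH_CLASSES[key_index]


print(key_name(0), key_name(5), key_name(9))
```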

Colored bar chart with same genres as the bpm charts. Represents the average song key on a scale from zero to eleven, with all genres around 5.
Colored bar chart with same genres as the bpm charts. Represents the standard deviation of song key on a scale from zero to eleven, with all genres around 3.0 or 3.5.

Genre Wordmap

What are some of the most popular genres included within this dataset?

Interesting Takeaways

  1. The dataset seems to skew modern with hip hop, dance pop, and alternative rock being some of the most popular genres, with alt-rock in particular only entering the top ten most popular genres around 1990 according to our timeline.

  2. Out of all the genres that dominated the timeline, only older styles such as jazz and opera are not represented. Country, rock, soul (as neosoul), and R&B (Motown) are all popular categories in the dataset.

Data Cleaned

  • Top songs were ordered by 'hotness'. The corresponding genres were selected from the top 100 songs.

  • Artist name and genre strings were concatenated for use in the wordcloud library.
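
The selection step can be sketched as below: rank songs by hotness, keep the top 100, and count how often each genre appears. The field names "genre" and "hotness" and the sample values are assumptions for illustration, and the sample uses a smaller cutoff so the effect is visible.

```python
from collections import Counter

# Hypothetical records; names and hotness values are made up.
songs = [
    {"artist": "Artist A", "genre": "hip hop", "hotness": 0.95},
    {"artist": "Artist B", "genre": "dance pop", "hotness": 0.90},
    {"artist": "Artist C", "genre": "hip hop", "hotness": 0.85},
    {"artist": "Artist D", "genre": "latin jazz", "hotness": 0.10},
]


def top_genre_counts(records, top_n=100):
    """Count genres among the top_n songs ranked by hotness (highest first)."""
    ranked = sorted(records, key=lambda r: r["hotness"], reverse=True)
    return Counter(r["genre"] for r in ranked[:top_n])


genre_counts = top_genre_counts(songs, top_n=3)
print(genre_counts.most_common())
```

A frequency mapping like this is one possible input for the wordcloud library, for example through `WordCloud.generate_from_frequencies`, which sizes each word by its count and keeps multi-word genres such as "hip hop" together.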


Music genre popularity provides insight into the zeitgeist of any given time. The lyrics and mood of popular songs often reflect public interest. For example, the rise of gangster rap and hip hop (dominated by Black artists) since the 90s has led to the increased prominence of Black culture in the United States. Although music genres as varied as jazz, soul, and rock and roll all originated with Black musicians, it is only recently that the music community in the United States has truly reflected that diversity.

Word cloud containing music genres and subgenres based on popularity in the dataset. Hip hop, dance pop, and rock (alt, piano, heartland) are the largest words. Gangster rap, country, Motown, neosoul, and techno are the next most popular.

Methodology

The data collection for this project was precise and extensive. We started from the CORGIS datasets and narrowed our search to the datasets available for Python. We chose the "Music" library, which gave us all the information we needed to complete our research. It provided us with a compilation of over one million songs with information regarding their audio features and metadata. From there, we needed to arrange our data in a way that would let us compare the attributes the dataset describes. To do so, we compared the different genres, their song lengths and keys, and their popularity over time with one another to see what kind of changes we could find. Since this dataset was so thorough, we had all the information we needed without searching for other datasets to complete the timelines. We also found a YouTube video that supported our findings from the dataset, which confirmed that we had all the data we needed to create our visualizations.

With a million songs to draw on, we wanted to pin down the specifics of how music has changed over time. Music has clearly changed along with societal standards throughout the years. Shifts in music production following historical events helped explain why certain genres or songs were popular when they were. We examined the characteristics of music to see each form of change that took place. To present our data in a visually appealing manner, we used GitHub and Python to compare the topics we focused on. We could easily see the most prominent genres in our dataset by constructing a word cloud. We also created a world map that uses dots to show the location of the artist who created each song in the dataset. Our dot plot relates song duration, in seconds, to year to better show how song length shifts as time progresses. We created another visualization that looks at the correlation between the key of a song and its popularity. We then represented tempo in beats per minute (BPM) per genre with a bar plot. Our array of visuals presents our findings in an easily digestible form, allowing viewers to get an overall understanding at a glance.

However, this representation is by no means perfect. Genres can be subjective; many would argue that genres often blend together and form new subgenres. Still, we feel that our results are an accurate representation of the data we used. Genres can influence song length and the key(s) used in a song, but generalizing about these factors alone still gives insight into the psyche of listeners over the years. Artists often fit into many different but related genres, which is why we created the word cloud to highlight the most common subgenres. We categorized the many traits of our data to present the best overall depiction of popular music, from genre popularity to the niches of keys and song length.

