Education Bazaar 2016

Predicting the next big hit?
Promises and limitations of big (music) data

Complimentary downloads for a lecture at Education Bazaar

(Tilburg University, 15 September 2016)

Short description

Goal: Get some data and write some code to predict whether a song will end up in the Top 10 of the Dutch streaming charts.

Data: (chart positions) and Spotify's Web API (acoustic attributes)

Let's start coding!

(Downloads at the bottom of this page)

Requires R (for nerds; for Stata/SPSS lovers), a pretty cool statistical package. It's free, and there are wonderful tutorials out there to learn it. Curious? Check out this tutorial.

Material (for downloads, see the bottom of this page):

1-3 are for starters
4-5 are for (soon-to-be) experts
  1. Slides
  2. Complete dataset (CSV file)
  3. R Code to perform the analysis
  4. Python code to
    1. download chart data from   
    2. make API requests to Spotify’s Web API to obtain data on acoustic attributes
    3. store both data sets in a SQLite database
  5. Code to merge the data sets from 4.1 and 4.2
  6. Video recording of the lecture (starts around minute 36...)
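
To give a feel for steps 4.3 and 5, here is a minimal sketch of storing chart data and acoustic attributes in SQLite and merging them into one row per song. It is not the code from the downloads: the table names, column names, and the two helper functions are assumptions made for illustration, and only two acoustic attributes (danceability and energy) are included.

```python
import sqlite3


def build_database(chart_rows, feature_rows, path=":memory:"):
    """Store chart positions and acoustic attributes in two SQLite tables.

    The schema is hypothetical; the real scripts may use other names.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS charts "
        "(track_id TEXT, chart_date TEXT, position INTEGER)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features "
        "(track_id TEXT PRIMARY KEY, danceability REAL, energy REAL)"
    )
    conn.executemany("INSERT INTO charts VALUES (?, ?, ?)", chart_rows)
    conn.executemany(
        "INSERT OR REPLACE INTO features VALUES (?, ?, ?)", feature_rows
    )
    conn.commit()
    return conn


def merge_for_analysis(conn):
    """Join both tables: one row per song with its peak position and a Top 10 flag."""
    query = """
        SELECT c.track_id,
               MIN(c.position) AS peak_position,
               MIN(c.position) <= 10 AS top10,
               f.danceability,
               f.energy
        FROM charts AS c
        JOIN features AS f ON f.track_id = c.track_id
        GROUP BY c.track_id, f.danceability, f.energy
        ORDER BY c.track_id
    """
    return conn.execute(query).fetchall()
```

Calling `build_database(chart_rows, feature_rows)` followed by `merge_for_analysis(conn)` returns tuples that can be written to a CSV file and analyzed in R. The `top10` column is 1 if the song ever reached position 10 or better, which is the outcome the lecture tries to predict.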

Note: Python may be tough to learn if you do not have any prior experience. You can also find good tutorials on Python here. Please read the instructions in the Python code carefully, as you need to install extra packages to be able to run the scripts. Further, you need a Spotify account and must register as an app developer at Spotify (so that you are allowed to make API requests).
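
As a rough illustration of why the developer registration matters, the sketch below builds (but does not send) the two HTTP requests involved: one that exchanges your app's client ID and secret for an access token, and one that asks for a track's acoustic attributes. It assumes Spotify's client credentials flow and the `audio-features` endpoint as I understand them; the helper function names are made up for this example, and you should check Spotify's current Web API documentation before relying on the URLs.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoints; verify them against Spotify's current documentation.
TOKEN_URL = "https://accounts.spotify.com/api/token"
AUDIO_FEATURES_URL = "https://api.spotify.com/v1/audio-features/"


def build_token_request(client_id, client_secret):
    """Build the POST request that trades app credentials for an access token."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"})
    return urllib.request.Request(
        TOKEN_URL,
        data=body.encode("ascii"),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def build_audio_features_request(track_id, access_token):
    """Build the GET request for one track's acoustic attributes."""
    url = AUDIO_FEATURES_URL + urllib.parse.quote(track_id, safe="")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )
```

Passing either request object to `urllib.request.urlopen` would send it; the responses are JSON, so `json.load` can turn them into Python dictionaries. The client ID and secret come from the app you register in your Spotify developer account.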


Happy coding!

Update 20-9-2016

I learnt that a PhD student at Tilburg has already transferred the data and code to a public GitHub repository. Want to learn about GitHub? Check out this guide by a great organization called "The Software Carpentry", which gives lessons in scientific programming.

Want to contribute or check out the improved code directly? See here.

Hannes Datta,
Sep 15, 2016, 2:57 AM