What Is Project In The Mix?

Project In The Mix is an attempt to uncover interesting and useful emergent data from a database of sampled songs. It was created by Saul Wyner, a Master's student at the University of Michigan School of Information, for SI 618: Exploratory Data Analysis, Fall 2010.

Ok, So Why Is This Interesting?

Sampling, the use of snippets of released audio in other songs, is a controversial but established practice in many music genres, such as electronica and hip hop. It also sits at the forefront of intellectual property law: many of the early debates and decisions over sampling are now being revisited with respect to other media intellectual property issues that have cropped up as the Internet has become a distribution mechanism.

Why Should I Care About Intellectual Property?

If you've ever used any type of online media, from DRM-laden iTunes songs to YouTube videos of someone setting their genitals on fire, intellectual property has been a constant concern. Currently, many of the larger international media corporations and regulatory agencies are attempting to draft internationally consistent laws regarding intellectual property, which may shape the future of the Internet and Internet media as we know it. Given that average Internet bandwidth will only increase as time progresses, and that new methods of media distribution are being created all the time, the Internet will likely become the most important media distribution method in the world.

What Can This Project Investigate?

One of the main issues in intellectual property and copyright law is fair use, the permission to freely use copyright-protected materials in certain contexts. It's an inordinately complex issue, but the basic idea is that you can use a portion of a work if your use does not cost the copyright holder value. For example, if you quote a line of dialogue from a book, that quote is unlikely to satisfy a reader's need to read the whole text, so the reader would still have to purchase the book. However, if you 'quoted' an entire relevant chapter, that might not be fair use, since a reader could get everything they need from your quote rather than purchasing the text.

Sampling has long been debated as a possible application of fair use, although current legal rulings generally disagree while still leaving the question unsettled. As it stands, samples must be 'cleared' by the copyright holder, which generally means paying substantial royalties to a label and artist, even for a relatively minor sample. The penalties for not clearing a sample can be dire, as in the infamous 'Bitter Sweet Symphony' case. A full investigation into the relationships between sampling songs and the songs they sample, covering popularity, sales, other songs that sample the same source, and similar factors, could shed some light on the effects of sampling.

Issues to be addressed:

  • Uncleared Samples: Many current and past songs, especially early hip hop songs, contain uncleared samples, which were never paid for or announced on release. As such, finding these samples is very difficult. Luckily, fans have built several collaborative databases of samples and sampling songs, such as the-breaks.com, where this information, although spotty, can be found.
  • Other Song Info: These databases generally contain only the basics about the sampled and sampling songs, and an in-depth investigation needs more information. Several general music database APIs can provide it.
  • Consistency: Because the sampling databases are user-contributed, much of their information is somewhat inconsistent, and some of it is missing. This must be dealt with, especially when interfacing with other music APIs.
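One way the consistency issue might be approached is to reduce each artist and title to a normalized key before looking the song up in another music API. The sketch below is only an illustration under assumptions: the function name match_key and the specific cleanup rules (stripping accents, dropping parenthetical suffixes, removing a leading "the") are hypothetical choices, not the project's actual method.

```python
import re
import unicodedata
from typing import Tuple


def _normalize(text: str) -> str:
    """Reduce a free-text artist or title to a rough comparison form."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower().replace("&", " and ")
    # Drop parenthetical or bracketed suffixes such as "(Radio Edit)".
    text = re.sub(r"[\(\[].*?[\)\]]", " ", text)
    # Remove remaining punctuation, then collapse runs of whitespace.
    text = re.sub(r"[^a-z0-9\s]", "", text)
    text = re.sub(r"\s+", " ", text).strip()
    # Treat "The Verve" and "Verve" as the same artist (an assumption).
    if text.startswith("the "):
        text = text[4:]
    return text


def match_key(artist: str, title: str) -> Tuple[str, str]:
    """Hypothetical helper: build a key for joining records across sources."""
    return _normalize(artist), _normalize(title)
```

Two entries that differ only in case, punctuation, or an edit suffix then share a key, so they can be joined before the second source is queried. Records with a missing artist or title would still need separate handling.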

Resulting questions:

  • Are there notable trends in sampling?

  • Is there a reason why some songs are sampled and others are not?

  • Are sampling and sampled songs correlated in some way?
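The correlation question could be explored, as a first pass, with a Pearson correlation between a measure for each sampled song and the same measure for the songs that sample it. This is a minimal sketch, assuming paired numeric lists are already available; the function name pearson and the choice of Pearson correlation are assumptions, and no real project data is shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Here xs might hold a popularity measure for each sampled song and ys the matching measure for the sampling song. A value near 1 or -1 would suggest a strong linear relationship, though it would say nothing about cause.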