Music4All: A New Music Database and Its Applications

One of the goals of the music information retrieval (MIR) community is to research new methods and create new systems that can efficiently and effectively retrieve and recommend songs from large databases of music content. Despite the volume of research in the area, few music databases are available to support this work, i.e., databases that meet the requirements most desirable for research: a large number of music pieces, availability of the audio signal, and a wide diversity of audio attributes. To contribute to the MIR community, we present Music4All, a new music database that contains metadata, tags, genre information, 30-second audio clips, lyrics, and more. The process used to create the database is presented in the following figure.

The database contains 15,602 anonymous users, their listening histories, and 109,269 songs represented by their audio clips, lyrics, and 16 other metadata/attributes, as follows:

  • id: Unique 16-character identifier for the song in the database.

  • artist: Name of the artist who published the song on last.fm. There are 16,269 unique artists in the database.

  • song: Name of the song.

  • lang: Language assigned to the lyrics by the lang detect tool. There are 46 unique languages in the database.

  • spotifyid: Song identifier in the Spotify application.

  • popularity: Integer value ranging from 0 to 100 representing how popular a song is on Spotify. This value is based on the total number of plays the song has and how recent those plays are.

  • album name: Name of the album that the song is in. There are 38,363 different albums in the database.

  • release: Year in which the song was released.

  • danceability: Real value ranging from 0.0 to 1.0 representing how suitable the song is for dancing. This value is based on a combination of musical elements, provided by the Spotify API.

  • energy: Real value ranging from 0.0 to 1.0, provided by the Spotify API, representing a perceptual measure of intensity and activity.

  • key: Overall key of the song, using standard Pitch Class notation, provided by the Spotify API.

  • mode: Binary value provided by the Spotify API corresponding to the modality of the song, where major is represented by 1 and minor by 0.

  • valence: Real value ranging from 0.0 to 1.0 provided by the Spotify API that measures how positive a song is.

  • tempo: Speed or pace of the song, measured in beats per minute (BPM), provided by the Spotify API.

  • genres: List of genre tags associated with the song. There are 853 unique genre tags in the database.

  • tags: User-given tags from the last.fm application, with 19,541 unique tags.
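
As a rough illustration of the attribute list above, the following Python sketch models one song as a record and checks the value ranges stated in the descriptions. It is not part of the database distribution: the class name SongRecord, the method validate, the Python types, and the identifiers spotify_id and album_name (renamed from "spotifyid" and "album name" so they are valid Python names) are all assumptions made for this example, and the on-disk file format of the database is not assumed here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SongRecord:
    """Hypothetical in-memory view of one song and its 16 attributes."""
    id: str              # unique 16-character identifier
    artist: str
    song: str
    lang: str            # language assigned to the lyrics
    spotify_id: str      # "spotifyid" in the list above (renamed here)
    popularity: int      # 0 to 100
    album_name: str      # "album name" in the list above (renamed here)
    release: int         # release year
    danceability: float  # 0.0 to 1.0
    energy: float        # 0.0 to 1.0
    key: int             # pitch class; integer type is an assumption
    mode: int            # 1 = major, 0 = minor
    valence: float       # 0.0 to 1.0
    tempo: float         # beats per minute
    genres: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of violated constraints (empty if none)."""
        problems = []
        if len(self.id) != 16:
            problems.append("id must have exactly 16 characters")
        if not 0 <= self.popularity <= 100:
            problems.append("popularity must be in [0, 100]")
        for name in ("danceability", "energy", "valence"):
            if not 0.0 <= getattr(self, name) <= 1.0:
                problems.append(f"{name} must be in [0.0, 1.0]")
        if self.mode not in (0, 1):
            problems.append("mode must be 0 (minor) or 1 (major)")
        return problems


# Made-up placeholder values, not an actual entry of the database.
example = SongRecord(
    id="0123456789abcdef", artist="Some Artist", song="Some Song",
    lang="en", spotify_id="placeholder", popularity=50,
    album_name="Some Album", release=2000, danceability=0.5,
    energy=0.5, key=0, mode=1, valence=0.5, tempo=120.0,
    genres=["rock"], tags=["favorite"],
)
print(example.validate())
```

Returning a list of problems instead of raising on the first one makes it easier to report every out-of-range attribute of a record in a single pass.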


More information about the Music4All database can be found in:


To request the Music4All database, send an e-mail to contact4music4all@gmail.com.


How to cite the database: Igor André Pegoraro Santana, Fabio Pinhelli, Juliano Donini, Leonardo Catharin, Rafael Biazus Mangolin, Yandre Maldonado e Gomes da Costa, Valéria Delisandra Feltrim, and Marcos Aurélio Domingues. Music4All: A New Music Database and its Applications. In: 27th International Conference on Systems, Signals and Image Processing (IWSSIP 2020), 2020, Niterói, Brazil. p. 1-6.