Easily analyse audio features from Spotify playlists — Part 1.

Published in

Analytics Vidhya

6 min read · Mar 8, 2021

A simple guide to getting audio features and preview audio files from Spotify playlists, using Python.

Music is a defining part of our lives. As music listening has become a predominantly digital and online activity, it offers a great avenue for researchers to gain insights into why we listen to music. While the days of mixtapes are long gone, the process of collecting music still exists in the form of playlists. On Spotify, playlists serve both as a way to organize your music and as a way to create collections of tracks suited for different occasions.

Personally, I have playlists for multiple purposes and reasons. One contains tracks I listen to when working. Another contains a parade of nostalgia, and yet another holds songs to get my children to relax. If you search Spotify for playlists, you’ll find thousands of hits for terms such as “sad songs”, “music for sleeping”, and “dancing”.

These playlists offer a way to understand what characterises music for particular occasions. Are the tracks people use for sleeping quantitatively different from those used for dancing? Can you tell a person’s age by looking at their playlists?

These are just two examples of questions one can approach by looking at playlists.

In this post I will show how to use Python together with a simple script, the Generalized Spotify Analyser (GSA), to quickly get metadata and audio features from playlists. This post assumes that you have a certain familiarity with Python and programming, but I will try to make it as accessible as possible.

This post covers a first “version” of GSA, and I will update the post as the script develops. The last update was March 17th, 2021.

Part 1 covers getting up and running.
Part 2 shows how you can use GSA as part of a larger data collection.
Part 3 gives some examples of how to analyse audio features.

You can find the scripts used here on GitHub.

Prerequisites

Python 3

Go to https://www.python.org/downloads/ and follow the instructions there. Make sure to select Python 3.8 or newer.
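
If you are unsure which version you ended up with, a quick check from within Python can confirm it. This is only a small sketch; the function name and the constant are my own and not part of GSA.

```python
import sys

MINIMUM_VERSION = (3, 8)  # the minimum version recommended in this post


def python_is_recent_enough(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the given (or the running) interpreter meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), e.g. (3, 9) >= (3, 8)
    return tuple(version_info[:2]) >= minimum


print("Running Python", sys.version.split()[0])
print("Recent enough for this guide:", python_is_recent_enough())
```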

A Spotify account

You’ll need both a Spotify account and a Spotify Developer account. Both are free. Go to https://developer.spotify.com/dashboard/ to set up the developer account.

(Optional) An IDE

I use Spyder, an open-source IDE for programming in Python. You can use any other IDE, or none at all, but later parts of this series may be easier to follow if you use Spyder.

Getting the scripts

First, you need to get the GSA scripts. You can download them from the GitHub repository by pressing “Code” and then “Download ZIP”. Unzip the archive to a directory of your choice.

Press Download ZIP on GitHub to get the scripts.

Getting API access

The next thing you need to do is to get API access. For that, you need to set up an app in the Spotify for Developers dashboard. What you name your app is unimportant. What you need is the following: a Client ID, a Client Secret, a redirect URL, and your Spotify username.

Once you’ve made an app, you can see a Client ID and (by pressing “show client secret”) a Client Secret. Press “edit settings” to find your redirect URIs, and add “http://example.com/callback/” if it isn’t already filled in. Once you have all this information, go to the folder where you extracted the GSA scripts.

In the GSA folder you’ll find a file called “spotifyConstants_template.py”. Edit this file, and paste in the information you found on the Spotify for Developers Dashboard. Now rename this file to “spotifyConstants.py”.
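
A typo or a blank field here is an easy mistake to make, so it can be worth sanity-checking the four values before moving on. The sketch below is purely illustrative: the dictionary keys and the `check_settings` helper are hypothetical names of my own, and the variable names in the actual template file may differ.

```python
from urllib.parse import urlparse


def check_settings(settings):
    """Return a list of human-readable problems found in a settings dict.

    The key names used here are hypothetical, not GSA's own.
    """
    problems = []
    for key in ("client_id", "client_secret", "redirect_uri", "username"):
        value = settings.get(key, "")
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{key} is empty")

    redirect = settings.get("redirect_uri", "")
    if isinstance(redirect, str) and redirect.strip():
        parsed = urlparse(redirect)
        # A redirect URI should be a full URL, including the scheme
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("redirect_uri is not a full http(s) URL")
    return problems


# Placeholder values only; substitute the ones from your own dashboard.
example = {
    "client_id": "your-client-id",
    "client_secret": "",
    "redirect_uri": "http://example.com/callback/",
    "username": "your-username",
}
print(check_settings(example))  # prints: ['client_secret is empty']
```

Keep in mind that the redirect URI has to match the one registered in the dashboard exactly, including the trailing slash.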

Installing requirements

The next step is installing the packages that make GSA work. These are mostly packages you would regularly use in Python, with the addition of SpotiPy, a wrapper for Spotify’s API.

Using your command-line interface, navigate to the folder where you unzipped GSA, then type:

pip install -r requirements.txt

Once this has finished, you are ready to test GSA!
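
If you want to confirm that the installation worked, you can ask Python whether the packages can be found. This sketch only checks the two packages named in this post (requirements.txt may list more), and `missing_packages` is a helper name I made up for the example.

```python
import importlib.util


def missing_packages(module_names):
    """Return the names from module_names that Python cannot locate."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Only the two packages mentioned in this post are checked here.
needed = ["spotipy", "pandas"]
missing = missing_packages(needed)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All checked packages are importable.")
```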

Using GSA

There are two included example scripts in GSA. We’ll go through the basic example first (GSA_basicExample.py). This script shows you how to get audio features for a single playlist, and then download 30-second preview MP3s of the tracks in the playlist.

If you are using the command-line interface, start the Python interpreter now.

First we import GSA and pandas (a library for data manipulation in Python).

import GSA
import pandas as pd

Next we need to authenticate with the Spotify API.

GSA.authenticate()

The first time you run the script, this will open a webpage where you give your script permission to use your credentials through Spotify’s system. Once you agree, you’ll be sent to another webpage. This is the redirect URI mentioned previously. Copy this address and paste it back into the console. You should now be authenticated!

If you have previously authenticated, GSA will refresh your token instead of creating a new one.
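
If you are curious why pasting the address works: in this kind of authorization flow, Spotify appends a `code` query parameter to the redirect URI, and that code is then exchanged for an access token. You never have to do this by hand, since the authentication step takes care of it, but a rough sketch of the parsing looks like the following. The URL and the code value are made up, and `extract_auth_code` is my own helper name.

```python
from urllib.parse import urlparse, parse_qs


def extract_auth_code(redirected_url):
    """Return the value of the `code` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirected_url).query)
    values = query.get("code")
    return values[0] if values else None


# A made-up example of the address you would copy from the browser
example = "http://example.com/callback/?code=AQB_example_code"
print(extract_auth_code(example))  # prints: AQB_example_code
```

If you decline the permission request, the address carries an `error` parameter instead, and the function above returns None.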

Getting playlist information

To get metadata and audio features from a playlist we need its ID. For this example, we’ll use the playlist Made in Norway. You can find the playlist ID by pressing share and selecting “Copy Spotify URI”. This will give you the following text: spotify:playlist:37i9dQZF1DX3hgbB9nrEB1. It is the string of letters and digits after the last colon that we are interested in.

Image showing how to get the playlist ID

For a different approach to getting playlist IDs, see GSA_example.py.
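
If you would rather not trim the ID by hand, a few lines of Python can pull it out of either a Spotify URI or a share link. This is a minimal sketch and not part of GSA; `playlist_id_from` is a name I chose for the example, and I assume share links follow the open.spotify.com/playlist/<id> pattern.

```python
from urllib.parse import urlparse


def playlist_id_from(text):
    """Extract the playlist ID from a Spotify URI or an open.spotify.com link.

    Returns None if the text does not look like a playlist reference.
    """
    text = text.strip()
    if text.startswith("spotify:"):
        parts = text.split(":")
        # Expected shape: spotify:playlist:<id>
        if len(parts) >= 3 and parts[-2] == "playlist" and parts[-1]:
            return parts[-1]
        return None

    parsed = urlparse(text)
    segments = [s for s in parsed.path.split("/") if s]
    # Expected shape: https://open.spotify.com/playlist/<id>?si=...
    if parsed.netloc == "open.spotify.com" and "playlist" in segments:
        index = segments.index("playlist")
        if index + 1 < len(segments):
            return segments[index + 1]
    return None


print(playlist_id_from("spotify:playlist:37i9dQZF1DX3hgbB9nrEB1"))
# prints: 37i9dQZF1DX3hgbB9nrEB1
```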

We can now use the GSA.getInformation() function to query Spotify’s API about the playlist.

myPlaylist = GSA.getInformation('37i9dQZF1DX3hgbB9nrEB1', verbose=True)

This creates a pickle file (.pkl) with all the information about the tracks in your playlist, and saves it to the subfolder “Playlists”. The verbose=True argument makes the function output the title of every track. To access the information in Python, we can now read it back by using pandas:

myPlaylistInformation = pd.read_pickle(myPlaylist)

Here we read the pickle file as a dataframe containing all the information, including audio features for each track in your playlist.

Example output, here shown in Spyder’s variable explorer.

Downloading preview MP3s

Most of the tracks on Spotify have an associated 30-second MP3 preview. These are especially useful if you want to have a quick listen to the music in a playlist, or if you want to run a different audio analysis.

I will address the correlations and links between Spotify’s audio features and more regular music information retrieval analysis in a later post.

To download the previews, you can use the GSA.downloadTracks() function. This function takes as input a list consisting of the following information: SampleURL, TrackName, TrackID, and playlistID. These are all present in the dataframe we previously created, so we can extract them from there.

toDownload = myPlaylistInformation[['SampleURL', 'TrackName', 'TrackID', 'playlistID']].values.tolist()
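
Since not every track has a preview, you may want to see up front which rows actually carry a preview URL. The sketch below works on a list shaped like toDownload. I am assuming that a missing preview shows up as None, NaN, or an empty string, and GSA may well handle such rows on its own; `split_by_preview` and the example rows are made up for illustration.

```python
def split_by_preview(tracks):
    """Split [SampleURL, TrackName, TrackID, playlistID] rows by preview availability."""
    with_preview, without_preview = [], []
    for track in tracks:
        url = track[0]
        # A usable URL is a non-empty string; None and NaN (a float) are not.
        if isinstance(url, str) and url.strip():
            with_preview.append(track)
        else:
            without_preview.append(track)
    return with_preview, without_preview


# Stand-in rows in the same column order as toDownload (values are made up)
example_tracks = [
    ["https://example.com/a.mp3", "Track A", "id_a", "pl_1"],
    [None, "Track B", "id_b", "pl_1"],
    [float("nan"), "Track C", "id_c", "pl_1"],
]
ok, skipped = split_by_preview(example_tracks)
print(len(ok), "with preview;", len(skipped), "without")
```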

Now we need to create a loop where we call GSA.downloadTracks() on each track. We also need to keep track of which tracks were successfully downloaded.

# Create a list to keep track of which tracks were successfully downloaded
downloaded = []

# Now download preview MP3s, in a loop:
for track in toDownload:
    print('Downloading track: ' + track[1])
    # this prints the current track name
    success = GSA.downloadTracks(track)
    downloaded.append(success)

The downloaded MP3s can be found in the Audio subfolder.
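
Once the loop has finished, the list of results can be lined up against the track list to see what is missing. This sketch assumes that GSA.downloadTracks() returns something truthy on success and falsy otherwise, which I have not verified against every case; the helper name and the stand-in data are mine.

```python
def summarise_downloads(tracks, results):
    """Return (succeeded, failed) lists of track names.

    `tracks` holds [SampleURL, TrackName, TrackID, playlistID] rows and
    `results` holds one download result per row, in the same order.
    """
    succeeded, failed = [], []
    for track, result in zip(tracks, results):
        name = track[1]  # TrackName is the second column
        if result:
            succeeded.append(name)
        else:
            failed.append(name)
    return succeeded, failed


# Stand-in data shaped like toDownload and downloaded (values are made up)
example_tracks = [
    ["https://example.com/a.mp3", "Track A", "id_a", "pl_1"],
    [None, "Track B", "id_b", "pl_1"],
]
example_results = [True, False]

succeeded, failed = summarise_downloads(example_tracks, example_results)
print("Downloaded:", succeeded)  # prints: Downloaded: ['Track A']
print("Missing:", failed)        # prints: Missing: ['Track B']
```

With the real data you would call summarise_downloads(toDownload, downloaded) instead.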

Summary

That concludes part 1! You should now have GSA installed, and be able to get information from a single playlist. In part 2 I will show you how you can use GSA as part of a larger data collection. In part 3 we do statistics and visualisations.


Written by Ole Adrian Heggli
