Easily analyse audio features from Spotify playlists — Part 1.
A simple guide to getting audio features and preview audio files from Spotify playlists, using Python.

Music is a defining part of our lives. As music listening has become a predominantly digital and online activity, it offers researchers a great avenue for gaining insight into why we listen to music. While the days of mixtapes are long gone, the process of collecting music still exists in the form of playlists. On Spotify, playlists serve both as a way to organize your music and as a way to create collections of tracks suited to different occasions.
Personally, I have playlists for multiple purposes and reasons. One contains tracks I listen to when working. Another contains a parade of nostalgia, and yet another holds songs to get my children to relax. If you search Spotify for playlists, you’ll find thousands of hits for terms such as “sad songs”, “music for sleeping”, or “dancing”.
These playlists offer a way to understand what characterises music for particular occasions. Are the tracks people use for sleeping quantitatively different from those used for dancing? Can you tell a person’s age by looking at their playlists?
These are just two examples of questions one can approach by looking at playlists.
In this post I will show how to use Python together with a simple script, the Generalized Spotify Analyser (GSA), to quickly get metadata and audio features from playlists. This post assumes that you have a certain familiarity with Python and programming, but I will try to make it as accessible as possible.
This post is a first “version” of the GSA, and as it develops I will update it accordingly. Last update was March 17th, 2021.
- Part 1 covers getting up and running.
- Part 2 shows how you can use GSA as part of a larger data collection.
- Part 3 gives some examples of how to analyse audio features.
You can find the scripts used here on GitHub.
Prerequisites
- Python 3
Go to https://www.python.org/downloads/ and follow the instructions there. Make sure to select Python 3.8 or newer.
- A Spotify account
You’ll need both a Spotify account and a Spotify Developer account. Both are free. Go to https://developer.spotify.com/dashboard/ to create the developer account.
- (Optional) An IDE
I use Spyder, an open-source IDE for programming in Python. You can use any other IDE, or none at all, but later parts of this series may be easier to follow if you use Spyder.
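If you are unsure which Python version you have, a quick check along these lines may help. This is only a sketch: the helper name is_supported is made up for illustration, and the 3.8 minimum is simply the version suggested above.

```python
import sys

# The minimum version suggested in this guide.
MINIMUM_VERSION = (3, 8)


def is_supported(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Check the interpreter that is running this snippet.
current = sys.version_info
if is_supported(current):
    print(f"Python {current.major}.{current.minor} is fine for this guide.")
else:
    print(f"Python {current.major}.{current.minor} is older than 3.8; consider upgrading.")
```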
Getting the scripts
First, you need to get the GSA scripts. You can download them from the GitHub repository by pressing “Code” and then “Download ZIP”. Unzip the archive to a directory of your choice.

Getting API access
The next thing you need to do is to get API access. For that, you need to set up an app in the Spotify for Developers dashboard. What you name your app is unimportant. What you need is the following: a Client ID, a Client Secret, a redirect URI, and your Spotify username.
Once you’ve made an app, you can see a Client ID and (by pressing “show client secret”) a Client Secret. Press “edit settings” to find your redirect URIs, and add “http://example.com/callback/” if it isn’t already filled in. Once you have all this information, go to the folder where you extracted the GSA scripts.
In the GSA folder you’ll find a file called “spotifyConstants_template.py”. Edit this file, and paste in the information you found on the Spotify for Developers Dashboard. Now rename this file to “spotifyConstants.py”.
Installing requirements
The next step is installing the packages that GSA requires. These are mostly standard packages you’d use in Python, with the addition of Spotipy, a wrapper for Spotify’s API.
Using your command-line interface, navigate to the folder where you unzipped GSA, then type:
pip install -r requirements.txt
Once this has finished, you are ready to test GSA!
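If you want to confirm that the installation worked before moving on, a small check like the following can be run in Python. It is a sketch, not part of GSA: the helper missing_packages is a made-up name, and only spotipy and pandas are checked because they are the two packages named in this post (requirements.txt may list more).

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# spotipy and pandas are the packages mentioned in this post.
missing = missing_packages(["spotipy", "pandas"])
if missing:
    print("Not installed yet: " + ", ".join(missing))
else:
    print("The packages checked here are installed.")
```

An empty result means both packages can be imported from the Python environment you are currently using. If you have several Python installations, run the check with the same one you used for pip.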
Using GSA
There are two included example scripts in GSA. We’ll go through the basic example first (GSA_basicExample.py). This script shows you how to get audio features for a single playlist, and then download 30-second preview MP3s of the tracks in the playlist.
If you are using the command-line interface, start Python now.
First, we import GSA and pandas (a library for data manipulation in Python).
import GSA
import pandas as pd
Next we need to authenticate with the Spotify API.
GSA.authenticate()
The first time you run the script, this will open a webpage where you give permission for your script to use your credentials through Spotify’s system. Once you agree, you’ll be sent to another webpage. This is the redirect URI mentioned previously. Copy this address and paste it back into the console. You should now be authenticated!
If you have previously authenticated, GSA will refresh your token instead of creating a new one.
Getting playlist information
To get metadata and audio features from a playlist we need its ID. For this example, we’ll use the playlist Made in Norway. You can find the playlist ID by pressing share and selecting “Copy Spotify URI”. This will give you the following text: spotify:playlist:37i9dQZF1DX3hgbB9nrEB1. It is the last part, the string of letters and digits after the final colon, that we are interested in.
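If you would rather not trim the URI by hand, the ID can be pulled out in code. The helper below is a sketch and not part of GSA; the name playlist_id_from is made up for illustration. It accepts a Spotify URI, a share link of the form https://open.spotify.com/playlist/..., or a bare ID.

```python
from urllib.parse import urlparse


def playlist_id_from(text):
    """Extract a playlist ID from a Spotify URI, a share link, or a bare ID."""
    text = text.strip()

    if text.startswith("spotify:"):
        # URI form, e.g. spotify:playlist:37i9dQZF1DX3hgbB9nrEB1
        parts = text.split(":")
        if len(parts) >= 3 and parts[-2] == "playlist":
            return parts[-1]
        raise ValueError("Not a playlist URI: " + text)

    if text.startswith(("http://", "https://")):
        # Link form: the ID is the path segment right after "playlist".
        segments = [s for s in urlparse(text).path.split("/") if s]
        if "playlist" in segments:
            position = segments.index("playlist")
            if position + 1 < len(segments):
                return segments[position + 1]
        raise ValueError("Not a playlist link: " + text)

    # Anything else is assumed to be a bare ID already.
    return text


print(playlist_id_from("spotify:playlist:37i9dQZF1DX3hgbB9nrEB1"))
```

The returned string is what you pass to GSA.getInformation() in the next step.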

For a different approach to getting playlist IDs, see GSA_example.py.
We can now use the GSA.getInformation() function to query Spotify’s API about the playlist.
myPlaylist = GSA.getInformation('37i9dQZF1DX3hgbB9nrEB1', verbose=True)
This creates a pickle file (.pkl) with all the information about the tracks in your playlist, and saves it to the subfolder “Playlists”. The verbose=True argument makes the function output the title of every track. To access the information in Python, we can now read it back by using pandas:
myPlaylistInformation = pd.read_pickle(myPlaylist)
Here we read the pickle file as a dataframe containing all the information, including audio features for each track in your playlist.

Downloading preview MP3s
Most of the tracks on Spotify have an associated 30-second MP3 preview. These are especially useful if you want to have a quick listen to the music in a playlist, or if you want to run a different audio analysis.
I will address the correlations and links between Spotify’s audio features and more regular music information retrieval analysis in a later post.
To download the previews, you can use the GSA.downloadTracks() function. This function takes as input a list consisting of the following information: SampleURL, TrackName, TrackID, and playlistID. These are all present in the dataframe we previously created, so we can extract them from there.
toDownload = myPlaylistInformation[['SampleURL', 'TrackName', 'TrackID', 'playlistID']].values.tolist()
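Since only most tracks have a preview, some rows may lack a SampleURL. GSA.downloadTracks() may well handle this on its own, but if you want to see up front which tracks have no preview, a sketch like the following could be used. Note the assumptions: the helper split_by_preview is a made-up name, the rows below are invented example data, and I am assuming a missing preview shows up as None, NaN, or an empty string.

```python
def split_by_preview(rows):
    """Split rows of [SampleURL, TrackName, TrackID, playlistID]
    into those with a usable preview URL and those without."""
    with_preview, without_preview = [], []
    for row in rows:
        url = row[0]
        # A usable URL is a non-empty string; None and NaN both fail this test.
        if isinstance(url, str) and url.strip():
            with_preview.append(row)
        else:
            without_preview.append(row)
    return with_preview, without_preview


# Invented rows, in the same column order as toDownload.
example_rows = [
    ["https://example.com/preview1.mp3", "Track A", "id_a", "playlist_1"],
    [None, "Track B", "id_b", "playlist_1"],
    [float("nan"), "Track C", "id_c", "playlist_1"],
]

ready, skipped = split_by_preview(example_rows)
print(len(ready), "with preview;", len(skipped), "without")
```

With real data you would call split_by_preview(toDownload) and loop over the first of the two returned lists.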
Now we need to create a loop where we call GSA.downloadTracks() on each track. We also need to keep track of which tracks were successfully downloaded.
# Create a list to keep track of which tracks were successfully downloaded
downloaded = []
# Now download preview MP3s, in a loop:
for track in toDownload:
    # this prints the current track name
    print('Downloading track: ' + track[1])
    success = GSA.downloadTracks(track)
    downloaded.append(success)
The downloaded MP3s can be found in the Audio subfolder.
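Once the loop has finished, the list of results can be turned into a short summary. The sketch below assumes that GSA.downloadTracks() returns something truthy on success and falsy on failure, which is what the variable name success suggests but which I have not spelled out above. The helper summarise_downloads and the example data are made up for illustration.

```python
def summarise_downloads(rows, results):
    """Return (number of successful downloads, names of failed tracks).

    `rows` are [SampleURL, TrackName, TrackID, playlistID] lists and
    `results` holds one success value per row, in the same order.
    """
    failed = [row[1] for row, ok in zip(rows, results) if not ok]
    return len(results) - len(failed), failed


# Invented example: three tracks, the second one failed.
example_tracks = [
    ["https://example.com/a.mp3", "Track A", "id_a", "playlist_1"],
    ["https://example.com/b.mp3", "Track B", "id_b", "playlist_1"],
    ["https://example.com/c.mp3", "Track C", "id_c", "playlist_1"],
]
example_results = [True, False, True]

n_ok, failed_names = summarise_downloads(example_tracks, example_results)
print(f"{n_ok} of {len(example_results)} previews downloaded")
print("Failed:", failed_names)
```

With the variables from this post, the call would be summarise_downloads(toDownload, downloaded).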
Summary
That concludes part 1! You should now have GSA installed, and be able to get information from a single playlist. In part 2 I will show you how you can use GSA as part of a larger data collection. In part 3 we do statistics and visualisations.