When Spotify bought The Echo Nest several years ago, it acquired a team with deep expertise in how computing can be used to analyse music, a field of work that is fundamental to recommendation systems.
Conventionally, researchers had taken one of two approaches: either try to understand how listeners express previous preferences, individually and collectively, and project that historical data into the future; or try to find characteristics in the audio that cluster with previous preferences, and look for new music that seems similar.
The Echo Nest team added a new element which, when I first came across it, seemed to me very insightful. Noticing how groups of listeners bent and stretched the definitions of musical genre, they started looking at human evaluations as a filter through which to see what the computers were saying about the audio.
The humans in the mix might be considered ‘experts’; after all, we would surely not want to cede our cultural definitions too completely to computer science. Now, after several years as the biggest music recommendation platform in the world, Spotify employs experts to add tracks to playlists, and thereby decides not only which artists get paid, but also which music is considered typical of most of the genres in global pop music.
In a music market that has, with a lot of help from Spotify, generated far more new music than its subscribers want to listen to, the more mechanical parts of the recommendation toolkit have gained new importance, both for musicians and for culture. For artists new to the system, the audio analysis acts partly as a gatekeeper while the other signals are built up (or not, as the case may be).
So it’s worth trying to understand what’s going on in the datacentres as new audio files hit the algorithms, which spit out analysis data that will decide which musicians fail fast and which might go on to bigger things. And one of the track parameters that Spotify probably uses is what it calls ‘danceability’.
Spotify exposes its danceability index as a floating-point number between 0 and 1 via its developer API. What do we know about it, and how is it used? Here’s how Spotify describes it:
Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
https://developer.spotify.com/documentation/web-api/reference/#endpoint-get-audio-features
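For illustration, here is a minimal sketch of reading that value over the documented endpoint (the token and track ID are placeholders; you would need a real OAuth access token from Spotify’s authorisation flow):

```python
import requests

# Placeholders -- substitute a real OAuth token and a real track ID.
ACCESS_TOKEN = "YOUR_OAUTH_ACCESS_TOKEN"
TRACK_ID = "SOME_TRACK_ID"

resp = requests.get(
    f"https://api.spotify.com/v1/audio-features/{TRACK_ID}",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=10,
)
resp.raise_for_status()

features = resp.json()
print(features["danceability"])  # a float between 0.0 and 1.0
```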
This metric seems to have some importance in the ecosystem. In a publicly shared Tableau visualisation of Spotify’s top 50 tracks, it’s the audio feature most strongly correlated with success.
https://public.tableau.com/profile/greice7948#!/vizhome/50spotify/top50spotify
From the outside it is impossible to tell whether music that scores highly on this index is inherently more popular with consumers, or whether it accumulates higher play counts because Spotify recommends it more. And we need to remember that familiarity drives preference in music more than preference drives play counts, so it is vitally important to understand what music any consumer service pushes to the top.
It is also an unproven hypothesis that music Spotify scores as more danceable is actually danced to more by consumers. Paradoxically, one of history’s most famous pieces written for dancing, Strauss’s Blue Danube Waltz, scores very low on danceability.
Perhaps due to high variability between bars and other temporal measures, Spotify’s audio feature analyser seems to struggle to categorise some aspects of orchestral tracks. One recording of the waltz has a danceability score of 0.216.
Nobody rational would attempt to dance to The Blue Danube Waltz if they had no information about it other than Spotify’s danceability number.
So on close inspection this opaque metric looks quite dysfunctional. But can it be done better, and if so how? Here’s one idea: with some simple experiments measuring how enthusiastically groups of people dance to different tracks, Spotify’s metric could be compared against a truth-based danceability index.
Groups of dancers could be asked to wear small, unobtrusive monitors while they dance, with data collected over pervasive networks such as 5G and collated to capture the movements they make: whether they are more or less energetic or rhythmic, and how closely they align with patterns in the music.
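As a sketch of what the scoring might look like (everything here is an assumption on my part: the sensor format, the sampling rate, the tolerance band, and the use of spectral power at the beat frequency as a proxy for rhythmic alignment), a wrist-worn accelerometer trace could be reduced to two numbers per dancer per track:

```python
import numpy as np

def movement_scores(accel, fs, bpm):
    """Heuristic scores for one dancer on one track.

    accel: (N, 3) array of accelerometer samples (hypothetical format)
    fs:    sensor sampling rate in Hz
    bpm:   the track's tempo in beats per minute
    Returns (energy, beat_alignment), both unitless.
    """
    # Movement magnitude with the constant (gravity) component removed.
    magnitude = np.linalg.norm(accel, axis=1)
    magnitude = magnitude - magnitude.mean()

    # Energy: RMS of movement -- how vigorously this person danced.
    energy = float(np.sqrt(np.mean(magnitude ** 2)))

    # Rhythmic alignment: fraction of spectral power near the beat
    # frequency and its first harmonic, relative to all movement power.
    power = np.abs(np.fft.rfft(magnitude)) ** 2
    freqs = np.fft.rfftfreq(len(magnitude), d=1.0 / fs)
    beat_hz = bpm / 60.0
    near_beat = np.zeros_like(power, dtype=bool)
    for f in (beat_hz, 2.0 * beat_hz):
        near_beat |= np.abs(freqs - f) < 0.1  # +/- 0.1 Hz tolerance
    beat_alignment = float(power[near_beat].sum() / power.sum())

    return energy, beat_alignment

# Synthetic check: 60 s of 50 Hz samples from a dancer bouncing at
# 120 bpm (2 Hz), with a little sensor noise mixed in.
fs, bpm = 50, 120
t = np.arange(0, 60, 1.0 / fs)
bounce = np.sin(2 * np.pi * 2.0 * t)
accel = np.column_stack([bounce, 0.3 * bounce, np.ones_like(t)])
accel += 0.1 * np.random.default_rng(0).standard_normal(accel.shape)
print(movement_scores(accel, fs, bpm))
```

Averaged over a group of dancers, those two numbers, or some blend of them, would make a plausible truth-based danceability score for a track.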
5G, wearables, and dancers could thus provide a truth-based counterpoint to Spotify’s opaque danceability metric.
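Once both scores exist per track, the comparison itself is almost trivial; the numbers below are invented, purely to show the shape of the analysis:

```python
from scipy.stats import pearsonr

# Invented example data: one entry per track in the experiment.
spotify_danceability = [0.216, 0.81, 0.55, 0.43, 0.92]  # from the API
truth_based_scores   = [0.70, 0.62, 0.48, 0.41, 0.88]   # from the dancers

r, p = pearsonr(spotify_danceability, truth_based_scores)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```

If anyone is in a position to collaborate on making this happen, let me know!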