The Evolution of Pop Lyrics and a tale of two LDA’s

word cloud d3

Inspired by this amazing Paper, that used audio signalling processing to analyse 30 second clips, from around 17K pop songs, to understand the evolution of Pop music over the last 50 years. I thought it would be interesting to see if something similar could be done with Pop lyrics.

The Evolution of Pop paper explains how Latent Dirichlet Allocation (LDA) was used to describe musical topics in each song. These topics were based on the chord progressions, timbre, and harmonics in the song, as derived using audio signal processing techniques. The songs were then clustered together, according to these topics, to give 13 major genres of pop music. LDA is traditionally used in text analysis. I was interested to see if I could classify the same songs into the 13 major genres according to the lyrics in the songs.

Continue reading

Predicting gender from musical tastes

forest cp

Continuing on my mission to get better at Python I’ve been learning about the Pandas and sklearn libraries. I was looking for a challenge to use these libraries on and I had recently come across a nice lastfm data extract. The data contains around 360K users and a little demographic data on them (gender, age, country), and a count of listens they have had for each artist. I decided to try and build a model that could predict someone’s gender based on what they have been listening to.
Continue reading