It’s been a while since I did something with Spark. It’s one of Apache’s most contributed-to open source projects, so I was keen to see what had been developed since I last had a play. One of the big changes has been the development of a side project, Zeppelin. Zeppelin is a web-based notebook that works as an interactive development environment for Spark, much like Hue does for Hive. My aim here was to have a play with Zeppelin and see if I could use it to develop a machine learning process. I needed some data, and the obvious dataset would be something to do with Led Zeppelin. So I used the Spotify API to download Echo Nest audio features for the songs on all of Led Zeppelin’s studio albums. My plan was to do some unsupervised clustering to group together songs with similar audio features.
In the past I’ve built apps with R Shiny, and I’ve also developed a few data visualisations with d3.js. Given that R Shiny is an R-based back-end server that renders a front end in JavaScript, it seemed like it should be possible to integrate a d3.js visualisation into an R Shiny app. After some quick research, it turns out that it is possible. This blog explains how to do it, and here is an example (please note this is hosted on Shiny.io and sometimes runs out of free hours each month).
A while back I created an R package to pull data out of the Spotify API and turn it into a d3.js visualisation. Here is the blogpost. I’ve started to teach myself Python, and I’ve now rebuilt this process with it. The exciting part is that, as it’s in Python, I can use the Google App Engine to create an app that hosts the code online. That means anyone can generate a related artists visualisation. Hurray! Have a go yourself by following this link.
To find out more about how it’s done, read on…
It now includes getArtistsAlbums, which takes the output from a getArtists search, finds the albums by that artist and outputs a data.frame. This can be followed up by getAlbumsTracks, which finds all the tracks from those albums and creates a data.frame. Finally, I’ve added visDiscography. This uses both of those functions, along with the get artist function, to create an interactive visualisation of an artist’s discography.
This is my first post, so I needed some data to play with. I’ve been wanting to learn more about APIs, so tackling the Spotify API seemed like a great place to start. I soon came across the related artists function in the API, and that gave me a great idea. What if you could map out and visualise how your favourite artists relate to each other according to Spotify? It could be a useful way to discover new, similar artists. A visual recommendation engine.
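To give a rough feel for the idea, here is a minimal Python sketch of how a related artists map could be assembled. It is not the code from the post. The names `build_related_graph` and `fetch_related` are hypothetical, and `fetch_related` stands in for whatever call actually asks Spotify for an artist's related artists. The nodes-and-links output shape is just one common layout for d3.js force-directed graphs, assumed here for illustration.

```python
from collections import deque
from typing import Callable, Dict, List


def build_related_graph(
    seed: str,
    fetch_related: Callable[[str], List[str]],  # hypothetical lookup, e.g. a wrapper around an API call
    max_depth: int = 2,
) -> Dict[str, list]:
    """Walk outwards from a seed artist, breadth first, and collect nodes and links."""
    nodes = [seed]           # artist names in discovery order
    index = {seed: 0}        # artist name -> position in nodes
    links = []               # directed edges between node positions
    queue = deque([(seed, 0)])

    while queue:
        artist, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand artists at the depth limit
        for related in fetch_related(artist):
            if related not in index:
                index[related] = len(nodes)
                nodes.append(related)
                queue.append((related, depth + 1))
            links.append({"source": index[artist], "target": index[related]})

    return {"nodes": [{"name": name} for name in nodes], "links": links}


# Usage with placeholder names instead of real API responses.
placeholder = {"A": ["B", "C"], "B": ["A", "D"]}
graph = build_related_graph("A", lambda name: placeholder.get(name, []))
```

Passing the lookup in as a function keeps the graph walk separate from any network code, so it can be tried out with placeholder data first. Note that links are directed, so two artists that list each other produce two links.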