# Using chords to classify songs as happy or sad

*mmoskun/ANN-final-project GitHub Wiki*
## Inspiration

My friends always joke that they never see me without my headphones on, and they might be right: I'm almost always listening to music. I love how no matter how I'm feeling, there is always a song that matches that emotion. Thinking about this made me wonder how music conveys emotion. Lyrics are definitely part of the equation, but sometimes I spend months listening to a song that feels really happy, only to listen to the lyrics more closely and realize that they are actually quite depressing. I decided that looking at which chords a song uses, and the order in which they are played, could offer insight into where the emotional impact of a song comes from.
I decided to focus on the relatively opposite emotions of happy and sad, and to build a binary classifier to evaluate songs. I got my data from ultimate-guitar.com, where users upload and rate lyrics paired with chords. I made a list of songs I considered happy and songs I considered sad, then recorded their chords. I used 9 chords per song, since I felt that would be enough to capture most, if not all, of the chords used in a song, as well as the patterns among them. I included 15 different chords in the training data, which made it a challenge to find songs that used exactly these chords. Since the order of the chords is important and each chord is a kind of snapshot of the song at a moment in time, each chord sits inside its own array within the larger song array. The X training data was an array of these songs, while the Y training data was an array with a 1 indicating a sad song and a 2 indicating a happy song.
Example of an ultimate-guitar page:
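The data layout described above could be sketched like this: each song is a sequence of 9 chords, and each chord is its own one-hot array over the 15-chord vocabulary. The specific chord names below are illustrative assumptions, since the write-up does not list the exact chords used.

```python
import numpy as np

# Hypothetical 15-chord vocabulary; the actual chords chosen for the
# project are not listed, so these are placeholders for illustration.
CHORDS = ["C", "Cm", "D", "Dm", "E", "Em", "F", "Fm",
          "G", "Gm", "A", "Am", "B", "Bm", "F#m"]
CHORD_INDEX = {c: i for i, c in enumerate(CHORDS)}

def encode_song(chords):
    """One-hot encode a 9-chord sequence into a (9, 15) array."""
    assert len(chords) == 9, "each song is represented by 9 chords"
    x = np.zeros((9, len(CHORDS)), dtype=np.float32)
    for t, chord in enumerate(chords):
        x[t, CHORD_INDEX[chord]] = 1.0
    return x

# X: shape (n_songs, 9, 15); Y: 1 = sad, 2 = happy, as in the write-up
X = np.stack([encode_song(["C", "G", "Am", "F", "C", "G", "F", "C", "G"])])
y = np.array([2])  # a song labeled happy
```

Keeping each chord in its own inner array preserves the order of the progression, which is exactly what a recurrent model needs as input.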
## Why LSTM?

Music is sequential, and I wanted to examine whether the order of the chords matters, not just which chords are used. This led me to choose a Recurrent Neural Network for my model, since I wanted to evaluate whether a song is happy or sad based on the sequence of chords in each song. Because LSTMs perform consistently well on sequence tasks, I chose this type of RNN for my model.
## The network

The network is an LSTM with a single layer and a 15-dimensional output space (the number of different chords used in the training data), using sigmoid activation. When I experimented with adding more layers, the accuracy of the model went down, so I decided to keep it simple. The network performed relatively well on the training data and test songs. Since I was recording the data by hand and was limited to songs that used the 15 chords I chose, I trained the model with only a few songs. When I attempted to expand the training set (adding about 20 new songs and 7 new chords), the performance of the network got much worse, so I decided to stick with a few songs and a few chords. It seems that every time a new chord is added, much more data is needed to achieve any kind of accuracy. Additionally, when I added more data and certain chords were common in both happy and sad songs, songs became more difficult to classify, because the mere presence of a chord in a song did not indicate whether the song was happy or sad. To make my network more accurate and add more chords, I would definitely need a much larger source of training data.
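A minimal sketch of a model matching this description, assuming Keras and one-hot chord input of shape (9, 15). The single LSTM layer with a 15-dimensional output and sigmoid activation follows the text above; the final Dense layer, optimizer, and loss are assumptions for a binary happy/sad classifier, not details confirmed by the write-up.

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import LSTM, Dense

N_STEPS = 9    # chords recorded per song
N_CHORDS = 15  # size of the chord vocabulary

model = Sequential([
    Input(shape=(N_STEPS, N_CHORDS)),       # one-hot chord sequence
    LSTM(15, activation="sigmoid"),         # single layer, 15-dim output
    Dense(1, activation="sigmoid"),         # assumed head: P(song is happy)
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

With a head like this, the 1/2 labels from the training data would be mapped to 0/1 before fitting; the network then outputs a probability that can be thresholded at 0.5 to classify a song.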
Some of the network's classifications:
The code for my network can be found here: https://github.com/mmoskun/ANN-final-project/blob/master/finalproject_code.ipynb
and the results can be found here: https://github.com/mmoskun/ANN-final-project/blob/master/results
## Further work

In addition to gathering more data and making a more accurate and comprehensive network for classifying happy and sad songs, another step would be to expand the model to classify more nuanced emotions, like melancholy, hope, or excitement. This network could also be a jumping-off point for a network that generates chord progressions for different emotions, rather than just predicting emotions from existing patterns.
## References

- Code inspired by: https://www.youtube.com/watch?v=UnclHXZszpw
- Info on RNNs: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
- Chords from: https://www.ultimate-guitar.com/