Milestone 1: Basic Linear Model - mbabbott/cse517a_mbabbott-ccarlos_application_project GitHub Wiki

Milestone 1

README

  • We used Python's SciPy machine learning library to implement a linear logistic classifier, evaluated with tenfold cross-validation.

  • As a small group, our only task was Milestone 1. We built a dataset from downloaded MIDI files using Python, then used the built-in SciPy methods to perform the machine learning task.

  • Our biggest issue was finding a dataset of MIDI files that included the musical key of each piece. Because we could not find such a dataset, we manually added the key information to a selection of pieces taken from a repository of MIDI transcriptions of J.S. Bach's works.

  • Resources used:

  • To run the code, run the linModel10CV Python file. Inside the code, the file paths should point at two .csv files: one containing bag-of-notes data from G Major pieces, the other containing bag-of-notes data from D Major pieces.

  • We are including the dataset .csv files because they are small and are our own custom aggregated data; if this is a bad idea, we can remove them. The code needs these .csv files to run as written, but in principle you could use any dataset formatted in the same way (for formatting information, see data). If you have a repository of appropriate MIDI files, you can also run midi2bagofratio to convert those files to bag-of-ratio .csv files in the same format we have been using.
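
As a rough illustration of the procedure described above (logistic classification of two keys, scored with tenfold cross-validation), here is a dependency-free sketch. It is not the contents of linModel10CV and does not use the SciPy methods mentioned above. The function names, the gradient-descent trainer, and the synthetic 12-bin ratio vectors standing in for the G Major and D Major .csv files are all assumptions made for illustration; the real column layout is described on the data page.

```python
import math
import random


def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Class 1 when the linear score is non-negative, otherwise class 0.
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def cross_validate(X, y, k=10, seed=0):
    """Return the held-out accuracy of each of the k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = train_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, b, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Synthetic stand-in for the two .csv files (ASSUMPTION: one row per piece,
# 12 note ratios per row). Each fake piece emphasises one bin.
rng = random.Random(1)


def fake_piece(strong_bin):
    counts = [rng.random() for _ in range(12)]
    counts[strong_bin] += 3.0
    total = sum(counts)
    return [c / total for c in counts]


X = [fake_piece(0) for _ in range(30)] + [fake_piece(6) for _ in range(30)]
y = [0] * 30 + [1] * 30  # 0 and 1 play the roles of the two keys

scores = cross_validate(X, y, k=10)
print("mean tenfold accuracy:", sum(scores) / len(scores))
```

To try this on real data, the synthetic `X` and `y` would be replaced by rows read from the two .csv files, labelled 0 for one key and 1 for the other.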