Music Composer Recognition from MIDI Representation using Deep Learning and N-gram Based Methods
To answer conceptually basic queries such as "Who created this piece?", the discipline of computational musicology frequently requires the analysis of detailed characteristics: melodic lines, rhythmic patterns, chords and chord progressions, tonality, and cadenzas, for example, are all employed. It is feasible to create algorithms that recognise these traits in symbolic data, but doing so is challenging and requires considerable expertise. Implementing such an algorithm on audio recordings is significantly more difficult. In the last ten years, however, machine learning research has enabled various attempts to automatically extract important audio properties, using models such as Deep Belief Networks (DBN) and variants of Convolutional Neural Networks (CNN). In this thesis, I implemented several Deep Learning models and an N-gram model for composer recognition. I used different types of features, such as Mel-Spectrograms and Mel-Frequency Cepstral Coefficients (MFCC), with different models such as ResNet, SqueezeNet, and AlexNet. My goal was to test alternative approaches to categorising works of Western classical music in order to better understand each composer's most distinguishing characteristics. I framed the task as a visual recognition problem by treating these features as images. I also incorporated certain non-traditional methods, such as N-grams for composer recognition, a methodology adapted from natural language processing. Some baseline machine learning methods, such as the Naïve Bayes method, are also evaluated. Overall, the best-performing method was the SqueezeNet model using Mel-Spectrograms, achieving 93% accuracy for three composers. The N-gram method also achieved high accuracy in some cases, likewise reaching 93% for three composers.
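To make the feature pipeline concrete, the following is a minimal NumPy sketch of how MFCCs can be derived from a power spectrogram: map FFT bins onto triangular mel filters, take log energies, and apply a DCT-II. This is a generic illustration of the standard formulas (O'Shaughnessy mel scale, unnormalised DCT-II), not the thesis's exact pipeline, which would typically rely on an audio library; all function names and parameter defaults here are assumptions.

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy mel-scale formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters mapping an FFT power spectrum to n_mels bands."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):          # rising slope of the triangle
            fb[i - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):         # falling slope
            fb[i - 1, k] = (right - k) / max(right - centre, 1)
    return fb

def mfcc_from_power(power_frames, fb, n_mfcc=13):
    """Log-mel energies followed by a DCT-II give the MFCC matrix."""
    mel_energy = np.maximum(power_frames @ fb.T, 1e-10)  # avoid log(0)
    log_mel = np.log(mel_energy)
    n_mels = fb.shape[0]
    n = np.arange(n_mels)
    # DCT-II basis (unnormalised), built explicitly with broadcasting.
    basis = np.cos(np.pi / n_mels * (n + 0.5)[None, :] * np.arange(n_mfcc)[:, None])
    return log_mel @ basis.T
```

The resulting matrix (frames x coefficients) is exactly the kind of 2-D array that can be saved as an image and fed to an image classifier such as SqueezeNet.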
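The N-gram idea borrowed from natural language processing can be sketched as follows: treat a piece as a sequence of melodic intervals, extract interval n-grams, and score composers with a smoothed multinomial model. Combining n-gram features with a Naïve Bayes scorer here is an illustrative assumption; the thesis evaluates the two as separate methods, and all class and function names below are hypothetical.

```python
from collections import Counter
import math

def pitch_ngrams(pitches, n=3):
    """Interval n-grams: differences between consecutive MIDI pitch numbers."""
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    return [tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)]

class NgramNaiveBayes:
    """Multinomial Naive Bayes over interval n-grams with add-one smoothing.

    Assumes roughly balanced classes, so the class prior is omitted.
    """
    def __init__(self, n=3):
        self.n = n
        self.counts = {}   # composer -> Counter of n-grams
        self.totals = {}   # composer -> total n-gram count
        self.vocab = set()

    def fit(self, pieces, labels):
        for pitches, composer in zip(pieces, labels):
            grams = pitch_ngrams(pitches, self.n)
            self.counts.setdefault(composer, Counter()).update(grams)
            self.totals[composer] = self.totals.get(composer, 0) + len(grams)
            self.vocab.update(grams)

    def predict(self, pitches):
        grams = pitch_ngrams(pitches, self.n)
        v = len(self.vocab)
        def log_prob(composer):
            c, t = self.counts[composer], self.totals[composer]
            return sum(math.log((c[g] + 1) / (t + v)) for g in grams)
        return max(self.counts, key=log_prob)
```

Using intervals rather than absolute pitches makes the n-grams transposition-invariant, which matters when the same composer's pieces appear in different keys.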