MELODY CLASSIFICATION OF POPULAR MUSIC USING ARTIFICIAL NEURAL NETWORK WITH BACKPROPAGATION
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/76503 |
Institution: | Institut Teknologi Bandung |
Summary: | This research concerns the classification of popular-music melodies using an artificial neural
network (ANN) trained with backpropagation. The melodies are restricted to the vocal part of
popular music. Only the aspects of a melody related to pitch and duration are included in the
dataset. There are three classes: good melody, ordinary melody, and
bad melody. The melodies and ratings used to train the network were obtained from
an online questionnaire. The collected data were then transcribed into music score format and
transposed to the scale of C. Every melody must be encoded into a form that the
network can process. The encoding format is divided into general information about the
melody and note information. The general information consists of parameters such
as tempo, time signature, and anacrusis. The information for each note contains parameters such as
its pitch, accidental, octave position, and duration. In this
research, the deep learning toolkit from MATLAB is used. Mean squared error (MSE) is used
as the loss function. The backpropagation algorithm is gradient descent with momentum
and an adaptive learning rate (traingdx). The network architecture consists of 280
input neurons, one hidden layer with 7 neurons, and one output neuron. Testing used melodies
that neither the respondents nor the network had encountered before. The ratings of the test
melodies were obtained through a second questionnaire. After training, only 48% of the
predictions matched the ratings from the second questionnaire. It is suspected that the
note-by-note encoding format is not suitable for melody classification and that the training dataset
is not large enough. |
---|
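
The summary describes transposing each melody to C and encoding it as general information followed by per-note information. The sketch below illustrates one way this could look in Python. It is not the thesis's actual format: the `Note` class, the sharp-based respelling, the choice to move the tonic to C by the smaller interval, the single-number time signature, and the zero padding up to `max_notes` are all assumptions made for illustration. The abstract does not say how the 280 input values are laid out.

```python
from dataclasses import dataclass

STEPS = ["C", "D", "E", "F", "G", "A", "B"]
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
# Assumed sharp-based spelling of the 12 pitch classes: (step, accidental).
SPELLING = [("C", 0), ("C", 1), ("D", 0), ("D", 1), ("E", 0), ("F", 0),
            ("F", 1), ("G", 0), ("G", 1), ("A", 0), ("A", 1), ("B", 0)]


@dataclass(frozen=True)
class Note:
    """Hypothetical note record with the four per-note parameters."""
    step: str         # letter name, "C" to "B"
    accidental: int   # -1 flat, 0 natural, +1 sharp
    octave: int       # scientific pitch octave (4 contains middle C)
    duration: float   # length in beats


def to_midi(note):
    """Convert a note to its MIDI number (middle C = 60)."""
    return 12 * (note.octave + 1) + SEMITONES[note.step] + note.accidental


def from_midi(number, duration):
    """Rebuild a Note from a MIDI number using the sharp-based spelling."""
    octave, pitch_class = divmod(number, 12)
    step, accidental = SPELLING[pitch_class]
    return Note(step, accidental, octave - 1, duration)


def transpose_to_c(notes, tonic_step, tonic_accidental=0):
    """Shift every note so the given tonic lands on C.

    Assumption: move down by up to 6 semitones, otherwise move up.
    """
    offset = (SEMITONES[tonic_step] + tonic_accidental) % 12
    shift = -offset if offset <= 6 else 12 - offset
    return [from_midi(to_midi(n) + shift, n.duration) for n in notes]


def encode_note(note):
    """Four numbers per note: pitch index, accidental, octave, duration."""
    return [float(STEPS.index(note.step)), float(note.accidental),
            float(note.octave), note.duration]


def encode_melody(tempo, beats_per_bar, anacrusis_beats, notes, max_notes):
    """General information first, then note information, zero-padded."""
    vector = [float(tempo), float(beats_per_bar), float(anacrusis_beats)]
    used = notes[:max_notes]
    for note in used:
        vector.extend(encode_note(note))
    vector.extend([0.0] * (4 * (max_notes - len(used))))
    return vector
```

A fixed `max_notes` gives every melody the same vector length, which a fixed-size input layer requires. The zero padding is an assumed convention.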
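
The training setup in the summary (280 inputs, 7 hidden neurons, one output, MSE loss, gradient descent with momentum and an adaptive learning rate) can be sketched in plain Python as follows. This is a minimal illustration of the idea, not a reproduction of MATLAB's `traingdx`. The tanh hidden activation, linear output, weight initialization range, hyperparameter defaults, and the momentum reset on a rejected step are assumptions, and all function names are hypothetical.

```python
import math
import random

N_IN, N_HID = 280, 7  # layer sizes from the summary; one output neuron


def init_params(rng):
    """Flat layout: hidden weights, hidden biases, output weights, output bias."""
    size = N_HID * N_IN + N_HID + N_HID + 1
    return [rng.uniform(-0.05, 0.05) for _ in range(size)]


def forward(p, x):
    """Return hidden activations (assumed tanh) and the linear output."""
    hidden = []
    for j in range(N_HID):
        row = p[j * N_IN:(j + 1) * N_IN]
        bias = p[N_HID * N_IN + j]
        hidden.append(math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias))
    w2 = N_HID * N_IN + N_HID
    out = sum(p[w2 + j] * hidden[j] for j in range(N_HID)) + p[-1]
    return hidden, out


def mse(p, xs, ts):
    """Mean squared error over the whole batch."""
    return sum((forward(p, x)[1] - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def gradient(p, xs, ts):
    """Backpropagate the MSE through both layers; returns a flat gradient."""
    g = [0.0] * len(p)
    w2 = N_HID * N_IN + N_HID
    for x, t in zip(xs, ts):
        hidden, out = forward(p, x)
        d_out = 2.0 * (out - t) / len(xs)
        g[-1] += d_out
        for j, h in enumerate(hidden):
            g[w2 + j] += d_out * h
            d_hid = d_out * p[w2 + j] * (1.0 - h * h)  # tanh derivative
            g[N_HID * N_IN + j] += d_hid
            base = j * N_IN
            for i, xi in enumerate(x):
                g[base + i] += d_hid * xi
    return g


def train(p, xs, ts, epochs=200, lr=0.01, mc=0.9,
          lr_inc=1.05, lr_dec=0.7, max_perf_inc=1.04):
    """Batch gradient descent with momentum and an adaptive learning rate."""
    step = [0.0] * len(p)
    perf = mse(p, xs, ts)
    history = [perf]
    for _ in range(epochs):
        g = gradient(p, xs, ts)
        new_step = [mc * s - (1.0 - mc) * lr * gi for s, gi in zip(step, g)]
        candidate = [pi + si for pi, si in zip(p, new_step)]
        new_perf = mse(candidate, xs, ts)
        if new_perf > perf * max_perf_inc:
            # Error grew too much: discard the step and shrink the rate.
            lr *= lr_dec
            step = [0.0] * len(p)  # assumed: drop momentum after a rejection
        else:
            if new_perf < perf:
                lr *= lr_inc  # error fell: allow a slightly larger rate
            p, step, perf = candidate, new_step, new_perf
        history.append(perf)
    return p, history
```

With a single output neuron, the three classes would have to be mapped to numeric targets. The abstract does not state how; one hypothetical choice is 0, 0.5, and 1 for bad, ordinary, and good, with predictions rounded to the nearest target.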