MULTILABEL MOOD CLASSIFICATION BASED ON INDONESIAN SONG LYRICS FEATURES
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/63735 |
Institution: | Institut Teknologi Bandung |
Summary: | Previous research has addressed mood classification of Indonesian songs, but only as single-label classification, although a song can carry more than one mood label. Multilabel classification of musical mood has also been studied, but only with audio features. This study performs multilabel mood classification of Indonesian songs using lyrics only. Research on single-label mood classification shows that the best results come from lyrics features alone, which outperform both audio features and the combination of audio and lyrics. Song lyrics use a variety of language styles. Sarcasm inserted into a song changes its meaning and impression, a problem that can be handled with sarcasm detection.
This study builds a mood dataset for Indonesian songs. The dataset was obtained by crawling, and the resulting 1000 songs were then annotated, since annotation is required before the data can be used for machine learning. Annotators were divided into two groups of 3 annotators, and each group annotated 500 songs. Each song was labeled with moods drawn from sad, happy, angry, and relaxed. Inter-annotator reliability for the multilabel scenario was analyzed using Krippendorff's alpha. The alpha value for each group is below 0.667, which can be interpreted as low inter-rater reliability. This makes building a mood classification model challenging, because agreement between human annotators is still weak.
This study uses problem transformation to perform multilabel mood classification based on Indonesian song lyrics features. The methods used to develop the multilabel classification model are Binary Relevance (BR) and Label Powerset (LP). The research also involves four feature extraction techniques: stylistic, TF-IDF, fastText, and sarcasm. The sarcasm feature is produced by a sarcasm detection model built for this purpose.
Building the sarcasm model begins with building a sarcasm dataset from a website that provides sarcastic and non-sarcastic sentences. The model uses CountVectorizer, which converts text into a vector representation, for feature extraction, and SVC (Support Vector Classifier) as its classifier. The sarcasm model achieved an accuracy of 95%.
The mood classifier models are evaluated using Example-Based Accuracy. The best model combines fastText, Binary Relevance, and SVC, with an accuracy of 0.804. Applying the sarcasm feature from the sarcasm detection model to multilabel mood classification based on Indonesian song lyrics did not improve performance; it lowered it.
|
---|
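
The summary reports Krippendorff's alpha per annotator group but does not say how the multilabel annotations were fed into it. As a rough sketch only, one option is to binarize each mood and compute nominal alpha per mood. The function name `krippendorff_alpha_nominal`, the per-mood binarization, and the toy ratings below are all assumptions for illustration, not the thesis procedure or data.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Nominal Krippendorff's alpha.

    `units` is a list of lists; each inner list holds the values that the
    raters assigned to one item (items with fewer than 2 ratings are skipped).
    """
    coincidences = Counter()
    for values in units:
        m = len(values)
        if m < 2:
            continue  # a single rating cannot be paired with another
        # Every ordered pair of ratings from different raters counts 1/(m-1).
        for a, b in permutations(values, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more ratings")

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one value is ever used")
    return 1.0 - observed / expected


# Hypothetical ratings: did each of 3 annotators assign "sad" to each song?
sad_ratings = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 0, 0]]
alpha_sad = krippendorff_alpha_nominal(sad_ratings)
print(f"alpha for 'sad': {alpha_sad:.3f}")
```

Under this assumption, each mood's alpha can be compared against the 0.667 threshold cited in the summary. A set-valued distance over whole label sets would be a different, equally plausible design.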
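
The two problem transformation methods named in the summary differ in how they reshape multilabel targets before a standard classifier is trained. The sketch below shows only that target transformation, in plain Python. The helper names and the four toy annotations are hypothetical, and no classifier or thesis data is involved.

```python
MOODS = ("sad", "happy", "angry", "relaxed")  # the four moods named in the summary


def binary_relevance_targets(label_sets):
    """Binary Relevance: one independent 0/1 target vector per mood."""
    return {
        mood: [1 if mood in labels else 0 for labels in label_sets]
        for mood in MOODS
    }


def label_powerset_targets(label_sets):
    """Label Powerset: each distinct label combination becomes one class id."""
    class_of = {}
    targets = []
    for labels in label_sets:
        key = frozenset(labels)
        if key not in class_of:
            class_of[key] = len(class_of)  # assign ids in order of first appearance
        targets.append(class_of[key])
    return targets, class_of


# Hypothetical annotations for four songs.
songs = [{"sad"}, {"happy", "relaxed"}, {"sad", "angry"}, {"happy", "relaxed"}]

br_targets = binary_relevance_targets(songs)   # four binary problems
lp_targets, lp_classes = label_powerset_targets(songs)  # one multiclass problem

print(br_targets["sad"])
print(lp_targets)
```

Binary Relevance ignores co-occurrence between moods. Label Powerset keeps it, but can only predict combinations seen during training.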
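
Example-based accuracy for multilabel classification is commonly defined as the per-song ratio of correctly predicted labels to the union of true and predicted labels, averaged over all songs. The sketch below assumes that common definition; the summary does not spell out the formula, and the toy labels are hypothetical.

```python
def example_based_accuracy(true_sets, pred_sets):
    """Mean over examples of |Y ∩ Z| / |Y ∪ Z| (Y = true labels, Z = predicted)."""
    if len(true_sets) != len(pred_sets):
        raise ValueError("true_sets and pred_sets must have the same length")
    if not true_sets:
        raise ValueError("at least one example is required")

    total = 0.0
    for true_labels, pred_labels in zip(true_sets, pred_sets):
        y, z = set(true_labels), set(pred_labels)
        union = y | z
        # Two empty label sets agree completely; count that as a perfect score.
        total += 1.0 if not union else len(y & z) / len(union)
    return total / len(true_sets)


# Hypothetical gold and predicted mood labels for three songs.
y_true = [{"sad"}, {"happy", "relaxed"}, {"angry"}]
y_pred = [{"sad"}, {"happy"}, {"sad"}]

score = example_based_accuracy(y_true, y_pred)
print(score)
```

Unlike exact-match accuracy, this measure gives partial credit when a prediction recovers only some of a song's moods.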