EMOTION CLASSIFICATION OF USER FACE IMAGE IN MUSIC RECOMMENDATION SYSTEM

Bibliographic Details
Main Author: Surya Angkasa, Hengky
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/65784
Institution: Institut Teknologi Bandung
Description
Summary: People may listen to different music when they are sad or happy. Listening to favorite music stimulates the brain to release dopamine into the corpus striatum, which regulates feelings such as addiction, satisfaction, and motivation. Human emotion is therefore an opportunity to improve a music recommendation system. Gilda et al. (2017) and Krupa et al. (2020) used a Convolutional Neural Network (CNN) to classify emotion from face images with the FER2013 dataset. Krupa's CNN does not yet perform well, and its performance is uneven across emotions. Gilda's CNN performs well, but it uses many convolution layers (9) with a large number of filters (256). In a recommender system, an incomplete cold start can occur when a user has few ratings, so the system needs extra information to give better recommendations. Thongsuwan et al. (2021) designed ConvXGB, in which two convolution layers extract features from the input and XGBoost performs the learning task. The model outperforms CNN on the DrivFace dataset, so ConvXGB was implemented in this research for emotion classification. The classified emotion is mapped to the mood attribute in the music dataset (Emotify). The mood serves as additional information about the user for addressing the incomplete cold start. The system recommends music that a user likes in a particular mood, sorted in descending order. Recall is used to evaluate the recommendations. On the oversampled dataset, ConvXGB with 128 filters and one max-pooling layer classifies emotion better than the two CNNs. ConvXGB achieved 78.64% accuracy on 4 emotions and 80.99% on 7 emotions. The incomplete cold start evaluation used 10% to 90% of each user's ratings in a mood as training data. The average recall results indicate that the incomplete cold start can be addressed with mood information using 10% of each user's ratings. The system tends to perform better as the amount of user rating data increases.
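
The mood-based recommendation and recall evaluation described in the summary can be illustrated with a minimal Python sketch. It is not the thesis implementation. The `(user, mood, track, rating)` row layout, the `like_threshold` value, the function names, and the fallback that fills remaining slots with tracks ranked by the mean rating of all users in the same mood are all assumptions made for illustration; the summary only states that liked music in a mood is recommended in descending order and evaluated with recall.

```python
from collections import defaultdict


def recommend(train, user, mood, k=5, like_threshold=4):
    """Rank tracks for `user` in `mood` from (user, mood, track, rating) rows.

    Hypothetical helper: the row layout and like_threshold are assumptions.
    """
    own = {}                    # this user's ratings in the given mood
    totals = defaultdict(list)  # every user's ratings in the given mood
    for u, m, track, rating in train:
        if m != mood:
            continue
        totals[track].append(rating)
        if u == user:
            own[track] = rating
    # Tracks the user liked in this mood, highest rating first.
    liked = sorted((t for t, r in own.items() if r >= like_threshold),
                   key=lambda t: (-own[t], t))
    # Assumed fallback: tracks the user has not rated in this mood,
    # ordered by mean rating across all users, highest first.
    rest = sorted((t for t in totals if t not in own),
                  key=lambda t: (-sum(totals[t]) / len(totals[t]), t))
    return (liked + rest)[:k]


def recall(recommended, relevant):
    """Fraction of the relevant tracks that appear in the recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(recommended)) / len(relevant)


# Toy data, invented purely for this example.
train = [
    ("u1", "happy", "a", 5),
    ("u1", "happy", "b", 2),
    ("u2", "happy", "c", 5),
    ("u2", "happy", "d", 4),
    ("u3", "happy", "c", 4),
    ("u2", "sad", "e", 5),
]
top = recommend(train, "u1", "happy", k=3)
held_out = ["c", "f"]  # tracks u1 liked in "happy" but withheld from train
print(top, recall(top, held_out))
```

Repeating the evaluation while varying the share of each user's ratings kept in `train` (for example from 10% to 90%) and averaging the recall over users would mirror the evaluation protocol outlined in the summary.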