A multi-label emoji classification method using balanced pointwise mutual information-based feature selection
The availability of social media such as Twitter allows users to express their feelings, emotions and opinions toward a topic. Emojis are graphic symbols that are regarded as the new generation of emoticons and an effective way of conveying feelings and emotions in social media. With the surging popu...
Main Authors: | Ahanin, Zahra; Ismail, Maizatul Akmar |
---|---|
Format: | Article |
Published: | Academic Press Ltd- Elsevier Science Ltd 2022 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://eprints.um.edu.my/33651/ |
Institution: | Universiti Malaya |
id |
my.um.eprints.33651 |
---|---|
record_format |
eprints |
spelling |
my.um.eprints.33651 2022-07-21T02:24:23Z http://eprints.um.edu.my/33651/ A multi-label emoji classification method using balanced pointwise mutual information-based feature selection Ahanin, Zahra; Ismail, Maizatul Akmar QA75 Electronic computers. Computer science
Academic Press Ltd- Elsevier Science Ltd 2022-05 Article PeerReviewed Ahanin, Zahra and Ismail, Maizatul Akmar (2022) A multi-label emoji classification method using balanced pointwise mutual information-based feature selection. Computer Speech & Language, 73. ISSN 0885-2308, DOI https://doi.org/10.1016/j.csl.2021.101330 |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Research Repository |
url_provider |
http://eprints.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science |
description |
The availability of social media such as Twitter allows users to express their feelings, emotions and opinions toward a topic. Emojis are graphic symbols that are regarded as the new generation of emoticons and an effective way of conveying feelings and emotions in social media. With the surging popularity of emojis, researchers in the area of emotion classification strive to understand the emotion correlated with each emoji. Two of the most successful approaches in emoji analysis rely on: 1) official Unicode descriptions and 2) manually built emoji lexicons. Since the use of emoji is socially determined, the former approach is not aligned with intended semantics and usage, which leads researchers to opt for emoji lexicons. To overcome the problems of the lexicon-based approach, we propose a method to classify emojis automatically. We present a modified Pointwise Mutual Information (PMI) method, called Balanced Pointwise Mutual Information-Based (B-PMI), to develop a balanced weighted emoji classification based on semantic similarity. Further, a deep neural network is used to represent emojis in vector form (emoji embedding) to extend the pre-trained word embeddings. We evaluated the proposed method on multiple Twitter datasets that are employed in sentiment and emotion classification using machine learning (ML) and deep learning (DL) approaches. In both approaches, extending word embeddings with the proposed emoji embedding improved results. The DL-based approach achieved the highest F1-score of 70.01% for sentiment classification and an accuracy of 56.36% for emotion classification. The ML-based approach obtained an accuracy of 52.17% in emotion classification. |
format |
Article |
author |
Ahanin, Zahra; Ismail, Maizatul Akmar |
author_sort |
Ahanin, Zahra |
title |
A multi-label emoji classification method using balanced pointwise mutual information-based feature selection |
title_sort |
multi-label emoji classification method using balanced pointwise mutual information-based feature selection |
publisher |
Academic Press Ltd- Elsevier Science Ltd |
publishDate |
2022 |
url |
http://eprints.um.edu.my/33651/ |
_version_ |
1739828465403691008 |
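
Note on the method named in the description: the abstract does not give the B-PMI formula, so the following is only a minimal sketch of standard Pointwise Mutual Information, PMI(e, l) = log2(P(e, l) / (P(e) P(l))), computed between emojis and emotion labels from tweet-level co-occurrence counts. The function name `pmi_scores`, the emoji shortcodes, and the toy data are hypothetical, and the balancing step that distinguishes B-PMI is not reproduced here.

```python
import math
from collections import Counter


def pmi_scores(samples):
    """Return {(emoji, label): PMI} for every emoji/label pair that co-occurs.

    samples: list of (emojis, labels) pairs, one per tweet (hypothetical format).
    Probabilities are estimated as the fraction of tweets containing the item(s).
    This is plain PMI, not the balanced variant (B-PMI) from the article.
    """
    n = len(samples)
    emoji_count, label_count, pair_count = Counter(), Counter(), Counter()
    for emojis, labels in samples:
        emojis, labels = set(emojis), set(labels)  # count each item once per tweet
        emoji_count.update(emojis)
        label_count.update(labels)
        pair_count.update((e, l) for e in emojis for l in labels)
    return {
        (e, l): math.log2((c / n) / ((emoji_count[e] / n) * (label_count[l] / n)))
        for (e, l), c in pair_count.items()
    }


# Toy, made-up data: emoji shortcodes paired with emotion labels.
samples = [
    ([":tears_of_joy:"], ["joy"]),
    ([":tears_of_joy:"], ["joy"]),
    ([":crying:"], ["sadness"]),
    ([":crying:", ":tears_of_joy:"], ["sadness"]),
]
scores = pmi_scores(samples)
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

A positive score means the emoji appears with that label more often than chance would predict, and a negative score means less often. Because an emoji can score positively for several labels, thresholding the scores is one simple way to obtain a multi-label assignment; whether the article does this is not stated in the record.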