A multi-label emoji classification method using balanced pointwise mutual information-based feature selection

The availability of social media such as Twitter allows users to express their feelings, emotions and opinions toward a topic. Emojis are graphic symbols that are regarded as the new generation of emoticons and an effective way of conveying feelings and emotions in social media. With the surging popu...

Bibliographic Details
Main Authors: Ahanin, Zahra; Ismail, Maizatul Akmar
Format: Article
Published: Academic Press Ltd- Elsevier Science Ltd 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/33651/
Institution: Universiti Malaya
id my.um.eprints.33651
record_format eprints
spelling my.um.eprints.33651 2022-07-21T02:24:23Z http://eprints.um.edu.my/33651/ QA75 Electronic computers. Computer science
Academic Press Ltd- Elsevier Science Ltd 2022-05 Article PeerReviewed Ahanin, Zahra and Ismail, Maizatul Akmar (2022) A multi-label emoji classification method using balanced pointwise mutual information-based feature selection. Computer Speech & Language, 73. ISSN 0885-2308, DOI https://doi.org/10.1016/j.csl.2021.101330
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science
description The availability of social media such as Twitter allows users to express their feelings, emotions and opinions toward a topic. Emojis are graphic symbols that are regarded as the new generation of emoticons and an effective way of conveying feelings and emotions in social media. With the surging popularity of emojis, researchers in the area of emotion classification strive to understand the emotion correlated with each emoji. Two of the most successful approaches in emoji analysis rely on 1) official Unicode descriptions and 2) manually built emoji lexicons. Since the use of emoji is socially determined, the former approach is not aligned with intended semantics and usage, which leads researchers to opt for emoji lexicons. To overcome the problems of the lexicon-based approach, we propose a method to classify emojis automatically. We present a modified Pointwise Mutual Information (PMI) method, called Balanced Pointwise Mutual Information-Based (B-PMI), to develop a balanced weighted emoji classification based on semantic similarity. Further, a deep neural network is used to represent emojis in vector form (emoji embedding) to extend the pre-trained word embeddings. We carefully evaluated the proposed method on multiple Twitter datasets that are employed in sentiment and emotion classification, using machine learning (ML) and deep learning (DL) approaches. In both approaches, extending the word embedding with the proposed emoji embedding improved results. The DL-based approach achieved the highest F1-score of 70.01% for sentiment classification and an accuracy of 56.36% for emotion classification. The ML-based approach obtained an accuracy of 52.17% in emotion classification.
format Article
author Ahanin, Zahra
Ismail, Maizatul Akmar
title A multi-label emoji classification method using balanced pointwise mutual information-based feature selection
publisher Academic Press Ltd- Elsevier Science Ltd
publishDate 2022
url http://eprints.um.edu.my/33651/
_version_ 1739828465403691008
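
The description builds on Pointwise Mutual Information (PMI) as the association measure between emojis and emotion labels. As a rough illustration only, the sketch below computes standard PMI, log2(p(e, l) / (p(e) p(l))), from co-occurrence counts. It is not the authors' B-PMI: the record does not specify the balancing term, so none is implemented here. The function name `pmi_table`, the emoji shortcodes, and the toy data are hypothetical.

```python
import math
from collections import Counter


def pmi_table(samples):
    """Return {(emoji, label): PMI in bits} for observed (emoji, label) pairs.

    Standard PMI only: log2( p(e, l) / (p(e) * p(l)) ).
    The balancing step of B-PMI is NOT implemented (not given in the record).
    """
    samples = list(samples)
    n = len(samples)
    pair_counts = Counter(samples)
    emoji_counts = Counter(emoji for emoji, _ in samples)
    label_counts = Counter(label for _, label in samples)
    # (c/n) / ((c_e/n) * (c_l/n)) simplifies to c * n / (c_e * c_l)
    return {
        (emoji, label): math.log2(
            count * n / (emoji_counts[emoji] * label_counts[label])
        )
        for (emoji, label), count in pair_counts.items()
    }


# Hypothetical toy corpus: one (emoji, emotion label) pair per tweet.
toy = [
    (":joy:", "joy"),
    (":joy:", "joy"),
    (":sob:", "sadness"),
    (":sob:", "joy"),
]
table = pmi_table(toy)
for pair, score in sorted(table.items()):
    print(pair, round(score, 3))
```

A positive score means the emoji and the label co-occur more often than independence would predict, and a negative score means less often. Pairs that never co-occur are simply absent from the table.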