EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation

We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network (CNN) model, for classifying conversational text into 4 types of emotion. The model is part of a chatbot for depression evaluation. The difficulty in classifying emotion from conversationa...

Bibliographic Details
Main Authors: Konlakorn Wongpatikaseree, Yongyos Kaewpitakkun, Sumeth Yuenyong, Siriwon Matsuo, Panida Yomaboot
Other Authors: Siriraj Hospital
Format: Article
Published: 2022
Subjects: Engineering
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/76979
Institution: Mahidol University
id th-mahidol.76979
record_format dspace
spelling th-mahidol.76979 2022-08-04T15:38:20Z
Affiliations: Siriraj Hospital; Mahidol University; Sirindhorn International Institute of Technology, Thammasat University; PORDEEKUM.AI
Subject: Engineering
Dates: 2022-08-04T08:38:20Z; 2022-08-04T08:38:20Z; 2021-01-01
Type: Article
Source: Engineering Journal. Vol.25, No.7 (2021), 73-82
DOI: 10.4186/ej.2021.25.7.73
ISSN: 01258281
Scopus EID: 2-s2.0-85112480821
URI: https://repository.li.mahidol.ac.th/handle/123456789/76979
Provenance: Mahidol University; SCOPUS https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112480821&origin=inward
institution Mahidol University
building Mahidol University Library
continent Asia
country Thailand
content_provider Mahidol University Library
collection Mahidol University Institutional Repository
topic Engineering
description We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network (CNN) model, for classifying conversational text into 4 types of emotion. The model is part of a chatbot for depression evaluation. Classifying emotion from conversational text is difficult because most word embeddings are trained on emotionally neutral corpora such as Wikipedia or news articles, where emotional words appear rarely or not at all and the language style is formal. We trained a new word embedding based on the word2vec architecture in an unsupervised manner and then fine-tuned it on soft-labelled data. The data was obtained by mining Twitter using emotion keywords. We show that this emotion word embedding can differentiate between words that have the same polarity and words that have opposite polarity, and can find similar words with the same polarity, while the standard word embedding cannot. We then used the new embedding as the first layer of EmoCNN, which classifies conversational text into the 4 emotions. EmoCNN achieved a macro-averaged F1-score of 0.76 on the test set. We compared EmoCNN against three other models: a shallow fully-connected neural network, fine-tuned RoBERTa, and ULMFiT. Their best macro-averaged F1-scores were 0.5556, 0.6402, and 0.7386, respectively.
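The description reports results as macro-averaged F1-scores over 4 emotion classes. As a reading aid, the sketch below shows how such a score is commonly computed: F1 is calculated per class and the per-class values are averaged without weighting by class size. This is an illustration only, not the authors' evaluation code; the record does not name the 4 emotions, so the labels `e0` to `e3`, the function name `macro_f1`, and the toy predictions are all hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Return the unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gains a false positive
            fn[truth] += 1          # true class gains a false negative
    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(labels)


# Hypothetical placeholder labels and toy data (not from the paper).
LABELS = ["e0", "e1", "e2", "e3"]
y_true = ["e0", "e0", "e1", "e1", "e2", "e2", "e3", "e3"]
y_pred = ["e0", "e1", "e1", "e1", "e2", "e3", "e3", "e3"]

score = macro_f1(y_true, y_pred, LABELS)
print(round(score, 4))
```

Because every class contributes equally to the average, a model that performs poorly on a rare emotion is penalised as heavily as one that performs poorly on a common emotion, which is one reason this metric is often preferred for imbalanced emotion data.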
author2 Siriraj Hospital
format Article
author Konlakorn Wongpatikaseree
Yongyos Kaewpitakkun
Sumeth Yuenyong
Siriwon Matsuo
Panida Yomaboot
author_sort Konlakorn Wongpatikaseree
title EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation
publishDate 2022
url https://repository.li.mahidol.ac.th/handle/123456789/76979
_version_ 1763488201375219712