EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation
We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network model for classifying conversational texts into 4 types of emotion. This model is part of a chatbot for depression evaluation. The difficulty in classifying emotion from conversationa...
Main Authors: | Konlakorn Wongpatikaseree, Yongyos Kaewpitakkun, Sumeth Yuenyong, Siriwon Matsuo, Panida Yomaboot |
---|---|
Other Authors: | Siriraj Hospital |
Format: | Article |
Published: | 2022 |
Subjects: | Engineering |
Online Access: | https://repository.li.mahidol.ac.th/handle/123456789/76979 |
Institution: | Mahidol University |
id |
th-mahidol.76979 |
---|---|
record_format |
dspace |
spelling |
th-mahidol.76979 2022-08-04T15:38:20Z EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation Konlakorn Wongpatikaseree Yongyos Kaewpitakkun Sumeth Yuenyong Siriwon Matsuo Panida Yomaboot Siriraj Hospital Mahidol University Sirindhorn International Institute of Technology, Thammasat University PORDEEKUM.AI Engineering We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network model for classifying conversational texts into 4 types of emotion. This model is part of a chatbot for depression evaluation. The difficulty in classifying emotion from conversational text is that most word embeddings are trained on emotionally neutral corpora such as Wikipedia or news articles, where emotional words appear rarely or not at all and the language style is formal writing. We trained a new word embedding based on the word2vec architecture in an unsupervised manner and then fine-tuned it on soft-labelled data. The data was obtained by mining Twitter using emotion keywords. We show that this emotion word embedding can differentiate between words that have the same polarity and words that have opposite polarity, and can find similar words with the same polarity, while the standard word embedding cannot. We then used this new embedding as the first layer of EmoCNN, which classifies conversational text into the 4 emotions. EmoCNN achieved a macro-averaged F1-score of 0.76 on the test set. We compared EmoCNN against three other models: a shallow fully-connected neural network, fine-tuned RoBERTa, and ULMFiT. Their best macro-averaged F1-scores were 0.5556, 0.6402, and 0.7386, respectively. 2022-08-04T08:38:20Z 2022-08-04T08:38:20Z 2021-01-01 Article Engineering Journal. Vol.25, No.7 (2021), 73-82 10.4186/ej.2021.25.7.73 01258281 2-s2.0-85112480821 https://repository.li.mahidol.ac.th/handle/123456789/76979 Mahidol University SCOPUS https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85112480821&origin=inward |
institution |
Mahidol University |
building |
Mahidol University Library |
continent |
Asia |
country |
Thailand |
content_provider |
Mahidol University Library |
collection |
Mahidol University Institutional Repository |
topic |
Engineering |
spellingShingle |
Engineering Konlakorn Wongpatikaseree Yongyos Kaewpitakkun Sumeth Yuenyong Siriwon Matsuo Panida Yomaboot EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation |
description |
We present EmoCNN, a specially trained word embedding layer combined with a convolutional neural network model for classifying conversational texts into 4 types of emotion. This model is part of a chatbot for depression evaluation. The difficulty in classifying emotion from conversational text is that most word embeddings are trained on emotionally neutral corpora such as Wikipedia or news articles, where emotional words appear rarely or not at all and the language style is formal writing. We trained a new word embedding based on the word2vec architecture in an unsupervised manner and then fine-tuned it on soft-labelled data. The data was obtained by mining Twitter using emotion keywords. We show that this emotion word embedding can differentiate between words that have the same polarity and words that have opposite polarity, and can find similar words with the same polarity, while the standard word embedding cannot. We then used this new embedding as the first layer of EmoCNN, which classifies conversational text into the 4 emotions. EmoCNN achieved a macro-averaged F1-score of 0.76 on the test set. We compared EmoCNN against three other models: a shallow fully-connected neural network, fine-tuned RoBERTa, and ULMFiT. Their best macro-averaged F1-scores were 0.5556, 0.6402, and 0.7386, respectively. |
author2 |
Siriraj Hospital |
author_facet |
Siriraj Hospital Konlakorn Wongpatikaseree Yongyos Kaewpitakkun Sumeth Yuenyong Siriwon Matsuo Panida Yomaboot |
format |
Article |
author |
Konlakorn Wongpatikaseree Yongyos Kaewpitakkun Sumeth Yuenyong Siriwon Matsuo Panida Yomaboot |
author_sort |
Konlakorn Wongpatikaseree |
title |
EmoCNN: Encoding emotional expression from text to word vector and classifying emotions—a case study in Thai social network conversation |
title_sort |
emocnn: encoding emotional expression from text to word vector and classifying emotions—a case study in thai social network conversation |
publishDate |
2022 |
url |
https://repository.li.mahidol.ac.th/handle/123456789/76979 |
_version_ |
1763488201375219712 |
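
The abstract reports results as macro-averaged F1-scores. As a minimal sketch of how that metric is computed (not the authors' evaluation code), the function below takes the per-class F1 and averages it with equal weight per class. The function name `macro_f1` and the label strings in the usage note are hypothetical placeholders; the record does not name the four emotion classes.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, `macro_f1(["joy", "joy", "sad", "sad"], ["joy", "sad", "sad", "sad"])` averages a per-class F1 of 2/3 for the placeholder label `joy` and 4/5 for `sad`. Because every class counts equally, a model that does poorly on a rare class is penalised more than it would be under plain accuracy.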