Optimal feature selection for learning-based algorithms for sentiment classification
Sentiment classification is an important branch of cognitive computation, so further study of the properties of sentiment analysis is important. Sentiment classification of text data has been an active topic for the last two decades, and learning-based methods are popular and widely used in v...
Main Authors: | Wang, Zhaoxia; Lin, Zhiping |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Article |
Language: | English |
Published: | 2021 |
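Subjects: | Engineering::Computer science and engineering::Computing methodologies; Machine Learning; Feature Selection |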
Online Access: | https://hdl.handle.net/10356/149878 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-149878 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-149878; 2021-05-25T01:11:55Z; Optimal feature selection for learning-based algorithms for sentiment classification; Wang, Zhaoxia; Lin, Zhiping; School of Electrical and Electronic Engineering; Engineering::Computer science and engineering::Computing methodologies; Machine Learning; Feature Selection; Accepted version; 2020; Journal Article; Wang, Z. & Lin, Z. (2020). Optimal feature selection for learning-based algorithms for sentiment classification. Cognitive Computation, 12, 238-248. https://dx.doi.org/10.1007/s12559-019-09669-5; 1866-9964; https://hdl.handle.net/10356/149878; 10.1007/s12559-019-09669-5; en; Cognitive Computation; © 2020 Springer Science+Business Media. This is a post-peer-review, pre-copyedit version of an article published in Cognitive Computation. The final authenticated version is available online at: http://dx.doi.org/10.1007/s12559-019-09669-5; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies; Machine Learning; Feature Selection |
description |
Sentiment classification is an important branch of cognitive computation, so further study of the properties of sentiment analysis is important. Sentiment classification of text data has been an active topic for the last two decades, and learning-based methods are popular and widely used in various applications. Many technical strategies have been used to improve the performance of learning-based methods. Feature selection is one of these strategies and has been studied by many researchers. However, choosing a suitable number of features to obtain the best sentiment classification performance remains a difficult unsolved problem. We therefore investigate the relationship between the number of features selected and the sentiment classification performance of learning-based methods. We propose a new method for selecting a suitable number of features, in which the Chi Square feature selection algorithm is employed and features are selected using a preset score threshold. We find a relationship between the logarithm of the number of features selected and the sentiment classification performance of the learning-based method, and this relationship is independent of the learning-based method involved. These findings indicate that it is always possible for researchers to select the appropriate number of features for learning-based methods to obtain the best sentiment classification performance. This can guide researchers in selecting suitable features to optimize the performance of learning-based algorithms. (A preliminary version of this paper received a Best Paper Award at the International Conference on Extreme Learning Machines 2018.) |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering; Wang, Zhaoxia; Lin, Zhiping |
format |
Article |
author |
Wang, Zhaoxia; Lin, Zhiping |
author_sort |
Wang, Zhaoxia |
title |
Optimal feature selection for learning-based algorithms for sentiment classification |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/149878 |
_version_ |
1701270574670544896 |
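
Illustrative note on the abstract: the record describes selecting features with the Chi Square algorithm and a preset score threshold, then relating the logarithm of the number of selected features to classification performance. The sketch below is not the authors' implementation. It assumes binary sentiment labels, whitespace tokenization, term presence (not term frequency), and the textbook 2x2 chi-square statistic; the function names `chi_square_scores` and `select_by_threshold`, the toy corpus, and the threshold values are all hypothetical.

```python
import math
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each term with the 2x2 chi-square statistic (term presence vs. class).

    Assumption: labels are 1 (positive) or 0 (negative), tokens are whitespace-split.
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    pos_df, neg_df = Counter(), Counter()  # document frequency per class
    for text, label in zip(docs, labels):
        terms = set(text.lower().split())
        (pos_df if label == 1 else neg_df).update(terms)

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # positive docs containing the term
        b = neg_df[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_by_threshold(scores, threshold):
    """Keep every term whose chi-square score reaches the preset threshold."""
    return sorted(term for term, score in scores.items() if score >= threshold)


# Toy corpus, purely for illustration.
docs = [
    "good great film",
    "great acting good plot",
    "bad boring film",
    "boring plot bad acting",
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
for threshold in (0.0, 2.0):
    selected = select_by_threshold(scores, threshold)
    if selected:
        # The abstract relates performance to the logarithm of this count.
        print(threshold, len(selected), round(math.log10(len(selected)), 3))
```

In a real experiment each threshold would be paired with a classifier's accuracy on held-out data, so that accuracy can be plotted against the logarithm of the number of selected features; this sketch stops at the selection step because the record gives no further detail.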