Multi-level classification of long text based on convolutional neural network
With the rise of the mobile Internet, new media has developed at an unprecedented pace, and the volume of news published online has grown accordingly. Quickly extracting the required classification information from massive text data so that decision-makers can analyze it has become the premise and an importan...
Main Author: | Xiao, Siwei |
---|---|
Other Authors: | Mao Kezhi |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University 2021 |
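Subjects: | Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering |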
Online Access: | https://hdl.handle.net/10356/152902 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-152902 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-152902; 2023-07-04T17:40:35Z; Multi-level classification of long text based on convolutional neural network; Xiao, Siwei; Mao Kezhi; School of Electrical and Electronic Engineering; EKZMao@ntu.edu.sg; Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering; Master of Science (Computer Control and Automation); 2021-10-15T06:13:49Z; 2021-10-15T06:13:49Z; 2021; Thesis-Master by Coursework; Xiao, S. (2021). Multi-level classification of long text based on convolutional neural network. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/152902; https://hdl.handle.net/10356/152902; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering |
spellingShingle |
Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering Xiao, Siwei Multi-level classification of long text based on convolutional neural network |
description |
With the rise of the mobile Internet, new media has developed at an unprecedented pace, and the volume of news published online has grown accordingly. Quickly extracting the required classification information from massive text data so that decision-makers can analyze it has become a prerequisite for, and an important part of, any further use of news text. For example, a maritime safety agency may need to make predictions from recent topics related to piracy, such as unemployment, oil prices, and weather conditions, to determine whether security measures need to be strengthened. Against this background, this dissertation proposes a hierarchical combination classifier system for mixed long and short text that accurately classifies long texts covering various topics and backgrounds. Here, a long text is defined as an article that exceeds the limit of 512 words. Automated text classification is considered a vital method for managing and processing the vast and continuously growing quantity of documents in digital form. In general, text classification plays an important role in information extraction and summarization, text retrieval, and question answering. This dissertation illustrates the text classification process using deep learning techniques. First, crawler technology is used in the text acquisition stage to collect articles on related topics in batches. Second, a CNN is used as the multi-level classifier, and an RNN is used as a first-level classifier for comparison with the CNN. Third, based on the characteristics of the text structure, text enhancement techniques and data processing optimizations are used to improve experimental accuracy. The proposed method achieves better results for multi-topic, multi-level text classification and provides a reference method for multiple and multi-class classification of original web texts. The references cited cover the major theoretical issues and guide the researcher to interesting research directions. |
author2 |
Mao Kezhi |
author_facet |
Mao Kezhi Xiao, Siwei |
format |
Thesis-Master by Coursework |
author |
Xiao, Siwei |
author_sort |
Xiao, Siwei |
title |
Multi-level classification of long text based on convolutional neural network |
title_short |
Multi-level classification of long text based on convolutional neural network |
title_full |
Multi-level classification of long text based on convolutional neural network |
title_fullStr |
Multi-level classification of long text based on convolutional neural network |
title_full_unstemmed |
Multi-level classification of long text based on convolutional neural network |
title_sort |
multi-level classification of long text based on convolutional neural network |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/152902 |
_version_ |
1772827402933633024 |