Multi-level classification of long text based on convolutional neural network

With the rise of the mobile Internet, new media has seen unprecedented development, and the volume of news published online has grown substantially. How to quickly extract the required classification information from mass text data for decision-makers to analyze has become the premise and an importan...

Bibliographic Details
Main Author: Xiao, Siwei
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2021
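Subjects: Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering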
Online Access: https://hdl.handle.net/10356/152902
Institution: Nanyang Technological University
id sg-ntu-dr.10356-152902
record_format dspace
spelling sg-ntu-dr.10356-152902 2023-07-04T17:40:35Z
Multi-level classification of long text based on convolutional neural network
Xiao, Siwei
Mao Kezhi
School of Electrical and Electronic Engineering
EKZMao@ntu.edu.sg
Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering
Master of Science (Computer Control and Automation)
2021-10-15T06:13:49Z
2021-10-15T06:13:49Z
2021
Thesis-Master by Coursework
Xiao, S. (2021). Multi-level classification of long text based on convolutional neural network. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/152902
https://hdl.handle.net/10356/152902
en
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering
spellingShingle Engineering::Electrical and electronic engineering::Control and instrumentation::Control engineering
Xiao, Siwei
Multi-level classification of long text based on convolutional neural network
description With the rise of the mobile Internet, new media has seen unprecedented development, and the volume of news published online has grown substantially. Quickly extracting the required classification information from massive text data for decision-makers to analyze has become a prerequisite for, and an important part of, the further use of news text information. For example, a maritime safety agency may need to make predictions based on recent piracy-related topics, such as unemployment, oil prices, and weather conditions, to determine whether security measures need to be strengthened. Against this background, this dissertation proposes a mixed long- and short-text hierarchical combination classifier system to accurately classify long texts covering various topics and backgrounds. Here, long text is defined as articles that exceed the limit of 512 words. Automated text classification is considered a vital method for managing and processing the vast, widespread, and continuously growing quantity of documents in digital form. In general, text classification plays an important role in information extraction and summarization, text retrieval, and question answering. This dissertation illustrates the text classification process using deep learning techniques. First, crawler technology is used in the text acquisition stage to acquire articles on related topics in batches. Second, a CNN is used as a multi-level classifier, and an RNN is used as a first-level classifier for comparison with the CNN. Third, according to the characteristics of the text structure, text enhancement technology and data processing optimization are used to improve the accuracy of the experiments. The method proposed in this dissertation achieves better results for multi-topic and multi-level text classification, and provides a reference method for multi-level, multi-class classification of original web texts. The references cited cover the major theoretical issues and guide the researcher to interesting research directions.
author2 Mao Kezhi
author_facet Mao Kezhi
Xiao, Siwei
format Thesis-Master by Coursework
author Xiao, Siwei
author_sort Xiao, Siwei
title Multi-level classification of long text based on convolutional neural network
title_short Multi-level classification of long text based on convolutional neural network
title_full Multi-level classification of long text based on convolutional neural network
title_fullStr Multi-level classification of long text based on convolutional neural network
title_full_unstemmed Multi-level classification of long text based on convolutional neural network
title_sort multi-level classification of long text based on convolutional neural network
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/152902
_version_ 1772827402933633024
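
The description field above defines long text as articles exceeding 512 words and describes a multi-level classifier. The record does not say how long articles are fed to the classifier or how the levels are combined, so the following is only a minimal sketch under stated assumptions: the article is split into chunks of at most 512 words, each chunk is classified, and a majority vote selects the first-level label, which then routes the same chunks to a second-level classifier. The names `split_into_chunks`, `majority_label`, `classify_hierarchically`, and `keyword_classifier` are hypothetical, and the keyword matcher is merely a stand-in for a trained CNN, not the thesis's model.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

MAX_WORDS = 512  # the description's threshold for "long text"

Classifier = Callable[[str], str]  # maps one text chunk to one label


def split_into_chunks(text: str, max_words: int = MAX_WORDS) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]


def majority_label(chunks: List[str], classify: Classifier) -> str:
    """Classify every chunk and return the most common label.

    Assumption: majority voting is one simple way to combine chunk-level
    predictions; ties resolve to the label seen first.
    """
    votes = Counter(classify(chunk) for chunk in chunks)
    return votes.most_common(1)[0][0]


def classify_hierarchically(
    text: str,
    level1: Classifier,
    level2: Dict[str, Classifier],
) -> Tuple[str, str]:
    """Pick a first-level label, then let it select the second-level classifier."""
    chunks = split_into_chunks(text)
    if not chunks:
        raise ValueError("text contains no words")
    top = majority_label(chunks, level1)
    sub = majority_label(chunks, level2[top])
    return top, sub


def keyword_classifier(keywords: Dict[str, List[str]], default: str) -> Classifier:
    """Hypothetical stand-in for a trained CNN: label with the most keyword hits."""
    def classify(chunk: str) -> str:
        tokens = chunk.lower().split()
        scores = {label: sum(tokens.count(k) for k in kws)
                  for label, kws in keywords.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] > 0 else default
    return classify
```

In use, `level1` would be the first-level model and `level2` a dictionary mapping each first-level label to its own second-level model. Swapping the keyword stand-ins for trained networks leaves the routing logic unchanged, which is the only point this sketch is meant to illustrate.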