Online news analytics based on AI techniques

Text classification is an important technique in the field of Natural Language Processing (NLP). Using this technology, we can efficiently extract the types of text materials that we are interested in from massive texts, which can greatly improve the efficiency of our work and facilitate our live...

Bibliographic Details
Main Author: Wei, Zhifeng
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: https://hdl.handle.net/10356/156153
Institution: Nanyang Technological University
id sg-ntu-dr.10356-156153
record_format dspace
spelling sg-ntu-dr.10356-156153 2023-07-04T17:49:40Z Online news analytics based on AI techniques Wei, Zhifeng Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Engineering::Electrical and electronic engineering::Computer hardware, software and systems Master of Science (Signal Processing) 2022-04-05T05:41:43Z 2021 Thesis-Master by Coursework Wei, Z. (2021). Online news analytics based on AI techniques. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156153 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering::Computer hardware, software and systems
spellingShingle Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Wei, Zhifeng
Online news analytics based on AI techniques
description Text classification is an important technique in Natural Language Processing (NLP). It makes it possible to efficiently extract the kinds of text we are interested in from massive text collections, which can greatly improve the efficiency of our work and facilitate our lives. This dissertation focuses on the task of classifying news about natural disasters. First, a news data set of nearly 2000 articles is collected. Then, different text representation methods, such as bag of words (BOW), term frequency–inverse document frequency (TF-IDF) and Latent Dirichlet Allocation (LDA), are tested with different classifiers and their classification performance is compared. After that, deep learning neural networks such as CNN, LSTM and Transformer are used to perform classification on the collected data set, and the classification performance of these models is compared. At the same time, the performance of randomly initialized word embeddings, Word2vec, GloVe and BERT pre-trained models on this data set is analyzed and compared. The experiments use Python 3 and the PyTorch deep learning framework. Accuracy, precision, recall and F1 score are used as evaluation criteria. The results show that the Transformer model and the BERT pre-trained model perform slightly better than the other models on the data set collected in this dissertation.
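The bag-of-words and TF-IDF representations and the four evaluation criteria named in the description can be illustrated with a minimal Python sketch. It is not taken from the thesis. The function names (`bag_of_words`, `tf_idf`, `evaluate`), the whitespace tokenizer, the toy sentences and the specific TF-IDF variant (term count divided by document length, times log(N / df)) are all assumptions made for illustration. The thesis may have used a different weighting scheme or a library implementation.

```python
import math
from collections import Counter


def bag_of_words(doc):
    """Return token counts for one document (lowercased, whitespace-split)."""
    return Counter(doc.lower().split())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    bows = [bag_of_words(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for bow in bows for term in bow)
    vectors = []
    for bow in bows:
        total = sum(bow.values())
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bow.items()}
        )
    return vectors


def evaluate(y_true, y_pred, positive):
    """Accuracy, precision, recall and F1, treating one label as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical toy corpus; the thesis data set is not reproduced here.
docs = [
    "flood hits coastal town",
    "earthquake hits city",
    "team wins football match",
]
vectors = tf_idf(docs)
print(vectors[0])
print(evaluate(["disaster", "disaster", "other", "other"],
               ["disaster", "other", "other", "disaster"],
               positive="disaster"))
```

In this sketch a term that occurs in several documents (here "hits") receives a lower weight than a term unique to one document (here "flood"), which is the property that distinguishes TF-IDF from raw bag-of-words counts.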
author2 Mao Kezhi
author_facet Mao Kezhi
Wei, Zhifeng
format Thesis-Master by Coursework
author Wei, Zhifeng
author_sort Wei, Zhifeng
title Online news analytics based on AI techniques
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/156153
_version_ 1772825650933006336