Online news analytics based on AI techniques
Text classification is an important technique in the field of Natural Language Processing (NLP). Using this technology, we can efficiently extract the types of text materials that we are interested in from massive texts, which can greatly improve the efficiency of our work and facilitate our live...
Main Author: | Wei, Zhifeng |
---|---|
Other Authors: | Mao Kezhi |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | Engineering::Electrical and electronic engineering::Computer hardware, software and systems |
Online Access: | https://hdl.handle.net/10356/156153 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-156153 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-156153 2023-07-04T17:49:40Z Online news analytics based on AI techniques Wei, Zhifeng Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Engineering::Electrical and electronic engineering::Computer hardware, software and systems Text classification is an important technique in Natural Language Processing (NLP). With it, we can efficiently extract the types of text we are interested in from massive text collections, which can greatly improve the efficiency of our work and facilitate our lives. This dissertation focuses on classifying news about natural disasters. First, a news data set of nearly 2000 articles is collected. Then, different text representation methods such as Bag of Words (BOW), term frequency–inverse document frequency (TF-IDF), and Latent Dirichlet Allocation (LDA) are tested with different classifiers, and their classification performance is compared. After that, deep learning models such as CNN, LSTM, and Transformer are applied to the same data set, and their classification performance is compared. In parallel, randomly initialized word embeddings and the pre-trained Word2vec, GloVe, and BERT models are analyzed and compared on this data set. The experiments use Python 3 and the PyTorch deep learning framework. Accuracy, precision, recall, and F1 score are used as evaluation criteria. The results show that the Transformer model and the pre-trained BERT model perform slightly better than the other models on the data set collected in this dissertation. Master of Science (Signal Processing) 2022-04-05T05:41:43Z 2022-04-05T05:41:43Z 2021 Thesis-Master by Coursework Wei, Z. (2021). Online news analytics based on AI techniques. Master's thesis, Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/156153 https://hdl.handle.net/10356/156153 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Electrical and electronic engineering::Computer hardware, software and systems |
spellingShingle |
Engineering::Electrical and electronic engineering::Computer hardware, software and systems Wei, Zhifeng Online news analytics based on AI techniques |
description |
Text classification is an important technique in Natural Language Processing
(NLP). With it, we can efficiently extract the types of text we are interested
in from massive text collections, which can greatly improve the efficiency of
our work and facilitate our lives.
This dissertation focuses on classifying news about natural disasters. First,
a news data set of nearly 2000 articles is collected. Then, different text
representation methods such as Bag of Words (BOW), term frequency–inverse
document frequency (TF-IDF), and Latent Dirichlet Allocation (LDA) are tested
with different classifiers, and their classification performance is compared.
After that, deep learning models such as CNN, LSTM, and Transformer are applied
to the same data set, and their classification performance is compared. In
parallel, randomly initialized word embeddings and the pre-trained Word2vec,
GloVe, and BERT models are analyzed and compared on this data set.
The experiments use Python 3 and the PyTorch deep learning framework. Accuracy,
precision, recall, and F1 score are used as evaluation criteria. The results
show that the Transformer model and the pre-trained BERT model perform slightly
better than the other models on the data set collected in this dissertation. |
author2 |
Mao Kezhi |
author_facet |
Mao Kezhi; Wei, Zhifeng |
format |
Thesis-Master by Coursework |
author |
Wei, Zhifeng |
author_sort |
Wei, Zhifeng |
title |
Online news analytics based on AI techniques |
publisher |
Nanyang Technological University |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/156153 |
_version_ |
1772825650933006336 |
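
The description above names TF-IDF as one of the text representations and accuracy, precision, recall, and F1 score as the evaluation criteria. The record does not include the thesis code, so the following is only a minimal sketch of those two ideas in plain Python. The function names, the whitespace tokenizer, the `tf * (ln(N / df) + 1)` weighting (one common TF-IDF variant), and the example headlines and labels are all assumptions made for illustration, not the author's implementation or data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf * (ln(N / df) + 1), where tf is the term's
    relative frequency in the document, N the number of documents and
    df the number of documents containing the term.
    """
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if y_true else 0.0


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for the single class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical headlines and labels, not taken from the thesis data set.
    headlines = [
        "flood hits city",
        "earthquake hits town",
        "flood warning issued",
    ]
    for vector in tfidf_vectors(headlines):
        print(vector)

    y_true = ["flood", "flood", "quake", "quake"]
    y_pred = ["flood", "quake", "quake", "quake"]
    print(accuracy(y_true, y_pred))
    print(precision_recall_f1(y_true, y_pred, positive="flood"))
```

In this sketch a term that appears in fewer documents receives a larger weight than an equally frequent term that appears in many, which is the property that distinguishes TF-IDF from plain bag-of-words counts. For a multi-class task such as the one described, the per-class scores from `precision_recall_f1` would typically be averaged across classes; which averaging the thesis used is not stated in this record.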