Text classification using topic modelling and machine learning
This report presents a study of topic modelling and text classification, based on the development and evaluation of a custom-built Latent Dirichlet Allocation (LDA) model. In this project, we used machine learning techniques to evaluate the effect of incorporating various prior...
Main Author: | Li, Xinyu |
---|---|
Other Authors: | S Supraja |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/176723 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-176723 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-176723; 2024-05-24T15:50:40Z; Text classification using topic modelling and machine learning; Li, Xinyu; S Supraja; School of Electrical and Electronic Engineering; supraja.s@ntu.edu.sg; Computer and Information Science; This report presents a study of topic modelling and text classification, based on the development and evaluation of a custom-built Latent Dirichlet Allocation (LDA) model. In this project, we used machine learning techniques to evaluate the effect of incorporating various prior types within the developed LDA model. The model was evaluated on three benchmark datasets: 20 Newsgroups, Neural Network Patent Query, and New York Times News Articles. Its performance was assessed using classification reports generated by Support Vector Machine (SVM), Extreme Learning Machine (ELM), and Gaussian Process (GP) classifiers. The classification results demonstrate a clear correlation between the choice of alpha and beta prior types and the quality of the topics modelled. The results highlight the potential of custom prior settings to enhance both topic discovery and classification effectiveness. This study contributes to the domains of topic modelling and text classification, illustrating the practical applicability of advanced topic modelling techniques for enhancing text classification results, and setting the stage for future research into the optimization of topic models for diverse analytical tasks.; Bachelor's degree; 2024-05-20T01:38:45Z; 2024-05-20T01:38:45Z; 2024; Final Year Project (FYP); Li, X. (2024). Text classification using topic modelling and machine learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176723; https://hdl.handle.net/10356/176723; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
spellingShingle |
Computer and Information Science; Li, Xinyu; Text classification using topic modelling and machine learning |
description |
This report presents a study of topic modelling and text classification, based on the development and evaluation of a custom-built Latent Dirichlet Allocation (LDA) model. In this project, we used machine learning techniques to evaluate the effect of incorporating various prior types within the developed LDA model.
The model was evaluated on three benchmark datasets: 20 Newsgroups, Neural Network Patent Query, and New York Times News Articles. Its performance was assessed using classification reports generated by Support Vector Machine (SVM), Extreme Learning Machine (ELM), and Gaussian Process (GP) classifiers.
The classification results demonstrate a clear correlation between the choice of alpha and beta prior types and the quality of the topics modelled. The results highlight the potential of custom prior settings to enhance both topic discovery and classification effectiveness. This study contributes to the domains of topic modelling and text classification, illustrating the practical applicability of advanced topic modelling techniques for enhancing text classification results, and setting the stage for future research into the optimization of topic models for diverse analytical tasks. |
author2 |
S Supraja |
author_facet |
S Supraja; Li, Xinyu |
format |
Final Year Project |
author |
Li, Xinyu |
author_sort |
Li, Xinyu |
title |
Text classification using topic modelling and machine learning |
title_short |
Text classification using topic modelling and machine learning |
title_full |
Text classification using topic modelling and machine learning |
title_fullStr |
Text classification using topic modelling and machine learning |
title_full_unstemmed |
Text classification using topic modelling and machine learning |
title_sort |
text classification using topic modelling and machine learning |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/176723 |
_version_ |
1800916407418880000 |