Text classification using topic modelling and machine learning

This report presents a study of topic modelling and text classification based on the development and evaluation of a self-developed Latent Dirichlet Allocation (LDA) model. In this project, we used machine learning techniques to evaluate the effect of incorporating various prior...

Bibliographic Details
Main Author: Li, Xinyu
Other Authors: S Supraja
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/176723
Institution: Nanyang Technological University
id sg-ntu-dr.10356-176723
record_format dspace
spelling sg-ntu-dr.10356-176723 2024-05-24T15:50:40Z
Text classification using topic modelling and machine learning
Li, Xinyu
S Supraja
School of Electrical and Electronic Engineering
supraja.s@ntu.edu.sg
Computer and Information Science
Bachelor's degree
2024-05-20T01:38:45Z 2024-05-20T01:38:45Z
2024
Final Year Project (FYP)
Li, X. (2024). Text classification using topic modelling and machine learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176723
https://hdl.handle.net/10356/176723
en
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
description This report presents a study of topic modelling and text classification based on the development and evaluation of a self-developed Latent Dirichlet Allocation (LDA) model. In this project, we used machine learning techniques to evaluate the effect of incorporating various prior types into the developed LDA model. The experiments were conducted on three benchmark datasets: 20 Newsgroups, Neural Network Patent Query, and New York Times News Articles. Performance was assessed using classification reports generated by Support Vector Machine (SVM), Extreme Learning Machine (ELM), and Gaussian Process (GP) classifiers. The classification results demonstrate a clear correlation between the choice of alpha and beta prior types and the quality of the topics modelled, and they highlight the potential of custom prior settings to enhance both topic discovery and classification effectiveness. This study contributes to the domains of topic modelling and text classification by illustrating the practical applicability of advanced topic modelling techniques for improving text classification results, and it sets the stage for future research into the optimization of topic models for diverse analytical tasks.
author2 S Supraja
author_facet S Supraja
Li, Xinyu
format Final Year Project
author Li, Xinyu
author_sort Li, Xinyu
title Text classification using topic modelling and machine learning
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/176723
_version_ 1800916407418880000
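
Illustrative note: the description above refers to alpha and beta priors in an LDA model whose topic output is then passed to classifiers. The sketch below is not the author's implementation, which this record does not include. It is a minimal collapsed Gibbs sampler for LDA, assuming symmetric Dirichlet priors, and it only shows where the two priors enter the model. The function name `fit_lda`, its parameters, and the toy corpus are all hypothetical.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA with symmetric priors (illustrative).

    docs  : list of token lists
    alpha : document-topic Dirichlet prior (assumed symmetric)
    beta  : topic-word Dirichlet prior (assumed symmetric)
    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic word distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # n_dk
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_kw
    topic_total = [0] * num_topics                               # n_k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional; alpha and beta act as pseudo-counts.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed point estimates from the final counts.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + vocab_size * beta)
         for v in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi, vocab


# Toy corpus (hypothetical); each row of theta is a feature vector
# that could be passed to a downstream classifier such as an SVM.
docs = [
    "neural network training gradient".split(),
    "network layer gradient training".split(),
    "election vote senate policy".split(),
    "policy vote election campaign".split(),
]
theta, phi, vocab = fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01)
```

Changing `alpha` and `beta` changes how strongly the counts are smoothed, which is one simple way to see how a prior choice can affect the topic features a classifier receives. The prior types studied in the report may differ from the symmetric case assumed here.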