Short text classification

In the information age, short texts are encountered frequently and in large quantities on the web. Fundamental text mining techniques fail to achieve high accuracy on them because short texts are much shorter, noisier and sparser than ordinary documents. Hence an efficient way is needed to process and categ...

Bibliographic Details
Main Author:	Nagarajan, Divya
Other Authors:	Sun Aixin
Format:	Final Year Project
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering
Online Access:	http://hdl.handle.net/10356/52084
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-52084
record_format	dspace
spelling	sg-ntu-dr.10356-52084 2023-03-03T20:32:48Z Short text classification Nagarajan, Divya Sun Aixin School of Computer Engineering DRNTU::Engineering Bachelor of Engineering (Computer Science) 2013-04-22T05:21:18Z 2013-04-22T05:21:18Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/52084 en Nanyang Technological University 33 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering
spellingShingle	DRNTU::Engineering Nagarajan, Divya Short text classification
description	In the information age, short texts are encountered frequently and in large quantities on the web. Fundamental text mining techniques fail to achieve high accuracy on them because short texts are much shorter, noisier and sparser than ordinary documents. Hence an efficient way is needed to process and categorise them so that this information can be used to improve the performance of systems that deal with such data. The aim of this project is to classify a given piece of short text as accurately as possible. First, an existing algorithm was implemented to categorise a given piece of short text. The method first picks the most representative and topically indicative words from the short text. These query words are then used to perform a search. From the results retrieved, the category with the majority vote is chosen as the category label of the short text. Following this, the algorithm was enhanced using clustering and relevance ranking. Performance improved, and classification accuracy increased relative to the original algorithm.
author2	Sun Aixin
author_facet	Sun Aixin Nagarajan, Divya
format	Final Year Project
author	Nagarajan, Divya
author_sort	Nagarajan, Divya
title	Short text classification
title_short	Short text classification
title_full	Short text classification
title_fullStr	Short text classification
title_full_unstemmed	Short text classification
title_sort	short text classification
publishDate	2013
url	http://hdl.handle.net/10356/52084
_version_	1759858356927856640
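
The baseline described in the abstract (select query words, search, take the majority category) can be illustrated with a minimal Python sketch. This is not the project's implementation: the abstract does not say how query words are chosen or how the search is run, so the IDF-based word selection, the overlap-based search, the tiny labelled `CORPUS`, and all function names below are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical labelled collection standing in for the searchable corpus.
CORPUS = [
    ("stock markets fall as bank shares slide", "business"),
    ("central bank raises interest rates again", "business"),
    ("bank profits rise on strong lending", "business"),
    ("striker scores twice in cup final", "sport"),
    ("team wins league title after late goal", "sport"),
    ("coach praises team after cup win", "sport"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (assumed; no stemming or stop words)."""
    return text.lower().split()


def idf_table(corpus):
    """Inverse document frequency of every word in the corpus."""
    n = len(corpus)
    df = Counter()
    for text, _ in corpus:
        df.update(set(tokenize(text)))
    return {word: math.log(n / count) for word, count in df.items()}


def select_query_words(text, idf, k=3):
    """Pick up to k query words, ranking known words by IDF (an assumption)."""
    words = [w for w in dict.fromkeys(tokenize(text)) if w in idf]
    return sorted(words, key=lambda w: idf[w], reverse=True)[:k]


def search(query_words, corpus, idf, top_n=3):
    """Return labels of the top_n documents by summed IDF of matched words."""
    scored = []
    for text, label in corpus:
        tokens = set(tokenize(text))
        score = sum(idf[w] for w in query_words if w in tokens)
        if score > 0:
            scored.append((score, label))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:top_n]]


def classify(text, corpus=CORPUS, k=3, top_n=3):
    """Label a short text by majority vote over the retrieved results."""
    idf = idf_table(corpus)
    query = select_query_words(text, idf, k)
    labels = search(query, corpus, idf, top_n)
    if not labels:
        return None  # nothing retrieved, so no vote is possible
    return Counter(labels).most_common(1)[0][0]
```

Calling `classify("bank shares slide")` on this toy corpus retrieves only business documents, so the vote returns `"business"`; a text sharing no words with the corpus returns `None`. The clustering and relevance-ranking enhancement is not sketched here because the abstract gives no detail about how it works.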

Similar Items