LSI-based semantic characterisation for automated text categorisation
As knowledge acquisition remains a bottleneck, incorporating human judgement within intelligent systems is still a challenge. Supervised learning methods have been shown to be able to assist humans in automated text categorization (ATC). However, the performance of such systems is largely dependent on th...
Main Author: | Tan, Ping Ping |
---|---|
Format: | Thesis |
Language: | English |
Published: | Faculty of Computer Science and Information Technology, 2009 |
Subjects: | QA76 Computer software |
Online Access: | http://ir.unimas.my/id/eprint/167/8/LSI-based%20semantic%20characterization%20for%20automated%20text%20categorization%20%28fulltext%29.pdf http://ir.unimas.my/id/eprint/167/ |
Institution: | Universiti Malaysia Sarawak |
id |
my.unimas.ir.167 |
---|---|
record_format |
eprints |
spelling |
my.unimas.ir.167 2023-05-08T07:37:46Z http://ir.unimas.my/id/eprint/167/ LSI-based semantic characterisation for automated text categorisation Tan, Ping Ping QA76 Computer software As knowledge acquisition remains a bottleneck, incorporating human judgement within intelligent systems is still a challenge. Supervised learning methods have been shown to be able to assist humans in automated text categorization (ATC). However, the performance of such systems is largely dependent on the characteristics of the datasets. Without an understanding of why a classifier works well for certain datasets, it is difficult to generalise its application across domains. Furthermore, most training sets used in supervised ATC have category labels provided by human experts. Expert knowledge used in the task of categorization is often not captured via the mere process of manipulating category labels. This results in a loss of intended meaning while performing supervised ATC. In addition, large text datasets often contain a great deal of noise. Faculty of Computer Science and Information Technology 2009 Thesis NonPeerReviewed text en http://ir.unimas.my/id/eprint/167/8/LSI-based%20semantic%20characterization%20for%20automated%20text%20categorization%20%28fulltext%29.pdf Tan, Ping Ping (2009) LSI-based semantic characterisation for automated text categorisation. Masters thesis, Universiti Malaysia Sarawak. |
institution |
Universiti Malaysia Sarawak |
building |
Centre for Academic Information Services (CAIS) |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sarawak |
content_source |
UNIMAS Institutional Repository |
url_provider |
http://ir.unimas.my/ |
language |
English |
topic |
QA76 Computer software |
spellingShingle |
QA76 Computer software Tan, Ping Ping LSI-based semantic characterisation for automated text categorisation |
description |
As knowledge acquisition remains a bottleneck, incorporating human judgement within intelligent systems is still a challenge. Supervised learning methods have been shown to be able to assist humans in automated text categorization (ATC). However, the performance of such systems is largely dependent on the characteristics of the datasets. Without an understanding of why a classifier works well for certain datasets, it is difficult to generalise its application across domains. Furthermore, most training sets used in supervised ATC have category labels provided by human experts. Expert knowledge used in the task of categorization is often not captured via the mere process of manipulating category labels. This results in a loss of intended meaning while performing supervised ATC. In addition, large text datasets often contain a great deal of noise. |
format |
Thesis |
author |
Tan, Ping Ping |
author_facet |
Tan, Ping Ping |
author_sort |
Tan, Ping Ping |
title |
LSI-based semantic characterisation for automated text categorisation |
title_short |
LSI-based semantic characterisation for automated text categorisation |
title_full |
LSI-based semantic characterisation for automated text categorisation |
title_fullStr |
LSI-based semantic characterisation for automated text categorisation |
title_full_unstemmed |
LSI-based semantic characterisation for automated text categorisation |
title_sort |
lsi-based semantic characterisation for automated text categorisation |
publisher |
Faculty of Computer Science and Information Technology |
publishDate |
2009 |
url |
http://ir.unimas.my/id/eprint/167/8/LSI-based%20semantic%20characterization%20for%20automated%20text%20categorization%20%28fulltext%29.pdf http://ir.unimas.my/id/eprint/167/ |
_version_ |
1767209756550758400 |