Semantic characterisation : knowledge discovery for training set

This paper proposes Semantic Characterisation (SemC), which uses Latent Semantic Indexing (LSI) to extract semantic information and make the best use of the existing knowledge contained in training sets. SemC uses LSI to capture the implicit semantic structure in documents by directly applying cate...

Bibliographic Details
Main Authors: Tan, Ping Ping, Narayanan, Kulathuramaiyer, Azlina, Ahmadi Julaihi
Format: E-Article
Language: English
Published: International Journal of Innovation, Management and Technology 2013
Subjects: Q Science (General); T Technology (General); ZA Information resources
Online Access: http://ir.unimas.my/id/eprint/47/1/Semantic%20Characterisation%20%28abstract%29.pdf
http://ir.unimas.my/id/eprint/47/
http://ir.unimas.my/47/1/Semantic%20Characterisation%20%28abstract%29.pdf
Institution: Universiti Malaysia Sarawak
id my.unimas.ir.47
record_format eprints
spelling my.unimas.ir.47 2016-12-27T03:18:42Z PeerReviewed text en Tan, Ping Ping and Narayanan, Kulathuramaiyer and Azlina, Ahmadi Julaihi (2013) Semantic characterisation : knowledge discovery for training set. International Journal of Innovation, Management and Technology, 4 (1). pp. 59-61.
institution Universiti Malaysia Sarawak
building Centre for Academic Information Services (CAIS)
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Sarawak
content_source UNIMAS Institutional Repository
url_provider http://ir.unimas.my/
language English
topic Q Science (General)
T Technology (General)
ZA Information resources
description This paper proposes Semantic Characterisation (SemC), which uses Latent Semantic Indexing (LSI) to extract semantic information and make the best use of the existing knowledge contained in training sets. SemC uses LSI to capture the implicit semantic structure in documents by directly applying category labels imposed by experts to make that structure explicit. The training set filtered by SemC is tested on a supervised automated text categorisation system that uses a Support Vector Machine as the classifier. Category-by-category analysis shows that SemC can bring out the semantic characteristics of the datasets. Even with a reduced training set, SemC is able to overcome the generalisation problem because it reduces noise within individual categories. The empirical results also demonstrate that SemC improves categorisation results for heavily overlapping categories and that it is applicable to various supervised classifiers.
format E-Article
author Tan, Ping Ping
Narayanan, Kulathuramaiyer
Azlina, Ahmadi Julaihi
title Semantic characterisation : knowledge discovery for training set
publisher International Journal of Innovation, Management and Technology
publishDate 2013
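
The description in this record outlines the idea of using LSI together with expert category labels to filter a training set before classification. The following is a minimal sketch of that general idea, not the authors' method: the abstract does not state the filtering criterion, so the nearest-centroid rule, the toy term-document matrix, and the names `lsi_embed` and `filter_training_set` are all assumptions made for illustration. The classifier stage (a Support Vector Machine in the paper) is omitted.

```python
import numpy as np


def lsi_embed(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD.

    term_doc has one row per term and one column per document.
    Returns an array with one row per document.
    """
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return (np.diag(s[:k]) @ vt[:k]).T


def _unit_rows(m):
    """Scale each row to unit length (guarding against zero rows)."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    return m / np.maximum(norms, 1e-12)


def filter_training_set(doc_vecs, labels):
    """Return indices of documents whose nearest category centroid
    (by cosine similarity in the latent space) matches their own label.

    Assumption: this rule is a stand-in; the source does not specify
    how SemC decides which training documents to keep.
    """
    labels = np.asarray(labels)
    cats = sorted(set(labels.tolist()))
    centroids = np.stack([doc_vecs[labels == c].mean(axis=0) for c in cats])
    sims = _unit_rows(doc_vecs) @ _unit_rows(centroids).T  # (n_docs, n_cats)
    nearest = np.array(cats)[sims.argmax(axis=1)]
    return np.flatnonzero(nearest == labels)


# Hypothetical toy data: rows are the terms
# goal, match, team, stock, market, shares; columns are documents d0..d5.
term_doc = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 2, 1, 1, 2],
    [0, 0, 1, 2, 1, 2],
    [0, 0, 1, 1, 2, 1],
], dtype=float)

# d2 carries the "sport" label but uses only finance vocabulary (label noise).
labels = ["sport", "sport", "sport", "finance", "finance", "finance"]

doc_vecs = lsi_embed(term_doc, k=2)
kept = filter_training_set(doc_vecs, labels)
print(kept.tolist())
```

On this toy data the filter drops d2, the document whose content disagrees with its label, and keeps the rest. The kept indices could then be used to select the reduced training set passed to whichever supervised classifier is being evaluated.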