Web classification using Support Vector Machine

In web classification, web pages from one or more web sites are assigned to pre-defined categories according to their content. Since web pages are more than just plain text documents, web classification methods have to consider using other context features of web pages, such as hyperlinks and HTML t...

Bibliographic Details
Main Authors: SUN, Aixin, LIM, Ee Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2002
Subjects: Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/969
https://ink.library.smu.edu.sg/context/sis_research/article/1968/viewcontent/p96_sun.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1968
record_format dspace
spelling sg-smu-ink.sis_research-1968
2018-06-20T05:55:14Z
2002-11-01T08:00:00Z
application/pdf
info:doi/10.1145/584931.584952
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
description In web classification, web pages from one or more web sites are assigned to pre-defined categories according to their content. Because web pages are more than plain text documents, web classification methods have to consider other context features of web pages, such as hyperlinks and HTML tags. In this paper, we propose the use of Support Vector Machine (SVM) classifiers to classify web pages using both their text and context feature sets. We evaluated our web classification method on the WebKB data set. Compared with the earlier Foil-Pilfs method on the same data set, our method performs very well. We also show that the use of context features, especially hyperlinks, can improve classification performance significantly.
_version_ 1770570808428593152
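
The idea in the description, classifying pages with a linear SVM over both text and context features, can be illustrated with a small sketch. This is not the authors' implementation: the record does not describe their feature sets or training procedure. The choice of title and anchor text as context, the feature-name prefixes, the toy pages, and the Pegasos-style subgradient trainer (without a bias term) are all assumptions made for illustration, using only the Python standard library.

```python
import random
from collections import Counter
from html.parser import HTMLParser

# Assumption: tokens inside these tags are treated as "context" features and
# prefixed accordingly; all other visible tokens are plain "text" features.
CONTEXT_TAGS = {"title": "title", "a": "anchor"}


class PageFeatures(HTMLParser):
    """Count tokens, keeping text and context tokens in separate feature names."""

    def __init__(self):
        super().__init__()
        self.open_context = []   # stack of currently open context tags
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag in CONTEXT_TAGS:
            self.open_context.append(tag)

    def handle_endtag(self, tag):
        if self.open_context and self.open_context[-1] == tag:
            self.open_context.pop()

    def handle_data(self, data):
        prefix = CONTEXT_TAGS[self.open_context[-1]] if self.open_context else "text"
        for token in data.lower().split():
            self.counts[f"{prefix}:{token}"] += 1


def extract(html):
    """Return a sparse feature dict such as {'title:course': 1, 'text:lecture': 1}."""
    parser = PageFeatures()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


def score(weights, features):
    return sum(weights.get(name, 0.0) * count for name, count in features.items())


def train_linear_svm(samples, labels, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias)."""
    rng = random.Random(seed)
    weights = {}
    order = list(range(len(samples)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            violated = labels[i] * score(weights, samples[i]) < 1.0
            for name in weights:                 # shrink step from the L2 regularizer
                weights[name] *= 1.0 - 1.0 / step
            if violated:                         # hinge-loss step for margin violators
                for name, count in samples[i].items():
                    weights[name] = weights.get(name, 0.0) + eta * labels[i] * count
    return weights


def predict(weights, features):
    return 1 if score(weights, features) > 0 else -1


# Toy pages (invented): +1 = course page, -1 = faculty page.
pages = [
    ('<title>CS101 course</title><p>lecture homework syllabus</p>'
     '<a href="x">course schedule</a>', 1),
    ('<title>Databases course</title><p>homework exam lecture</p>', 1),
    ('<title>Prof Lee</title><p>research publications students</p>'
     '<a href="y">publications list</a>', -1),
    ('<title>Dr Tan</title><p>research interests publications</p>', -1),
]
samples = [extract(html) for html, _ in pages]
labels = [label for _, label in pages]
weights = train_linear_svm(samples, labels)

for html in ("<title>Algorithms course</title><p>lecture notes and homework</p>",
             "<title>Prof Lim</title><p>research</p>"):
    print(predict(weights, extract(html)))
```

Keeping the prefix in the feature name lets the same word carry different weight depending on whether it appears in body text, the title, or a hyperlink anchor, which is one simple way to let a single linear classifier use both feature sets.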