Web Unit Based Mining of Homepage Relationships
Homepages usually describe important semantic information about conceptual or physical entities; hence, they are the main targets for searching and browsing. To facilitate semantic-based information retrieval (IR) at a Web site, homepages can be identified and classified under some predefined concep...
Main Authors: | SUN, Aixin; LIM, Ee Peng |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2006 |
Subjects: | Databases and Information Systems; Numerical Analysis and Scientific Computing |
Online Access: | https://ink.library.smu.edu.sg/sis_research/201 http://dx.doi.org/10.1002/asi.20279 |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-1200 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-1200 2010-09-22T14:00:36Z Web Unit Based Mining of Homepage Relationships SUN, Aixin LIM, Ee Peng 2006-02-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/201 info:doi/10.1002/asi.20279 http://dx.doi.org/10.1002/asi.20279 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
topic |
Databases and Information Systems Numerical Analysis and Scientific Computing |
description |
Homepages usually describe important semantic information about conceptual or physical entities; hence, they are the main targets for searching and browsing. To facilitate semantic-based information retrieval (IR) at a Web site, homepages can be identified and classified under some predefined concepts, and these concepts are then used in query or browsing criteria, e.g., finding professor homepages containing information retrieval. In some Web sites, relationships may also exist among homepages. These relationship instances (also known as homepage relationships) enrich our knowledge about these Web sites and allow more expressive semantic-based IR. In this article, we investigate the features to be used in mining homepage relationships. We systematically develop different classes of inter-homepage features, namely, navigation, relative-location, and common-item features. We also propose deriving for each homepage a set of support pages to obtain richer and more complete content about the entity described by the homepage. The homepage together with its support pages is known as a Web unit. By extracting inter-homepage features from Web units, our experiments on the WebKB dataset show that better homepage relationship mining accuracies can be achieved. |
author |
SUN, Aixin LIM, Ee Peng |
_version_ |
1770568918422781952 |