Data model for warehousing historical web information

In this paper, we present a temporal web data model designed for warehousing historical data from the World Wide Web (WWW). As the Web is now populated with a large volume of information, it has become necessary to capture selected portions of web information in a data warehouse that supports further info...

Bibliographic Details
Main Authors: LIM, Ee Peng, CAO, Yinyan, NG, Wee-Keong
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2003
Subjects: Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access:https://ink.library.smu.edu.sg/sis_research/67
http://doi.org/10.1016/S0950-5849(03)00019-3
Institution: Singapore Management University
id sg-smu-ink.sis_research-1066
record_format dspace
spelling sg-smu-ink.sis_research-1066 2018-06-22T03:52:38Z Data model for warehousing historical web information LIM, Ee Peng CAO, Yinyan NG, Wee-Keong 2003-03-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/67 info:doi/10.1016/S0950-5849(03)00019-3 http://doi.org/10.1016/S0950-5849(03)00019-3 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
description In this paper, we present a temporal web data model designed for warehousing historical data from the World Wide Web (WWW). As the Web is now populated with a large volume of information, it has become necessary to capture selected portions of web information in a data warehouse that supports further information processing such as data extraction, data classification, and data mining. Nevertheless, due to the unstructured and dynamic nature of the Web, the traditional relational model and its temporal variants cannot be used to build such a data warehouse. We therefore propose a temporal web data model that represents web documents and their connectivities in the form of temporal web tables. To represent web data that evolve with time, a visible time interval is associated with each web document. To manipulate temporal web tables, we define a set of web operators with capabilities ranging from extracting WWW information into web tables to merging information from different web tables. We further illustrate the use of our temporal web data model with some realistic motivating examples.
format text
author LIM, Ee Peng
CAO, Yinyan
NG, Wee-Keong
title Data model for warehousing historical web information
publisher Institutional Knowledge at Singapore Management University
publishDate 2003
url https://ink.library.smu.edu.sg/sis_research/67
http://doi.org/10.1016/S0950-5849(03)00019-3
_version_ 1770568871069089792