Anatomy of a coupling query in a web warehouse

Bibliographic Details
Main Authors: BHOWMICK, Sourav S., MADRIA, Sanjay Kumar, NG, Wee-Keong, LIM, Ee Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2002
Online Access: https://ink.library.smu.edu.sg/sis_research/66
https://ink.library.smu.edu.sg/context/sis_research/article/1065/viewcontent/1_s2.0_S0950584902000514_main.pdf
Institution: Singapore Management University
Description
Summary: To populate a data warehouse designed specifically for Web data, i.e., a web warehouse, it is imperative to harness relevant documents from the Web. In this paper, we describe a query mechanism called the coupling query for gleaning relevant Web data in the context of our web warehousing system, Warehouse Of Web Data (WHOWEDA). A coupling query may be used to query both HTML and XML documents. Important features of our query mechanism include the ability to query the metadata, content, and internal and external (hyperlink) structure of Web documents based on partial knowledge; the ability to express constraints on tag attributes and tagless segments of data; the ability to express conjunctive as well as disjunctive query conditions compactly; the ability to control the execution of a web query; and the preservation of the topological structure of hyperlinked documents in the query results. We also discuss how to formulate a query graphically and in textual form using a coupling graph and coupling text, respectively.