Instance-level integration, query processing and optimization in Federated Database Systems

This thesis addresses the instance-level integration, query processing and optimization problems in a federated database environment in which the heterogeneity among component databases has to be resolved and the autonomy of component database systems has to be preserved. The main contribution of th...

Bibliographic Details
Main Author: LIM, Ee Peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 1994
Subjects: Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/992
http://proquest.umi.com/pqdweb?did=746817631&sid=1&Fmt=2&clientId=44274&RQT=309&VName=PQD
Institution: Singapore Management University
id sg-smu-ink.sis_research-1991
record_format dspace
spelling sg-smu-ink.sis_research-1991 2018-06-22T04:43:52Z Instance-level integration, query processing and optimization in Federated Database Systems LIM, Ee Peng 1994-01-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/992 http://proquest.umi.com/pqdweb?did=746817631&sid=1&Fmt=2&clientId=44274&RQT=309&VName=PQD Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
spellingShingle Databases and Information Systems
LIM, Ee Peng
Instance-level integration, query processing and optimization in Federated Database Systems
description This thesis addresses the instance-level integration, query processing, and optimization problems in a federated database environment in which the heterogeneity among component databases has to be resolved and the autonomy of component database systems has to be preserved. The main contribution of this thesis is to define entity identification and attribute value conflict as two instance-level integration problems arising from the heterogeneity of local databases, and to propose solutions to them. The objective of entity identification is to match object instances from different databases that correspond to the same real-world entity. Attribute value conflict arises when the attribute values in two databases, modeling the same property of a real-world entity, do not match. The thesis also addresses the federated query processing and optimization problem in the context of heterogeneity and autonomy. Federated query optimization is concerned with producing an efficient execution plan for a query over a virtually integrated database.
In this research, we propose a two-step entity identification process that separates the derivation of identifying attributes from the matching of object instances. The two-step process adopts reasoning techniques based on definite logic and indefinite logic. In the context of attribute value conflict resolution, we are concerned with resolving values that contain uncertainties. We propose an extended relational model based on the Dempster-Shafer theory of evidence to incorporate such uncertain knowledge about the source databases. The closure and boundedness properties of our proposed extended operations are formulated.
In the context of federated query processing and optimization, we propose a set of integration operations that are useful in resolving instance-level conflicts. We develop an algebraic transformation framework that involves both the existing relational operations and the integration operations. This framework is subsequently used for optimizing federated database queries based on our proposed query processing architecture. We have also implemented the definite logic approach to the entity identification problem and the evidential reasoning approach to resolving conflicting attribute values. The algorithms for performing the proposed integration operations and the algorithms for federated query processing have been realized in the Myriad project, a federated database prototype.
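The evidential approach to conflicting attribute values mentioned in the description can be illustrated with Dempster's rule of combination, the standard way of merging two bodies of evidence in Dempster-Shafer theory. The following is a minimal Python sketch, not the extended relational operations defined in the thesis: the function name `combine`, the dictionary representation of mass functions, and the restaurant-cuisine example values are all assumptions made for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of candidate
    attribute values to a mass in [0, 1]; the masses sum to 1.
    (This representation is an assumption of the sketch.)
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute only to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize so the remaining masses sum to 1.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Hypothetical example: two databases disagree on a restaurant's cuisine.
db1 = {frozenset({"chinese"}): 0.6, frozenset({"chinese", "thai"}): 0.4}
db2 = {frozenset({"thai"}): 0.5, frozenset({"chinese", "thai"}): 0.5}
merged = combine(db1, db2)
for values, mass in merged.items():
    print(sorted(values), round(mass, 4))
```

In this made-up example the two sources conflict with mass 0.3, so the surviving masses are divided by 0.7: "chinese" ends up with about 0.43, while "thai" and the undecided set {"chinese", "thai"} each get about 0.29.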
format text
author LIM, Ee Peng
title Instance-level integration, query processing and optimization in Federated Database Systems
title_sort instance-level integration, query processing and optimization in federated database systems
publisher Institutional Knowledge at Singapore Management University
publishDate 1994
url https://ink.library.smu.edu.sg/sis_research/992
http://proquest.umi.com/pqdweb?did=746817631&sid=1&Fmt=2&clientId=44274&RQT=309&VName=PQD
_version_ 1770570816719683584