Instance Based Attribute Identification in Database Integration

Most research on attribute identification in database integration has focused on integrating attributes using schema and summary information derived from the attribute values. No research has attempted to fully explore the use of attribute values to perform attribute identification. We propose an at...

Bibliographic Details
Main Authors: LIM, Ee Peng, CHUA, Cecil, CHIANG, Roger Hsiang-Li
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2003
Subjects: Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access: https://ink.library.smu.edu.sg/sis_research/18
https://ink.library.smu.edu.sg/context/sis_research/article/1017/viewcontent/Chua2003_Article_Instance_basedAttributeIdentif.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1017
record_format dspace
spelling sg-smu-ink.sis_research-1017; 2018-06-19T04:33:12Z; 2003-01-01T08:00:00Z; text; application/pdf; info:doi/10.1007/s00778-003-0088-y; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
description Most research on attribute identification in database integration has focused on integrating attributes using schema and summary information derived from the attribute values. No research has attempted to fully explore the use of attribute values to perform attribute identification. We propose an attribute identification method that employs schema and summary instance information as well as properties of attributes derived from their instances. Unlike other attribute identification methods that match only single attributes, our method matches attribute groups for integration. Because our attribute identification method fully explores data instances, it can identify corresponding attributes to be integrated even when schema information is misleading. Three experiments were performed to validate our attribute identification method. In the first experiment, the heuristic rules derived for attribute classification were evaluated on 119 attributes from nine public domain data sets. The second was a controlled experiment validating the robustness of the proposed attribute identification method by introducing erroneous data. The third experiment evaluated the proposed attribute identification method on five data sets extracted from online music stores. The results demonstrated the viability of the proposed method.
format text
author LIM, Ee Peng
CHUA, Cecil
CHIANG, Roger Hsiang-Li
title Instance Based Attribute Identification in Database Integration
publisher Institutional Knowledge at Singapore Management University
publishDate 2003
url https://ink.library.smu.edu.sg/sis_research/18
https://ink.library.smu.edu.sg/context/sis_research/article/1017/viewcontent/Chua2003_Article_Instance_basedAttributeIdentif.pdf