Feature-Based Similarity Search in Graph Structures

Similarity search of complex structures is an important operation in graph-related applications since exact matching is often too restrictive. In this article, we investigate the issues of substructure similarity search using indexed features in graph databases. By transforming the edge relaxation r...

Full description

Saved in:
Bibliographic Details
Main Authors: YAN, Xifeng, ZHU, Feida, YU, Philip S., HAN, Jiawei
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2006
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/131
http://dx.doi.org/10.1145/1189769.1189777
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-1130
record_format dspace
spelling sg-smu-ink.sis_research-11302010-09-22T14:00:36Z Feature-Based Similarity Search in Graph Structures YAN, Xifeng ZHU, Feida YU, Philip S. HAN, Jiawei Similarity search of complex structures is an important operation in graph-related applications since exact matching is often too restrictive. In this article, we investigate the issues of substructure similarity search using indexed features in graph databases. By transforming the edge relaxation ratio of a query graph into the maximum allowed feature misses, our structural filtering algorithm can filter graphs without performing pairwise similarity computation. It is further shown that using either too few or too many features can result in poor filtering performance. Thus the challenge is to design an effective feature set selection strategy that could maximize the filtering capability. We prove that the complexity of optimal feature set selection is ?(2m) in the worst case, where m is the number of features for selection. In practice, we identify several criteria to build effective feature sets for filtering, and demonstrate that combining features with similar size and selectivity can improve the filtering and search performance significantly within a multifilter composition framework. The proposed feature-based filtering concept can be generalized and applied to searching approximate nonconsecutive sequences, trees, and other structured data as well. 2006-12-01T08:00:00Z text https://ink.library.smu.edu.sg/sis_research/131 info:doi/10.1145/1189769.1189777 http://dx.doi.org/10.1145/1189769.1189777 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
Numerical Analysis and Scientific Computing
spellingShingle Databases and Information Systems
Numerical Analysis and Scientific Computing
YAN, Xifeng
ZHU, Feida
YU, Philip S.
HAN, Jiawei
Feature-Based Similarity Search in Graph Structures
description Similarity search of complex structures is an important operation in graph-related applications since exact matching is often too restrictive. In this article, we investigate the issues of substructure similarity search using indexed features in graph databases. By transforming the edge relaxation ratio of a query graph into the maximum allowed feature misses, our structural filtering algorithm can filter graphs without performing pairwise similarity computation. It is further shown that using either too few or too many features can result in poor filtering performance. Thus the challenge is to design an effective feature set selection strategy that could maximize the filtering capability. We prove that the complexity of optimal feature set selection is ?(2m) in the worst case, where m is the number of features for selection. In practice, we identify several criteria to build effective feature sets for filtering, and demonstrate that combining features with similar size and selectivity can improve the filtering and search performance significantly within a multifilter composition framework. The proposed feature-based filtering concept can be generalized and applied to searching approximate nonconsecutive sequences, trees, and other structured data as well.
format text
author YAN, Xifeng
ZHU, Feida
YU, Philip S.
HAN, Jiawei
author_facet YAN, Xifeng
ZHU, Feida
YU, Philip S.
HAN, Jiawei
author_sort YAN, Xifeng
title Feature-Based Similarity Search in Graph Structures
title_short Feature-Based Similarity Search in Graph Structures
title_full Feature-Based Similarity Search in Graph Structures
title_fullStr Feature-Based Similarity Search in Graph Structures
title_full_unstemmed Feature-Based Similarity Search in Graph Structures
title_sort feature-based similarity search in graph structures
publisher Institutional Knowledge at Singapore Management University
publishDate 2006
url https://ink.library.smu.edu.sg/sis_research/131
http://dx.doi.org/10.1145/1189769.1189777
_version_ 1770568881012736000