Pivot-based Metric Indexing

The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search...

Full description

Saved in:
Bibliographic Details
Main Authors: CHEN, Lu, GAO, Yunjun, ZHENG, Baihua, JENSEN, Christian S., YANG, Hanyu, YANG, Keyu
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2017
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/3739
https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-4741
record_format dspace
spelling sg-smu-ink.sis_research-47412018-11-27T09:12:46Z Pivot-based Metric Indexing CHEN, Lu GAO, Yunjun ZHENG, Baihua JENSEN, Christian S. YANG, Hanyu YANG, Keyu The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search, a collection of pivot-based indexing techniques for metric data has been proposed, which reduces the number of potentially expensive similarity comparisons by exploiting the triangle inequality for pruning and validation. However, no comprehensive empirical study of those techniques exists. Existing studies each offers only a narrower coverage, and they use different pivot selection strategies that affect performance substantially and thus render cross-study comparisons difficult or impossible. We offer a survey of existing pivot-based indexing techniques, and report a comprehensive empirical comparison of their construction costs, update efficiency, storage sizes, and similarity search performance. As part of the study, we provide modifications for two existing indexing techniques to make them more competitive. The findings and insights obtained from the study reveal different strengths and weaknesses of different indexing techniques, and offer guidance on selecting an appropriate indexing technique for a given setting. 2017-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3739 info:doi/10.14778/3115404.3115411 https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Similarity search tree space queries Databases and Information Systems Data Storage Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Similarity search
tree
space
queries
Databases and Information Systems
Data Storage Systems
spellingShingle Similarity search
tree
space
queries
Databases and Information Systems
Data Storage Systems
CHEN, Lu
GAO, Yunjun
ZHENG, Baihua
JENSEN, Christian S.
YANG, Hanyu
YANG, Keyu
Pivot-based Metric Indexing
description The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search, a collection of pivot-based indexing techniques for metric data has been proposed, which reduces the number of potentially expensive similarity comparisons by exploiting the triangle inequality for pruning and validation. However, no comprehensive empirical study of those techniques exists. Existing studies each offers only a narrower coverage, and they use different pivot selection strategies that affect performance substantially and thus render cross-study comparisons difficult or impossible. We offer a survey of existing pivot-based indexing techniques, and report a comprehensive empirical comparison of their construction costs, update efficiency, storage sizes, and similarity search performance. As part of the study, we provide modifications for two existing indexing techniques to make them more competitive. The findings and insights obtained from the study reveal different strengths and weaknesses of different indexing techniques, and offer guidance on selecting an appropriate indexing technique for a given setting.
format text
author CHEN, Lu
GAO, Yunjun
ZHENG, Baihua
JENSEN, Christian S.
YANG, Hanyu
YANG, Keyu
author_facet CHEN, Lu
GAO, Yunjun
ZHENG, Baihua
JENSEN, Christian S.
YANG, Hanyu
YANG, Keyu
author_sort CHEN, Lu
title Pivot-based Metric Indexing
title_short Pivot-based Metric Indexing
title_full Pivot-based Metric Indexing
title_fullStr Pivot-based Metric Indexing
title_full_unstemmed Pivot-based Metric Indexing
title_sort pivot-based metric indexing
publisher Institutional Knowledge at Singapore Management University
publishDate 2017
url https://ink.library.smu.edu.sg/sis_research/3739
https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf
_version_ 1770573707320754176