Pivot-based Metric Indexing
The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search...
Saved in:
Main Authors: | , , , , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2017
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/3739 https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.sis_research-4741 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-47412018-11-27T09:12:46Z Pivot-based Metric Indexing CHEN, Lu GAO, Yunjun ZHENG, Baihua JENSEN, Christian S. YANG, Hanyu YANG, Keyu The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search, a collection of pivot-based indexing techniques for metric data has been proposed, which reduces the number of potentially expensive similarity comparisons by exploiting the triangle inequality for pruning and validation. However, no comprehensive empirical study of those techniques exists. Existing studies each offers only a narrower coverage, and they use different pivot selection strategies that affect performance substantially and thus render cross-study comparisons difficult or impossible. We offer a survey of existing pivot-based indexing techniques, and report a comprehensive empirical comparison of their construction costs, update efficiency, storage sizes, and similarity search performance. As part of the study, we provide modifications for two existing indexing techniques to make them more competitive. The findings and insights obtained from the study reveal different strengths and weaknesses of different indexing techniques, and offer guidance on selecting an appropriate indexing technique for a given setting. 2017-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3739 info:doi/10.14778/3115404.3115411 https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Similarity search tree space queries Databases and Information Systems Data Storage Systems |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Similarity search tree space queries Databases and Information Systems Data Storage Systems |
spellingShingle |
Similarity search tree space queries Databases and Information Systems Data Storage Systems CHEN, Lu GAO, Yunjun ZHENG, Baihua JENSEN, Christian S. YANG, Hanyu YANG, Keyu Pivot-based Metric Indexing |
description |
The general notion of a metric space encompasses a diverse range of data types and accompanying similarity measures. Hence, metric search plays an important role in a wide range of settings, including multimedia retrieval, data mining, and data integration. With the aim of accelerating metric search, a collection of pivot-based indexing techniques for metric data has been proposed, which reduces the number of potentially expensive similarity comparisons by exploiting the triangle inequality for pruning and validation. However, no comprehensive empirical study of those techniques exists. Existing studies each offers only a narrower coverage, and they use different pivot selection strategies that affect performance substantially and thus render cross-study comparisons difficult or impossible. We offer a survey of existing pivot-based indexing techniques, and report a comprehensive empirical comparison of their construction costs, update efficiency, storage sizes, and similarity search performance. As part of the study, we provide modifications for two existing indexing techniques to make them more competitive. The findings and insights obtained from the study reveal different strengths and weaknesses of different indexing techniques, and offer guidance on selecting an appropriate indexing technique for a given setting. |
format |
text |
author |
CHEN, Lu GAO, Yunjun ZHENG, Baihua JENSEN, Christian S. YANG, Hanyu YANG, Keyu |
author_facet |
CHEN, Lu GAO, Yunjun ZHENG, Baihua JENSEN, Christian S. YANG, Hanyu YANG, Keyu |
author_sort |
CHEN, Lu |
title |
Pivot-based Metric Indexing |
title_short |
Pivot-based Metric Indexing |
title_full |
Pivot-based Metric Indexing |
title_fullStr |
Pivot-based Metric Indexing |
title_full_unstemmed |
Pivot-based Metric Indexing |
title_sort |
pivot-based metric indexing |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2017 |
url |
https://ink.library.smu.edu.sg/sis_research/3739 https://ink.library.smu.edu.sg/context/sis_research/article/4741/viewcontent/p856_gao.pdf |
_version_ |
1770573707320754176 |