Exploring bit-difference for approximate KNN search in high-dimensional databases
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overa...
Main Authors: | Cui, Bin; Shen, Heng Tao; SHEN, Jialie; Tan, Kian-Lee |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2005 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/1298 https://ink.library.smu.edu.sg/context/sis_research/article/2297/viewcontent/CRPITV39CuiShen.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-2297 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-2297 2015-01-10T10:53:54Z Exploring bit-difference for approximate KNN search in high-dimensional databases Cui, Bin Shen, Heng Tao SHEN, Jialie Tan, Kian-Lee In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of a point, and the number of bit-differences is used to prune the farther points. To handle real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder, and dimensional weights, and name the result BID+. Extensive experiments show that our proposed method yields significant performance advantages over the existing index structures on both real-life and synthetic high-dimensional datasets. 2005-01-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/1298 https://ink.library.smu.edu.sg/context/sis_research/article/2297/viewcontent/CRPITV39CuiShen.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University High-dimensional index structure approximate KNN query memory processing bit difference Databases and Information Systems |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
High-dimensional index structure approximate KNN query memory processing bit difference Databases and Information Systems |
spellingShingle |
High-dimensional index structure approximate KNN query memory processing bit difference Databases and Information Systems Cui, Bin Shen, Heng Tao SHEN, Jialie Tan, Kian-Lee Exploring bit-difference for approximate KNN search in high-dimensional databases |
description |
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) queries in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN queries. The BID employs one bit to represent each feature vector of a point, and the number of bit-differences is used to prune the farther points. To handle real datasets, which are typically skewed, we enhance the BID mechanism with clustering, a cluster-adapted bitcoder, and dimensional weights, and name the result BID+. Extensive experiments show that our proposed method yields significant performance advantages over the existing index structures on both real-life and synthetic high-dimensional datasets. |
format |
text |
author |
Cui, Bin Shen, Heng Tao SHEN, Jialie Tan, Kian-Lee |
author_facet |
Cui, Bin Shen, Heng Tao SHEN, Jialie Tan, Kian-Lee |
author_sort |
Cui, Bin |
title |
Exploring bit-difference for approximate KNN search in high-dimensional databases |
title_short |
Exploring bit-difference for approximate KNN search in high-dimensional databases |
title_full |
Exploring bit-difference for approximate KNN search in high-dimensional databases |
title_fullStr |
Exploring bit-difference for approximate KNN search in high-dimensional databases |
title_full_unstemmed |
Exploring bit-difference for approximate KNN search in high-dimensional databases |
title_sort |
exploring bit-difference for approximate knn search in high-dimensional databases |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2005 |
url |
https://ink.library.smu.edu.sg/sis_research/1298 https://ink.library.smu.edu.sg/context/sis_research/article/2297/viewcontent/CRPITV39CuiShen.pdf |
_version_ |
1770570942110498816 |
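
The description above outlines the bit-difference idea only in words: encode points with single bits, count the differing bits against the query, and prune far points before any exact distance is computed. The following Python sketch illustrates that general idea under stated assumptions. It is not the authors' BID structure: the per-dimension median threshold, the function names (`fit_thresholds`, `encode`, `bit_difference`, `approx_knn`), and the `n_candidates` cutoff are all hypothetical choices made for illustration, and the BID+ extensions (clustering, cluster-adapted bitcoder, dimensional weights) are not modeled.

```python
import math
from statistics import median


def fit_thresholds(points):
    """Per-dimension median of the data.

    Assumption: the median is used as the cut point for each 1-bit code;
    the record does not say how BID chooses its bits.
    """
    return [median(column) for column in zip(*points)]


def encode(point, thresholds):
    """Pack one bit per dimension into an int (bit i set if point[i] > thresholds[i])."""
    code = 0
    for i, (value, cut) in enumerate(zip(point, thresholds)):
        if value > cut:
            code |= 1 << i
    return code


def bit_difference(code_a, code_b):
    """Number of bit positions in which two codes differ (Hamming distance)."""
    return bin(code_a ^ code_b).count("1")


def approx_knn(query, points, codes, thresholds, k, n_candidates):
    """Approximate KNN: filter by bit difference, then rank survivors exactly.

    n_candidates is a hypothetical tuning knob: only that many points with
    the smallest bit difference get an exact Euclidean distance computation.
    """
    query_code = encode(query, thresholds)
    ranked = sorted(range(len(points)),
                    key=lambda i: bit_difference(query_code, codes[i]))
    candidates = ranked[:n_candidates]
    candidates.sort(key=lambda i: math.dist(query, points[i]))
    return candidates[:k]


# Tiny illustrative dataset (made up for this sketch).
points = [
    (0.1, 0.2, 0.1, 0.3),
    (0.9, 0.8, 0.9, 0.7),
    (0.2, 0.1, 0.3, 0.2),
    (0.8, 0.9, 0.7, 0.9),
]
thresholds = fit_thresholds(points)
codes = [encode(p, thresholds) for p in points]
nearest = approx_knn((0.1, 0.2, 0.15, 0.3), points, codes, thresholds,
                     k=2, n_candidates=2)
print(nearest)
```

The filter step touches only integer codes, so the expensive Euclidean computation runs on `n_candidates` points instead of the whole dataset. The result is approximate because a true neighbor whose code differs from the query's in many bits can be pruned before its exact distance is ever checked.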