ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION
This thesis conducts research to accelerate k-Nearest Neighbor (k-NN) query execution on the Product Quantization (PQ) algorithm. Product Quantization is an approximate kNN algorithm for large and high-dimensional datasets. The main idea behind PQ is to decompose the data vectors dimension into s...
Saved in:
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/61872 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Institut Teknologi Bandung |
Language: | Indonesia |
id |
id-itb.:61872 |
---|---|
spelling |
id-itb.:618722021-09-28T10:02:34ZACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION Edian, Ikraduya Indonesia Final Project k-nearest neighbor, product quantization, query acceleration INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/61872 This thesis conducts research to accelerate k-Nearest Neighbor (k-NN) query execution on the Product Quantization (PQ) algorithm. Product Quantization is an approximate kNN algorithm for large and high-dimensional datasets. The main idea behind PQ is to decompose the data vectors dimension into several small subdimensions, to then be quantized separately. The quantized and encoded data are used to approximate the distance between the query to each data point when query answering. The Triangle inequality theorem is used to accelerate PQ query execution by reducing unnecessary distance calculations. The aim is to show that the query response efficiency can be increased without reducing much accuracy of the nearest neighbor without relying on certain types of CPU-specific instructions. As with other methods, the key for the acceleration is SIMD instructions. The triangle inequality (TI) pruning technique was chosen as the acceleration technique because of its success in accelerating the k-NN brute-force algorithm while maintaining the search accuracy and without requiring SIMD instructions. This thesis analyzes how to develop and adapt the triangle inequality pruning technique to the Product Quantization algorithm. Experimental results using several datasets show that this technique successfully accelerates query times 6-7 fold faster with little or no compromising on the search accuracy if the appropriate parameter values are used. In addition, the triangle inequality pruning is quite competitive to speed up query time compared to other acceleration methods that rely on SIMD instructions in terms of query time and search accuracy. Finally, the use of the triangle inequality pruning technique does not limit the flexibility of the PQ configuration, and it does not have an issue with the algorithm portability, such as the method whose key to the acceleration depends on SIMD. text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
This thesis conducts research to accelerate k-Nearest Neighbor (k-NN) query
execution on the Product Quantization (PQ) algorithm. Product Quantization is an
approximate kNN algorithm for large and high-dimensional datasets. The main idea
behind PQ is to decompose the data vectors dimension into several small
subdimensions, to then be quantized separately. The quantized and encoded data
are used to approximate the distance between the query to each data point when
query answering.
The Triangle inequality theorem is used to accelerate PQ query execution by
reducing unnecessary distance calculations. The aim is to show that the query
response efficiency can be increased without reducing much accuracy of the nearest
neighbor without relying on certain types of CPU-specific instructions. As with
other methods, the key for the acceleration is SIMD instructions. The triangle
inequality (TI) pruning technique was chosen as the acceleration technique because
of its success in accelerating the k-NN brute-force algorithm while maintaining the
search accuracy and without requiring SIMD instructions. This thesis analyzes how
to develop and adapt the triangle inequality pruning technique to the Product
Quantization algorithm.
Experimental results using several datasets show that this technique successfully
accelerates query times 6-7 fold faster with little or no compromising on the search
accuracy if the appropriate parameter values are used. In addition, the triangle
inequality pruning is quite competitive to speed up query time compared to other
acceleration methods that rely on SIMD instructions in terms of query time and
search accuracy. Finally, the use of the triangle inequality pruning technique does
not limit the flexibility of the PQ configuration, and it does not have an issue with
the algorithm portability, such as the method whose key to the acceleration depends
on SIMD.
|
format |
Final Project |
author |
Edian, Ikraduya |
spellingShingle |
Edian, Ikraduya ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
author_facet |
Edian, Ikraduya |
author_sort |
Edian, Ikraduya |
title |
ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
title_short |
ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
title_full |
ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
title_fullStr |
ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
title_full_unstemmed |
ACCELERATION OF K-NEAREST NEIGHBOR QUERY WITH TRIANGLE INEQUALITY PRUNING ON PRODUCT QUANTIZATION |
title_sort |
acceleration of k-nearest neighbor query with triangle inequality pruning on product quantization |
url |
https://digilib.itb.ac.id/gdl/view/61872 |
_version_ |
1822276372959068160 |