Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization
Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space has a broad range of applications. Among existing ANN algorithms, graph-based methods currently show superior performance in terms of the time-accuracy trade-off. Yet, they have performance bottlenecks due to the random me...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Research |
Language: | English |
Published: |
Nanyang Technological University
2024
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181650 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-181650 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1816502024-12-12T02:55:42Z Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization Gou, Yutong Long Cheng College of Computing and Data Science c.long@ntu.edu.sg Computer and Information Science Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space has a broad range of applications. Among existing ANN algorithms, graph-based methods currently show superior performance in terms of the time-accuracy trade-off. Yet, they have performance bottlenecks due to the random memory accesses caused by the searching process on the graph index and the cost of computing exact distances between high-dimensional vectors. To relieve the aforementioned issues, a recent method named NGT-QG makes an initial attempt by integrating quantization and graph. It (1) replicates and stores the quantization codes of a vertex's neighbors compactly so that they can be accessed sequentially; and (2) uses a SIMD-based implementation named FastScan to efficiently estimate distances based on the quantization codes in batch for guiding the searching process. While NGT-QG achieves promising improvements over the vanilla graph-based methods, it has not fully unleashed the potential of integrating quantization and graph. For instance, it entails a re-ranking step to compute exact distances at the end, which introduces extra random memory accesses; its graph structure is not jointly designed considering the in-batch nature of FastScan, which causes wastes of computation in searching. In this work, we present a new method named SymphonyQG, which targets more symphonious integration of quantization and graph (e.g., it avoids an explicit re-ranking step and refines the graph structure to be more aligned with FastScan). Based on our experimental results on several real-world datasets, SymphonyQG establishes a new state of the art in terms of the query performance. Master's degree 2024-12-12T02:55:41Z 2024-12-12T02:55:41Z 2024 Thesis-Master by Research Gou, Y. (2024). Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181650 https://hdl.handle.net/10356/181650 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
spellingShingle |
Computer and Information Science Gou, Yutong Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
description |
Approximate nearest neighbor (ANN) search in high-dimensional Euclidean space has a broad range of applications. Among existing ANN algorithms, graph-based methods currently show superior performance in terms of the time-accuracy trade-off. Yet, they have performance bottlenecks due to the random memory accesses caused by the searching process on the graph index and the cost of computing exact distances between high-dimensional vectors. To relieve the aforementioned issues, a recent method named NGT-QG makes an initial attempt by integrating quantization and graph. It (1) replicates and stores the quantization codes of a vertex's neighbors compactly so that they can be accessed sequentially; and (2) uses a SIMD-based implementation named FastScan to efficiently estimate distances based on the quantization codes in batch for guiding the searching process. While NGT-QG achieves promising improvements over the vanilla graph-based methods, it has not fully unleashed the potential of integrating quantization and graph. For instance, it entails a re-ranking step to compute exact distances at the end, which introduces extra random memory accesses; its graph structure is not jointly designed considering the in-batch nature of FastScan, which causes wastes of computation in searching.
In this work, we present a new method named SymphonyQG, which targets more symphonious integration of quantization and graph (e.g., it avoids an explicit re-ranking step and refines the graph structure to be more aligned with FastScan). Based on our experimental results on several real-world datasets, SymphonyQG establishes a new state of the art in terms of the query performance. |
author2 |
Long Cheng |
author_facet |
Long Cheng Gou, Yutong |
format |
Thesis-Master by Research |
author |
Gou, Yutong |
author_sort |
Gou, Yutong |
title |
Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
title_short |
Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
title_full |
Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
title_fullStr |
Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
title_full_unstemmed |
Efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
title_sort |
efficient approximate nearest neighbor search on high-dimensional vectors by graph and quantization |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181650 |
_version_ |
1819113071875981312 |