Approximate k-NN graph construction: A generic online approach

Bibliographic Details
Main Authors: ZHAO, Wan-Lei, WANG, Hui, NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Online Access: https://ink.library.smu.edu.sg/sis_research/7244
https://ink.library.smu.edu.sg/context/sis_research/article/8247/viewcontent/Approximate_k_NN_Graph_Construction.pdf
Institution: Singapore Management University
Description
Summary: Nearest neighbor search and k-nearest neighbor graph construction are two fundamental problems that arise in many disciplines, such as multimedia information retrieval, data mining, and machine learning. They have become increasingly pressing with the emergence of big data in various fields in recent years. In this paper, a simple but effective solution for both approximate k-nearest neighbor search and approximate k-nearest neighbor graph construction is presented. The two problems are addressed jointly. On the one hand, approximate k-nearest neighbor graph construction is treated as a search task: each sample, along with its k nearest neighbors, is joined into the k-nearest neighbor graph by performing nearest neighbor search sequentially on the graph under construction. On the other hand, the built k-nearest neighbor graph is used to support k-nearest neighbor search. Since the graph is built online, dynamic updates to the graph, which are not possible for most existing solutions, are supported. The solution is feasible for various distance measures. Its effectiveness as both a k-nearest neighbor graph construction approach and a k-nearest neighbor search approach is verified across different types of data at different scales, in various dimensions, and under different metrics.
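The joint idea in the summary (insert each sample by searching the graph built so far, then reuse the same graph to answer queries) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (sq_dist, search, build_graph), the squared Euclidean distance, the random choice of search seeds, and the rule that updates the neighbor lists of the samples returned by the search are all assumptions made for illustration.

```python
import heapq
import random


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed metric; any distance could be swapped in)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(data, graph, query, k, n_seeds=8):
    """Greedy best-first search on the graph under construction.

    Returns up to k (distance, id) pairs sorted by distance.
    The result is approximate: only samples reachable from the seeds are seen.
    """
    n = len(data)
    seeds = random.sample(range(n), min(n_seeds, n))  # assumed seeding strategy
    visited = set(seeds)
    cand = [(sq_dist(query, data[i]), i) for i in seeds]
    best = sorted(cand)[:k]          # current top-k, kept sorted
    heapq.heapify(cand)              # frontier ordered by distance to the query
    while cand:
        d, i = heapq.heappop(cand)
        if len(best) >= k and d > best[-1][0]:
            break                    # closest frontier node cannot improve the top-k
        for _, j in graph[i]:
            if j in visited:
                continue
            visited.add(j)
            dj = sq_dist(query, data[j])
            if len(best) < k or dj < best[-1][0]:
                heapq.heappush(cand, (dj, j))
                best.append((dj, j))
                best.sort()
                del best[k:]
    return best


def build_graph(points, k):
    """Online construction: each sample is joined via a search on the current graph."""
    data = []
    graph = {}  # id -> sorted list of (distance, neighbor_id), at most k entries
    for p in points:
        new_id = len(data)
        found = search(data, graph, p, k)
        data.append(p)
        graph[new_id] = found
        # Assumed update rule: let the new sample enter its neighbors' lists too.
        for d, j in found:
            nbrs = graph[j]
            if len(nbrs) < k or d < nbrs[-1][0]:
                nbrs.append((d, new_id))
                nbrs.sort()
                del nbrs[k:]
    return data, graph
```

Under these assumptions, calling build_graph(points, k) returns the stored samples and the approximate graph, and search(data, graph, query, k) answers a query on that graph. Because insertion is just a search followed by a local update, adding one more sample later uses the same two steps as the loop body, which is how the sketch mirrors the dynamic update mentioned in the summary.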