Processing Incomplete k Nearest Neighbor Search

Given a setS of multidimensional objects and a query object q, a k nearest neighbor (kNN) query finds from S the k closest objects to q. This query is a fundamental problem in database, data mining, and information retrieval research. It plays an important role in a wide spectrum of real application...

Full description

Saved in:

Bibliographic Details
Main Authors:	MIAO, Xiaoye, GAO, Yunjun, CHEN, Gang, ZHENG, Baihua, CUI, Huiyong
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2016
Subjects:	k Nearest Neighbor Search Incomplete Data Query Processing Computer Sciences Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/3322 https://ink.library.smu.edu.sg/context/sis_research/article/4324/viewcontent/ProcessingIncompletek.pdf
Tags:	Add Tag No Tags, Be the first to tag this record!
Institution:	Singapore Management University
Language:	English

id	sg-smu-ink.sis_research-4324
record_format	dspace
spelling	sg-smu-ink.sis_research-43242017-04-25T04:56:56Z Processing Incomplete k Nearest Neighbor Search MIAO, Xiaoye GAO, Yunjun CHEN, Gang ZHENG, Baihua CUI, Huiyong Given a setS of multidimensional objects and a query object q, a k nearest neighbor (kNN) query finds from S the k closest objects to q. This query is a fundamental problem in database, data mining, and information retrieval research. It plays an important role in a wide spectrum of real applications such as image recognition and location-based services. However, due to the failure of data transmission devices, improper storage, and accidental loss, incomplete data exist widely in those applications, where some dimensional values of data items are missing. In this paper, we systematically study incomplete k nearest neighbor (IkNN) search, which aims at the kNN query for incomplete data. We formalize this problem and propose an efficient lattice partition algorithm using our newly developed LαB index to support exact IkNN retrieval, with the help of two pruning heuristics, i.e., α value pruning and partial distance pruning. Furthermore, we propose an approximate algorithm, namely histogram approximate, to support approximate IkNN search with improved search efficiency and guaranteed error bound. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of newly designed indexes and pruning heuristics, as well as the performance of our presented algorithms under a variety of experimental settings. 2016-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3322 info:doi/10.1109/TFUZZ.2016.2516562 https://ink.library.smu.edu.sg/context/sis_research/article/4324/viewcontent/ProcessingIncompletek.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University k Nearest Neighbor Search Incomplete Data Query Processing Computer Sciences Theory and Algorithms
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	k Nearest Neighbor Search Incomplete Data Query Processing Computer Sciences Theory and Algorithms
spellingShingle	k Nearest Neighbor Search Incomplete Data Query Processing Computer Sciences Theory and Algorithms MIAO, Xiaoye GAO, Yunjun CHEN, Gang ZHENG, Baihua CUI, Huiyong Processing Incomplete k Nearest Neighbor Search
description	Given a setS of multidimensional objects and a query object q, a k nearest neighbor (kNN) query finds from S the k closest objects to q. This query is a fundamental problem in database, data mining, and information retrieval research. It plays an important role in a wide spectrum of real applications such as image recognition and location-based services. However, due to the failure of data transmission devices, improper storage, and accidental loss, incomplete data exist widely in those applications, where some dimensional values of data items are missing. In this paper, we systematically study incomplete k nearest neighbor (IkNN) search, which aims at the kNN query for incomplete data. We formalize this problem and propose an efficient lattice partition algorithm using our newly developed LαB index to support exact IkNN retrieval, with the help of two pruning heuristics, i.e., α value pruning and partial distance pruning. Furthermore, we propose an approximate algorithm, namely histogram approximate, to support approximate IkNN search with improved search efficiency and guaranteed error bound. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of newly designed indexes and pruning heuristics, as well as the performance of our presented algorithms under a variety of experimental settings.
format	text
author	MIAO, Xiaoye GAO, Yunjun CHEN, Gang ZHENG, Baihua CUI, Huiyong
author_facet	MIAO, Xiaoye GAO, Yunjun CHEN, Gang ZHENG, Baihua CUI, Huiyong
author_sort	MIAO, Xiaoye
title	Processing Incomplete k Nearest Neighbor Search
title_short	Processing Incomplete k Nearest Neighbor Search
title_full	Processing Incomplete k Nearest Neighbor Search
title_fullStr	Processing Incomplete k Nearest Neighbor Search
title_full_unstemmed	Processing Incomplete k Nearest Neighbor Search
title_sort	processing incomplete k nearest neighbor search
publisher	Institutional Knowledge at Singapore Management University
publishDate	2016
url	https://ink.library.smu.edu.sg/sis_research/3322 https://ink.library.smu.edu.sg/context/sis_research/article/4324/viewcontent/ProcessingIncompletek.pdf
_version_	1770573112826396672

Processing Incomplete k Nearest Neighbor Search

Similar Items