Distribution-Based Similarity Measures for Multi-Dimensional Point Set Retrieval Applications

Bibliographic Details
Main Authors: SHAO, Jie, HUANG, Zi, SHEN, Heng Tao, SHEN, Jialie, ZHOU, Xiaofang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2008
Online Access: https://ink.library.smu.edu.sg/sis_research/574
http://dx.doi.org/10.1145/1459359.1459417
Institution: Singapore Management University
Description
Summary: Effective and efficient similarity assessment remains one of the most fundamental problems in multimedia data analysis. When retrieving relevant items from a collection of objects that are each represented by a series of multivariate observations (e.g., searching a repository for video clips similar to a query example), many conventional similarity measures based on aggregating pairwise element comparisons cannot be expected to perform satisfactorily. Correlation information among the individual elements has also been used to characterize each set of multi-dimensional points for ranked retrieval, but only under the unwarranted assumption that the underlying data distribution has a particular parametric form. Motivated by this observation, this paper introduces a novel collective measure for relevance ranking that evaluates the probability that a point set is consistent with the same distribution as the query. Two non-parametric statistical hypothesis tests are justified as ways to exploit the distributional discrepancy between samples when assessing the similarity of two ensembles of points. Although our methodology is presented mainly in the context of video similarity search, it is flexible and can be readily adapted to other applications that represent each object as a generic multi-dimensional point set, such as human gesture recognition.
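
The summary does not name the two non-parametric tests, so the following is only a minimal sketch of the general idea: rank candidate point sets by how consistent they are with the query's distribution. It assumes a stand-in technique, a permutation test on the energy distance, which is not necessarily one of the tests used in the paper. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def energy_statistic(x, y):
    """Energy distance between point sets x (n, d) and y (m, d).

    Stand-in two-sample statistic (an assumption, not the paper's test).
    """
    def mean_dist(a, b):
        # Mean Euclidean distance over all pairs (one point from a, one from b).
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Permutation p-value for H0: x and y come from the same distribution."""
    rng = np.random.default_rng(seed)
    observed = energy_statistic(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        # Re-split the pooled points at random and recompute the statistic.
        if energy_statistic(pooled[perm[:n]], pooled[perm[n:]]) >= observed:
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def rank_by_similarity(query, candidates, n_perm=200, seed=0):
    """Rank candidates by p-value: higher means more consistent with the query."""
    scores = {
        name: permutation_p_value(query, points, n_perm, seed)
        for name, points in candidates.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: each object is a set of 60 three-dimensional points.
data_rng = np.random.default_rng(42)
query = data_rng.normal(0.0, 1.0, size=(60, 3))
candidates = {
    "near": data_rng.normal(0.0, 1.0, size=(60, 3)),  # same distribution
    "far": data_rng.normal(3.0, 1.0, size=(60, 3)),   # shifted distribution
}
ranking = rank_by_similarity(query, candidates)
```

Here a larger p-value is read as a higher relevance score, so the candidate drawn from the same distribution as the query should be ranked first. A real retrieval system would need a much cheaper statistic or an index, since the permutation loop is quadratic in the set size for every comparison.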