Search disambiguation techniques in multimedia collections

In the last few years, we have witnessed an explosive growth in multimedia content. Online repositories such as Flickr (image) and YouTube (video) contain hundreds of millions of images and videos and are still growing by the day. However, the utility of these data sources is only as good as their a...

Bibliographic Details
Main Author:	Wan, Kong Wah
Other Authors:	Tan Ah Hwee
Format:	Theses and Dissertations
Language:	English
Published:	2012
Subjects:	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
Online Access:	https://hdl.handle.net/10356/50768
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-50768
record_format	dspace
spelling	sg-ntu-dr.10356-50768 2023-03-04T00:48:22Z Search disambiguation techniques in multimedia collections Wan, Kong Wah Tan Ah Hwee School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing In the last few years, we have witnessed an explosive growth in multimedia content. Online repositories such as Flickr (image) and YouTube (video) contain hundreds of millions of images and videos and are still growing by the day. However, the utility of these data sources is only as good as their accessibility. A primary motivation of this thesis is to make better use of this overwhelming wealth of information. Specifically, we look at ways to disambiguate search in a multimedia retrieval system. The need for search disambiguation often arises in practice because the typical issued query is short, imprecise and ambiguous, and it is difficult to infer the exact query intent. Search disambiguation is the general class of techniques for elucidating the multi-faceted information needs arising from an ambiguous query. The objective is to provide an overview of search results grouped by multiple facets, which can better clarify a user's search intention by enabling the user to zoom into any specific facet of the query topic. A principled approach to search disambiguation has been lacking. Many multimedia retrieval systems address the issue of ambiguous queries by results clustering, near-duplicate removal and diversification. The aim is simply to prevent result pages from being cluttered by too many similar Web articles. In this thesis, we take a more principled approach and propose the following two-pronged methodology. First, we propose a novel faceted topic retrieval framework wherein multimedia documents are modeled as comprising facets or topics. The notion of facets can be interpreted broadly to encompass any binary property of a document that represents a fact or a topic contained in the query need. In contrast with traditional term-based retrieval models (e.g. TF-IDF), documents are now ranked by the relevance of their composite facets/topics to the query. The goal is then to return a set of documents that are not only relevant to the query but also cover the many different facets of the information need. We base our faceted retrieval framework on probabilistic topic models, a class of algorithms designed to discover the latent thematic structures in a document collection. Second, we augment our faceted retrieval framework with two modeling capabilities to better process ambiguous queries in multimedia collections such as video and images. Firstly, because different queries may have varying levels of ambiguity, and hence varying levels of polysemy in the returned results, we develop a non-parametric Bayesian method to cluster search results. The main advantage of our clustering method is that the number of mixture components is not fixed a priori, but is determined during the posterior inference process. This allows our model to grow with the level of polysemy (and visual diversity) in the returned multimedia results. Secondly, we extend the basic probabilistic topic model (Latent Dirichlet Allocation, LDA) to jointly model the complementary information in the visual and textual streams. We show how the joint modeling can improve the quality of facet detection in news video, and in turn yield better user satisfaction in faceted topic retrieval. DOCTOR OF PHILOSOPHY (SCE) 2012-10-29T06:55:36Z 2012-10-29T06:55:36Z 2012 2012 Thesis Wan, K. W. (2012). Search disambiguation techniques in multimedia collections. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/50768 10.32657/10356/50768 en 197 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval DRNTU::Engineering::Computer science and engineering::Computing methodologies::Document and text processing Wan, Kong Wah Search disambiguation techniques in multimedia collections
description	In the last few years, we have witnessed an explosive growth in multimedia content. Online repositories such as Flickr (image) and YouTube (video) contain hundreds of millions of images and videos and are still growing by the day. However, the utility of these data sources is only as good as their accessibility. A primary motivation of this thesis is to make better use of this overwhelming wealth of information. Specifically, we look at ways to disambiguate search in a multimedia retrieval system. The need for search disambiguation often arises in practice because the typical issued query is short, imprecise and ambiguous, and it is difficult to infer the exact query intent. Search disambiguation is the general class of techniques for elucidating the multi-faceted information needs arising from an ambiguous query. The objective is to provide an overview of search results grouped by multiple facets, which can better clarify a user's search intention by enabling the user to zoom into any specific facet of the query topic. A principled approach to search disambiguation has been lacking. Many multimedia retrieval systems address the issue of ambiguous queries by results clustering, near-duplicate removal and diversification. The aim is simply to prevent result pages from being cluttered by too many similar Web articles. In this thesis, we take a more principled approach and propose the following two-pronged methodology. First, we propose a novel faceted topic retrieval framework wherein multimedia documents are modeled as comprising facets or topics. The notion of facets can be interpreted broadly to encompass any binary property of a document that represents a fact or a topic contained in the query need. In contrast with traditional term-based retrieval models (e.g. TF-IDF), documents are now ranked by the relevance of their composite facets/topics to the query. The goal is then to return a set of documents that are not only relevant to the query but also cover the many different facets of the information need. We base our faceted retrieval framework on probabilistic topic models, a class of algorithms designed to discover the latent thematic structures in a document collection. Second, we augment our faceted retrieval framework with two modeling capabilities to better process ambiguous queries in multimedia collections such as video and images. Firstly, because different queries may have varying levels of ambiguity, and hence varying levels of polysemy in the returned results, we develop a non-parametric Bayesian method to cluster search results. The main advantage of our clustering method is that the number of mixture components is not fixed a priori, but is determined during the posterior inference process. This allows our model to grow with the level of polysemy (and visual diversity) in the returned multimedia results. Secondly, we extend the basic probabilistic topic model (Latent Dirichlet Allocation, LDA) to jointly model the complementary information in the visual and textual streams. We show how the joint modeling can improve the quality of facet detection in news video, and in turn yield better user satisfaction in faceted topic retrieval.
author2	Tan Ah Hwee
author_facet	Tan Ah Hwee Wan, Kong Wah
format	Theses and Dissertations
author	Wan, Kong Wah
author_sort	Wan, Kong Wah
title	Search disambiguation techniques in multimedia collections
title_short	Search disambiguation techniques in multimedia collections
title_full	Search disambiguation techniques in multimedia collections
title_fullStr	Search disambiguation techniques in multimedia collections
title_full_unstemmed	Search disambiguation techniques in multimedia collections
title_sort	search disambiguation techniques in multimedia collections
publishDate	2012
url	https://hdl.handle.net/10356/50768
_version_	1759855489930231808
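
Illustrative note on the description field: the abstract states that the goal of faceted retrieval is a result set that is both relevant and covers many facets of the information need, with facets treated as binary properties of a document. The Python sketch below shows that goal in its simplest form. It is not the thesis's method, which builds on probabilistic topic models; it swaps in a plain greedy coverage heuristic, and the function name `greedy_facet_cover`, the document ids, the facet labels and the relevance scores are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_facet_cover(doc_facets: Dict[str, Set[str]],
                       relevance: Dict[str, float],
                       k: int) -> List[str]:
    """Pick up to k documents, each time taking the one that adds the most
    not-yet-covered facets; ties go to higher relevance, then to lower id."""
    covered: Set[str] = set()
    remaining = set(doc_facets)
    chosen: List[str] = []
    while remaining and len(chosen) < k:
        best = min(
            remaining,
            key=lambda d: (-len(doc_facets[d] - covered), -relevance[d], d),
        )
        chosen.append(best)
        covered |= doc_facets[best]
        remaining.remove(best)
    return chosen


# Hypothetical results for the ambiguous query "jaguar" (assumed data).
doc_facets = {
    "d1": {"animal"},
    "d2": {"animal"},
    "d3": {"car"},
    "d4": {"car", "racing"},
    "d5": {"operating system"},
}
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.7, "d4": 0.6, "d5": 0.5}

top3 = greedy_facet_cover(doc_facets, relevance, k=3)
covered_facets = set().union(*(doc_facets[d] for d in top3))
```

On this toy data, ranking by relevance alone would return d1, d2 and d3, which cover only two facets, whereas the greedy selection covers all four with three documents. A topic-model-based system would infer the facets from the collection rather than take them as given labels.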

Similar Items