Context-aware mobile image recognition and annotation

Bibliographic Details
Main Author: Li, Zhen
Other Authors: Yap Kim Hui
School: School of Electrical and Electronic Engineering
Degree: Doctor of Philosophy (EEE)
Format: Theses and Dissertations
Extent: 174 p.
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering::Information systems
Online Access: https://hdl.handle.net/10356/55100
DOI: 10.32657/10356/55100
Citation: Li, Z. (2013). Context-aware mobile image recognition and annotation. Doctoral thesis, Nanyang Technological University, Singapore.
Institution: Nanyang Technological University

Description:
The growing usage of mobile camera phones has led to the proliferation of mobile applications such as mobile city guides, mobile shopping, personalized mobile services, and personal album management. Mobile visual systems have been developed that analyze images taken by mobile devices to enable these applications. Two of these applications are particularly important: 1) mobile image recognition, which provides relevant information about scene/landmark images, and 2) mobile image annotation, which uses camera phones to capture images and annotate them. Mobile image recognition and annotation are closely related, and both build on mobile visual analysis.

To enhance the performance of a mobile visual system, it is natural to incorporate mobile domain-specific context information into conventional visual content analysis. The context information in this work includes location and direction information from mobile devices, mobile user interaction, and so on. However, context information is underutilized in most existing mobile visual systems: they mainly use the location provided by GPS (Global Positioning System) to shortlist candidate images near the query image's location, and then carry out content analysis within the shortlisted candidates to obtain the final recognition/annotation results. This is insufficient because (i) GPS is unreliable in densely built-up areas, where its errors can be large, and (ii) other context information, such as direction (recorded by the digital compass on the mobile device), is not used to further improve recognition.

For mobile image recognition, we propose several approaches based on content analysis, with possible incorporation of context information:
1) A new approach for scene image recognition that combines generative and discriminative models. A new image signature is proposed based on the Gaussian Mixture Model (GMM), and its soft relevance value is incorporated into the training of a Fuzzy Support Vector Machine (FSVM). The proposed GMM-FSVM approach is shown to outperform state-of-the-art Bag-of-Words (BoW) methods (a simplified sketch of this pipeline follows the list).
2) A new landmark image recognition method that incorporates image saliency information into the state-of-the-art Scalable Vocabulary Tree (SVT) approach. Because the saliency information emphasizes the foreground landmark object and suppresses the cluttered background, the proposed Saliency-Aware Vocabulary Tree (SAVT) algorithm improves recognition performance over the baseline SVT approach.
3) A real-valued multi-class AdaBoost algorithm with an exponential loss function (RMAE), which integrates visual content with two types of mobile context: location and direction. RMAE generates SVTs from content analysis and context analysis respectively, constructs weak classifiers on top of them, and then builds the final strong classifier from these weak classifiers, so that it carries both content and context information.
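
By way of illustration, the sketch below shows how a GMM-based image signature might be computed and fed to a classifier. It is a minimal sketch under stated assumptions, not the thesis implementation: scikit-learn's GaussianMixture and a plain linear SVC stand in for the proposed signature and the FSVM, the descriptors are random toy data, and the gmm_signature helper is hypothetical.

```python
# A minimal sketch, NOT the thesis implementation: scikit-learn's
# GaussianMixture and a plain linear SVC stand in for the proposed GMM
# signature and the Fuzzy SVM (FSVM) with soft relevance values.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def gmm_signature(descriptors, gmm):
    """Aggregate one image's local descriptors into a fixed-length vector.

    Computes posterior-weighted mean deviations per GMM component, a
    simplified Fisher-vector-style encoding (hypothetical helper).
    """
    post = gmm.predict_proba(descriptors)            # (n, K) soft assignments
    diffs = descriptors[:, None, :] - gmm.means_     # (n, K, d) deviations
    weighted = (post[:, :, None] * diffs).sum(axis=0)
    sig = weighted / (post.sum(axis=0)[:, None] + 1e-8)
    return sig.ravel()                               # (K * d,) signature

# Toy stand-in data: random "descriptors"; a real system would use local
# features (e.g. SIFT) extracted from each image.
rng = np.random.default_rng(0)
train_desc = [rng.normal(size=(100, 16)) + label for label in (0, 0, 1, 1)]
labels = np.array([0, 0, 1, 1])

# Universal GMM over descriptors pooled from all training images.
gmm = GaussianMixture(n_components=4, random_state=0)
gmm.fit(np.vstack(train_desc))

X = np.array([gmm_signature(d, gmm) for d in train_desc])
clf = SVC(kernel="linear").fit(X, labels)            # stand-in for FSVM
print(clf.predict(X))
```

The thesis additionally feeds each signature's soft relevance value into FSVM training, which a standard SVC cannot express; the sketch only shows the encode-then-classify structure.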
For mobile image annotation, we developed a system prototype and proposed several approaches utilizing content analysis, context analysis, and their integration. To study the effectiveness of context-based image annotation, a new algorithm is proposed that models tag distributions over the GPS locations of mobile images. Specifically, the tag distributions are obtained using an enhanced GMM; based on these distributions, a query image can be associated with tags according to its location, achieving context-based image annotation (a sketch of this location-to-tag scoring appears at the end of this description).

As part of the contributions, we also constructed two mobile image databases: i) the Singapore Landmark-40 dataset for recognition, and ii) the NTU Scene-25 dataset for annotation. The Singapore Landmark-40 dataset consists of 12,338 training images and 1,200 testing images covering 40 famous landmarks in Singapore. The NTU Scene-25 dataset consists of 3,916 images in 25 categories of geotagged scenes/landmarks/activities from the NTU campus, and includes context information such as GPS location and direction. Comprehensive experiments have been carried out on a number of mobile image datasets, and the results show that the proposed mobile image recognition and annotation methods outperform state-of-the-art methods and show good potential for mobile image sharing based on recognition and annotation.
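
As a rough illustration of the context-based annotation idea, the sketch below fits one location model per tag over GPS coordinates and ranks tags for a query by location likelihood. This is a minimal sketch assuming a standard GaussianMixture in place of the thesis's enhanced GMM; the tags, coordinates, and the rank_tags helper are all hypothetical.

```python
# A minimal sketch of context-based annotation, assuming a standard
# GaussianMixture in place of the thesis's enhanced GMM. The tags, the
# GPS coordinates, and the rank_tags helper are all hypothetical.
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical geotagged training data: tag -> (n, 2) array of (lat, lon).
rng = np.random.default_rng(1)
train = {
    "library": rng.normal([1.3483, 103.6831], 0.0005, size=(50, 2)),
    "stadium": rng.normal([1.3510, 103.6870], 0.0005, size=(50, 2)),
}

# Fit one location model per tag over the locations of images bearing it.
models = {tag: GaussianMixture(n_components=1).fit(locs)
          for tag, locs in train.items()}

def rank_tags(query_latlon):
    """Rank tags by the log-likelihood of the query's GPS location."""
    q = np.asarray(query_latlon).reshape(1, 2)
    scores = {tag: model.score(q) for tag, model in models.items()}
    return sorted(scores, key=scores.get, reverse=True)

# A query shot near the "library" model's centre should rank it first.
print(rank_tags([1.3484, 103.6832]))
```

In the full system, this purely location-based ranking would be integrated with content analysis rather than used on its own.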