Visual understanding and personalization for an optimal recollection experience

The affordability of wearable cameras such as the Narrative Clip and GoPro allows mass-market consumers to continuously record their lives, producing large amounts of unstructured visual data. Moreover, users tend to record with their smartphones more multimedia content than they can possibly share...

Bibliographic Details
Main Author:	Ana, Garcia del Molino
Other Authors:	Tan Ah Hwee
Format:	Theses and Dissertations
Language:	English
Published:	2019
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Online Access:	https://hdl.handle.net/10356/82932 http://hdl.handle.net/10220/50437
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-82932
record_format	dspace
spelling	sg-ntu-dr.10356-829322020-10-28T08:40:50Z Visual understanding and personalization for an optimal recollection experience Ana, Garcia del Molino Tan Ah Hwee School of Computer Science and Engineering A*STAR Centre for Computational Intelligence Lim Joo Hwee Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Pattern recognition The affordability of wearable cameras such as the Narrative Clip and GoPro allows mass-market consumers to continuously record their lives, producing large amounts of unstructured visual data. Moreover, users tend to record with their smartphones more multimedia content than they can possibly share or review. We use each of these devices for different purposes: action cameras for travels and adventures; our smartphones to capture on the spur of the moment; a lifelogging device to record unobtrusively all our daily life activities. As a result, the few important shots end up buried among many repetitive images or uninteresting long segments, requiring hours of manual analysis in order to, say, select highlights in a day or find the most aesthetic pictures. Tackling challenges in end-to-end consumer video summarization, this thesis contributes to the state of the art in three major aspects: (i) Contextual Event Segmentation, an episodic event segmentation method that is able to detect boundaries between heterogeneous events and ignore local occlusions and brief diversions. CES improves the performance of the baselines by over 16% in F-measure, and is competitive with manual annotations. (ii) Personalized Highlight Detection, a highlight detector that is personalized via its inputs. The experimental results show that using the user history substantially improves the prediction accuracy. PHD outperforms the user-agnostic baselines even with only one single person-specific example. 
(iii) Active Video Summarization, an interactive approach to video exploration that gathers the user’s preferences while creating a video summary. AVS achieves an excellent compromise between usability and quality. The diverse and uniform nature of AVS summaries also makes it a valuable tool for browsing someone else’s visual collection. Additionally, this thesis contributes two large-scale datasets for First Person View video analysis, CSumm and R3, and a large-scale dataset for personalized video highlights, PHD2. Doctor of Philosophy 2019-11-19T00:52:58Z 2019-12-06T15:08:33Z 2019-11-19T00:52:58Z 2019-12-06T15:08:33Z 2019 Thesis Ana, G. d. M. (2019). Visual understanding and personalization for an optimal recollection experience. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/82932 http://hdl.handle.net/10220/50437 10.32657/10356/82932 en 133 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
spellingShingle	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Pattern recognition Ana, Garcia del Molino Visual understanding and personalization for an optimal recollection experience
description	The affordability of wearable cameras such as the Narrative Clip and GoPro allows mass-market consumers to continuously record their lives, producing large amounts of unstructured visual data. Moreover, users tend to record with their smartphones more multimedia content than they can possibly share or review. We use each of these devices for different purposes: action cameras for travels and adventures; our smartphones to capture on the spur of the moment; a lifelogging device to record unobtrusively all our daily life activities. As a result, the few important shots end up buried among many repetitive images or uninteresting long segments, requiring hours of manual analysis in order to, say, select highlights in a day or find the most aesthetic pictures. Tackling challenges in end-to-end consumer video summarization, this thesis contributes to the state of the art in three major aspects: (i) Contextual Event Segmentation, an episodic event segmentation method that is able to detect boundaries between heterogeneous events and ignore local occlusions and brief diversions. CES improves the performance of the baselines by over 16% in F-measure, and is competitive with manual annotations. (ii) Personalized Highlight Detection, a highlight detector that is personalized via its inputs. The experimental results show that using the user history substantially improves the prediction accuracy. PHD outperforms the user-agnostic baselines even with only one single person-specific example. (iii) Active Video Summarization, an interactive approach to video exploration that gathers the user’s preferences while creating a video summary. AVS achieves an excellent compromise between usability and quality. The diverse and uniform nature of AVS summaries also makes it a valuable tool for browsing someone else’s visual collection.
Additionally, this thesis contributes two large-scale datasets for First Person View video analysis, CSumm and R3, and a large-scale dataset for personalized video highlights, PHD2.
author2	Tan Ah Hwee
author_facet	Tan Ah Hwee Ana, Garcia del Molino
format	Theses and Dissertations
author	Ana, Garcia del Molino
author_sort	Ana, Garcia del Molino
title	Visual understanding and personalization for an optimal recollection experience
title_short	Visual understanding and personalization for an optimal recollection experience
title_full	Visual understanding and personalization for an optimal recollection experience
title_fullStr	Visual understanding and personalization for an optimal recollection experience
title_full_unstemmed	Visual understanding and personalization for an optimal recollection experience
title_sort	visual understanding and personalization for an optimal recollection experience
publishDate	2019
url	https://hdl.handle.net/10356/82932 http://hdl.handle.net/10220/50437
_version_	1683493528032772096

Similar Items