Visual saliency detection in image and video data

The study of psychology and cognitive science has shown that the human perception is selective. When seeing images or watching videos, we mainly focus on a salient sub-region of an image, e.g. the salient object, and follow this salient object in image sequences. Efficient saliency analysis and accu...

Full description

Saved in:
Bibliographic Details
Main Author: Luo, Ye
Other Authors: Xue Ping
Format: Theses and Dissertations
Language:English
Published: 2014
Subjects:
Online Access:https://hdl.handle.net/10356/61800
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:The study of psychology and cognitive science has shown that the human perception is selective. When seeing images or watching videos, we mainly focus on a salient sub-region of an image, e.g. the salient object, and follow this salient object in image sequences. Efficient saliency analysis and accurate salient object detection are fundamental problems to various computer vision applications, i.e. image\slash video retargeting, object recognition and etc. In this thesis, a systematic study is performed for visual saliency analysis in images and videos. First of all, in order to identify the important visual contents in videos, we propose a novel video saliency estimation method by fusing spatio-temporally selected sparse features. Since the proposed method can accurately represent each frame by the learned dictionary and consistently incorporate the temporal information, our method outperforms three state-of-the-art methods and achieves good performance on two public datasets. Although different saliency models have been proposed for images/videos, due to the cluttered background, it is not easy to accurately locate the salient object and crop it out from a noisy saliency map. We further propose a novel saliency density maximization method to detect salient objects in saliency maps. Without a prior knowledge of the salient object, our method can adapt to different sizes and shapes of the objects, and is less sensitive to the cluttered background. Moreover, considering that the temporal coherence of salient objects in consecutive frames is usually strong, we extend salient object detection from images to videos by formulating salient object detection as a salient path discovery problem. A global optimal solution can be obtained by the proposed dynamic programming algorithm. The comparisons with two state-of-the-art object detection methods and one tracking method further demonstrate the efficiency of the proposed method on salient object detection in videos. At last, similar to salient object in an image, many videos contain the thematic objects (e.g. the bride and the groom in a wedding ceremony video), which appear frequently in a video scene and thus retain our impression after watching it. We propose a sub-graph mining method which incorporates the spatio-temporal video context to find the thematic objects and label the salient positions that thematic objects appear. Experimental results on two public datasets and one self-collected eye-tracking dataset show the efficacy of the proposed method on thematic object detection.