Backtracking spatial pyramid pooling-based image classifier for weakly supervised top–down salient object detection

Bibliographic Details
Main Authors: Cholakkal, Hisham, Johnson, Jubin, Rajan, Deepu
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/142295
Institution: Nanyang Technological University
Description
Summary: Top-down (TD) saliency models produce a probability map that peaks at target locations specified by a task or goal, such as object detection. They are usually trained in a fully supervised (FS) setting that requires pixel-level annotations of objects. We propose a weakly supervised TD saliency framework that uses only binary labels indicating the presence or absence of an object in an image. First, the probabilistic contribution of each image region to the confidence of a convolutional neural network-based image classifier is computed through a backtracking strategy to produce TD saliency. Next, from a set of saliency maps of the image produced by fast bottom-up (BU) saliency approaches, we select the map best suited to the TD task and combine it with the TD saliency map. Features with high combined saliency are used to train a linear SVM classifier that estimates feature saliency. This estimate is integrated with the combined saliency and further refined through multi-scale superpixel averaging of the saliency map. The proposed weakly supervised TD saliency achieves performance comparable to FS approaches. Experiments are carried out on seven challenging datasets, and quantitative results are compared with 40 closely related approaches across four different applications.