Interactive hierarchical object proposals

Object proposal algorithms have been demonstrated to be very successful in accelerating object detection process. High object localization quality and detection recall can be obtained using thousands of proposals. However, the performance with a small number of proposals is still unsatisfactory. Thi...

Full description

Saved in:
Bibliographic Details
Main Authors: CHEN, Mingliang, ZHANG, Jiawei, HE, Shengfeng, YANG, Qingxiong, LI, Qing, YANG, Ming-Hsuan
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2019
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/7864
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Object proposal algorithms have been demonstrated to be very successful in accelerating object detection process. High object localization quality and detection recall can be obtained using thousands of proposals. However, the performance with a small number of proposals is still unsatisfactory. This paper demonstrates that the performance of a few proposals can be significantly improved with the minimal human interaction-a single touch point. To this end, we first generate hierarchical superpixels using an efficient tree-organized structure as our initial object proposals, and then select only a few proposals from them by learning an effective Convolutional neural network for objectness ranking. We explore and design an architecture to integrate human interaction with the global information of the whole image for objectness scoring, which is able to significantly improve the performance with a minimum number of object proposals. Extensive experiments show the proposed method outperforms all the state-of-the-art methods for locating the meaningful object with the touch point constraint. Furthermore, the proposed method is extended for video. By combining with the novel interactive motion segmentation cue for generating hierarchical superpixels, the performance on a single proposal is satisfactory and can be used in the interactive vision systems, such as selecting the input of a real-time tracking system.