Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection

Fully supervised object detection has achieved great success in recent years. However, abundant bounding-box annotations are needed to train a detector for novel classes. To reduce the human labeling effort, we propose a novel webly supervised object detection (WebSOD) method for novel classes...

Bibliographic Details
Main Authors:	Wu, Zhonghua; Tao, Qingyi; Lin, Guosheng; Cai, Jianfei
Other Authors:	School of Computer Science and Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2020
Subjects:	Engineering::Computer science and engineering; Object Detection; Detectors
Online Access:	https://hdl.handle.net/10356/144343
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-144343
record_format	dspace
spelling	sg-ntu-dr.10356-1443432020-10-29T05:20:10Z Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection Wu, Zhonghua Tao, Qingyi Lin, Guosheng Cai, Jianfei School of Computer Science and Engineering IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020 Engineering::Computer science and engineering Object Detection Detectors Fully supervised object detection has achieved great success in recent years. However, abundant bounding boxes annotations are needed for training a detector for novel classes. To reduce the human labeling effort, we propose a novel webly supervised object detection (WebSOD) method for novel classes which only requires the web images without further annotations. Our proposed method combines bottom-up and top-down cues for novel class detection. Within our approach, we introduce a bottom-up mechanism based on the well-trained fully supervised object detector (i.e. Faster RCNN) as an object region estimator for web images by recognizing the common objectiveness shared by base and novel classes. With the estimated regions on the web images, we then utilize the top-down attention cues as the guidance for region classification. Furthermore, we propose a residual feature refinement (RFR) block to tackle the domain mismatch between web domain and the target domain. We demonstrate our proposed method on PASCAL VOC dataset with three different novel/base splits. Without any target-domain novel-class images and annotations, our proposed webly supervised object detection model is able to achieve promising performance for novel classes. Moreover, we also conduct transfer learning experiments on large scale ILSVRC 2013 detection dataset and achieve state-of-the-art performance. AI Singapore National Research Foundation (NRF) Accepted version This research was mainly carried out at the Rapid-Rich Object Search (ROSE) Lab at the Nanyang Technological University, Singapore. 
The ROSE Lab is supported by the National Research Foundation, Singapore, and the Infocomm Media Development Authority, Singapore. This research is also partially supported by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-RP-2018-003), the MOE Tier-1 research grants: RG28/18 (S) and RG22/19 (S) and the Monash FIT Start-up Grant. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore. 2020-10-29T05:20:10Z 2020-10-29T05:20:10Z 2020 Conference Paper Wu, Z., Tao, Q., Lin, G., & Cai, J. (2020). Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020. doi:10.1109/CVPR42600.2020.01295 https://hdl.handle.net/10356/144343 10.1109/CVPR42600.2020.01295 en AISG-RP-2018-003 © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/CVPR42600.2020.01295 application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering Object Detection Detectors
spellingShingle	Engineering::Computer science and engineering Object Detection Detectors Wu, Zhonghua Tao, Qingyi Lin, Guosheng Cai, Jianfei Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
description	Fully supervised object detection has achieved great success in recent years. However, abundant bounding-box annotations are needed to train a detector for novel classes. To reduce the human labeling effort, we propose a novel webly supervised object detection (WebSOD) method for novel classes that requires only web images, without further annotations. Our proposed method combines bottom-up and top-down cues for novel-class detection. Within our approach, we introduce a bottom-up mechanism based on a well-trained fully supervised object detector (i.e., Faster RCNN) as an object region estimator for web images, recognizing the common objectness shared by base and novel classes. With the estimated regions on the web images, we then use top-down attention cues as guidance for region classification. Furthermore, we propose a residual feature refinement (RFR) block to tackle the domain mismatch between the web domain and the target domain. We demonstrate our proposed method on the PASCAL VOC dataset with three different novel/base splits. Without any target-domain novel-class images or annotations, our proposed webly supervised object detection model achieves promising performance on novel classes. Moreover, we conduct transfer learning experiments on the large-scale ILSVRC 2013 detection dataset and achieve state-of-the-art performance.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Wu, Zhonghua Tao, Qingyi Lin, Guosheng Cai, Jianfei
format	Conference or Workshop Item
author	Wu, Zhonghua Tao, Qingyi Lin, Guosheng Cai, Jianfei
author_sort	Wu, Zhonghua
title	Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
title_short	Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
title_full	Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
title_fullStr	Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
title_full_unstemmed	Exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
title_sort	exploring bottom-up and top-down cues with attentive learning for webly supervised object detection
publishDate	2020
url	https://hdl.handle.net/10356/144343
_version_	1683492972055756800

Similar Items