Learning unsupervised video object segmentation through visual attention
This paper conducts a systematic study on the role of visual attention in Unsupervised Video Object Segmentation (UVOS) tasks. By elaborately annotating three popular video segmentation datasets (DAVIS, Youtube-Objects and SegTrack V2) with dynamic eye-tracking data in the UVOS setting, for the first time, we quantitatively verified the high consistency of visual attention behavior among human observers, and found strong correlation between human attention and explicit primary object judgements during dynamic, task-driven viewing. Such novel observations provide an in-depth insight into the underlying rationale behind UVOS. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in spatial domain. Our UVOS solution enjoys three major merits: 1) modular training without using expensive video segmentation annotations, instead, using more affordable dynamic fixation data to train the initial video attention module and using existing fixation-segmentation paired static/image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically-inspired and assessable attention. Experiments on popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance in comparison with state-of-the-arts.
Main Authors: WANG, Wenguan; SONG, Hongmei; ZHAO, Shuyang; SHEN, Jianbing; ZHAO, Sanyuan; HOI, Steven C. H.; LING, Haibin
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2019
Subjects: Segmentation Grouping and Shape; Image and Video Synthesis; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sol_research/3162
https://ink.library.smu.edu.sg/context/sol_research/article/5120/viewcontent/UVOS_cvpr19_av.pdf
Institution: Singapore Management University
id: sg-smu-ink.sol_research-5120
record_format: dspace
spelling: sg-smu-ink.sol_research-5120; 2020-07-02T11:08:19Z; 2019-06-01T07:00:00Z; text; application/pdf; info:doi/10.1109/CVPR.2019.00318; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection Yong Pung How School Of Law; eng
institution: Singapore Management University
building: SMU Libraries
continent: Asia
country: Singapore
content_provider: SMU Libraries
collection: InK@SMU
language: English
topic: Segmentation Grouping and Shape; Image and Video Synthesis; Databases and Information Systems
description: This paper conducts a systematic study on the role of visual attention in Unsupervised Video Object Segmentation (UVOS) tasks. By elaborately annotating three popular video segmentation datasets (DAVIS, Youtube-Objects and SegTrack V2) with dynamic eye-tracking data in the UVOS setting, for the first time, we quantitatively verified the high consistency of visual attention behavior among human observers, and found strong correlation between human attention and explicit primary object judgements during dynamic, task-driven viewing. Such novel observations provide an in-depth insight into the underlying rationale behind UVOS. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in spatial domain. Our UVOS solution enjoys three major merits: 1) modular training without using expensive video segmentation annotations, instead, using more affordable dynamic fixation data to train the initial video attention module and using existing fixation-segmentation paired static/image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically-inspired and assessable attention. Experiments on popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance in comparison with state-of-the-arts.
format: text
author: WANG, Wenguan; SONG, Hongmei; ZHAO, Shuyang; SHEN, Jianbing; ZHAO, Sanyuan; HOI, Steven C. H.; LING, Haibin
author_sort: WANG, Wenguan
title: Learning unsupervised video object segmentation through visual attention
title_sort: learning unsupervised video object segmentation through visual attention
publisher: Institutional Knowledge at Singapore Management University
publishDate: 2019
url: https://ink.library.smu.edu.sg/sol_research/3162
https://ink.library.smu.edu.sg/context/sol_research/article/5120/viewcontent/UVOS_cvpr19_av.pdf
_version_: 1770575313845092352