Paying attention to video object pattern understanding

This paper conducts a systematic study on the role of visual attention in video object pattern understanding. By elaborately annotating three popular video segmentation datasets (DAVIS) with dynamic eye-tracking data in the unsupervised video object segmentation (UVOS) setting, for the first time, w...

Bibliographic Details
Main Authors:	WANG, Wenguan; SHEN, Jianbing; LU, Xiankai; HOI, Steven C. H.; LING, Haibin
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2021
Subjects:	Video object pattern understanding; unsupervised video object segmentation; top-down visual attention; video salient object detection; Databases and Information Systems; Numerical Analysis and Scientific Computing
Online Access:	https://ink.library.smu.edu.sg/sis_research/6960 https://ink.library.smu.edu.sg/context/sis_research/article/7963/viewcontent/08957473_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-7963
record_format	dspace
spelling	sg-smu-ink.sis_research-7963 2022-03-04T05:57:53Z Paying attention to video object pattern understanding WANG, Wenguan SHEN, Jianbing LU, Xiankai HOI, Steven C. H. LING, Haibin This paper conducts a systematic study on the role of visual attention in video object pattern understanding. By elaborately annotating three popular video segmentation datasets (DAVIS) with dynamic eye-tracking data in the unsupervised video object segmentation (UVOS) setting, we quantitatively verified, for the first time, the high consistency of visual attention behavior among human observers, and found a strong correlation between human attention and explicit primary object judgments during dynamic, task-driven viewing. Such novel observations provide in-depth insight into the underlying rationale behind video object patterns. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in the spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in the spatial domain. Our UVOS solution enjoys three major advantages: 1) modular training without expensive video segmentation annotations, using instead more affordable dynamic fixation data to train the initial video attention module and existing fixation-segmentation paired static/image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically-inspired and assessable attention. Experiments on four popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance compared with state-of-the-art methods and enjoys fast processing speed (10 fps on a single GPU). Our collected eye-tracking data and algorithm implementations have been made publicly available at https://github.com/wenguanwang/AGS.
2021-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6960 info:doi/10.1109/TPAMI.2020.2966453 https://ink.library.smu.edu.sg/context/sis_research/article/7963/viewcontent/08957473_av.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Video object pattern understanding unsupervised video object segmentation top-down visual attention video salient object detection Databases and Information Systems Numerical Analysis and Scientific Computing
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Video object pattern understanding; unsupervised video object segmentation; top-down visual attention; video salient object detection; Databases and Information Systems; Numerical Analysis and Scientific Computing
spellingShingle	Video object pattern understanding unsupervised video object segmentation top-down visual attention video salient object detection Databases and Information Systems Numerical Analysis and Scientific Computing WANG, Wenguan SHEN, Jianbing LU, Xiankai HOI, Steven C. H. LING, Haibin Paying attention to video object pattern understanding
description	This paper conducts a systematic study on the role of visual attention in video object pattern understanding. By elaborately annotating three popular video segmentation datasets (DAVIS) with dynamic eye-tracking data in the unsupervised video object segmentation (UVOS) setting, we quantitatively verified, for the first time, the high consistency of visual attention behavior among human observers, and found a strong correlation between human attention and explicit primary object judgments during dynamic, task-driven viewing. Such novel observations provide in-depth insight into the underlying rationale behind video object patterns. Inspired by these findings, we decouple UVOS into two sub-tasks: UVOS-driven Dynamic Visual Attention Prediction (DVAP) in the spatiotemporal domain, and Attention-Guided Object Segmentation (AGOS) in the spatial domain. Our UVOS solution enjoys three major advantages: 1) modular training without expensive video segmentation annotations, using instead more affordable dynamic fixation data to train the initial video attention module and existing fixation-segmentation paired static/image data to train the subsequent segmentation module; 2) comprehensive foreground understanding through multi-source learning; and 3) additional interpretability from the biologically-inspired and assessable attention. Experiments on four popular benchmarks show that, even without using expensive video object mask annotations, our model achieves compelling performance compared with state-of-the-art methods and enjoys fast processing speed (10 fps on a single GPU). Our collected eye-tracking data and algorithm implementations have been made publicly available at https://github.com/wenguanwang/AGS.
format	text
author	WANG, Wenguan; SHEN, Jianbing; LU, Xiankai; HOI, Steven C. H.; LING, Haibin
author_facet	WANG, Wenguan; SHEN, Jianbing; LU, Xiankai; HOI, Steven C. H.; LING, Haibin
author_sort	WANG, Wenguan
title	Paying attention to video object pattern understanding
title_short	Paying attention to video object pattern understanding
title_full	Paying attention to video object pattern understanding
title_fullStr	Paying attention to video object pattern understanding
title_full_unstemmed	Paying attention to video object pattern understanding
title_sort	paying attention to video object pattern understanding
publisher	Institutional Knowledge at Singapore Management University
publishDate	2021
url	https://ink.library.smu.edu.sg/sis_research/6960 https://ink.library.smu.edu.sg/context/sis_research/article/7963/viewcontent/08957473_av.pdf
_version_	1770576166436995072
