EPIC-KITCHENS VISOR benchmark: Video segmentations and object relations

Bibliographic Details
Main Authors:	DAR KHALIL, Ahmad AK, SHAN, Dandan, ZHU, Bin, MA, Jian, KAR, Amlan, HIGGINS, Richard, FOUHEY, David, FIDLER, Sanja, DAMEN, Dima
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/9013 https://ink.library.smu.edu.sg/context/sis_research/article/10016/viewcontent/98_epic_kitchens_visor_benchmark_.pdf
Institution:	Singapore Management University

Description
Summary:	We introduce VISOR, a new dataset of pixel annotations and a benchmark suite for segmenting hands and active objects in egocentric video. VISOR annotates videos from EPIC-KITCHENS, which comes with a new set of challenges not encountered in current video segmentation datasets. Specifically, we need to ensure both short- and long-term consistency of pixel-level annotations as objects undergo transformative interactions, e.g. an onion is peeled, diced and cooked - where we aim to obtain accurate pixel-level annotations of the peel, onion pieces, chopping board, knife, pan, as well as the acting hands. VISOR introduces an annotation pipeline, AI-powered in parts, for scalability and quality. In total, we publicly release 272K manual semantic masks of 257 object classes, 9.9M interpolated dense masks, 67K hand-object relations, covering 36 hours of 179 untrimmed videos. Along with the annotations, we introduce three challenges in video object segmentation, interaction understanding and long-term reasoning. For data, code and leaderboards: http://epic-kitchens.github.io/VISOR

Similar Items