Event detection with zero example: Select the right and suppress the wrong concepts

Complex video event detection without visual examples is a very challenging issue in multimedia retrieval. We present a state-of-the-art framework for event search without any need of exemplar videos and textual metadata in search corpus. To perform event search given only query words, the core of o...

Full description

Saved in:
Bibliographic Details
Main Authors: LU, Yi-Jie, ZHANG, Hao, DE BOER, Maaike, NGO, Chong-wah
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2016
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6440
https://ink.library.smu.edu.sg/context/sis_research/article/7443/viewcontent/2911996.2912015.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
Description
Summary:Complex video event detection without visual examples is a very challenging issue in multimedia retrieval. We present a state-of-the-art framework for event search without any need of exemplar videos and textual metadata in search corpus. To perform event search given only query words, the core of our framework is a large, pre-built bank of concept detectors which can understand the content of a video in the perspective of object, scene, action and activity concepts. Leveraging such knowledge can effectively narrow the semantic gap between textual query and the visual content of videos. Besides the large concept bank, this paper focuses on two challenges that largely affect the retrieval performance when the size of the concept bank increases: (1) How to choose the right concepts in the concept bank to accurately represent the query; (2) if noisy concepts are inevitably chosen, how to minimize their influence. We share our novel insights on these particular problems, which paves the way for a practical system that achieves the best performance in NIST TRECVID 2015.