Video event detection using motion relativity and feature selection

Bibliographic Details
Main Authors: WANG, Feng, SUN, Zhanhu, JIANG, Yu-Gang, NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2014
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6349
https://ink.library.smu.edu.sg/context/sis_research/article/7352/viewcontent/tmm14_fwang.pdf
Institution: Singapore Management University
Description
Summary: Event detection plays an essential role in video content analysis. In this paper, we present an approach to video event detection based on motion relativity and feature selection. First, we propose a new motion feature, the Expanded Relative Motion Histogram of Bag-of-Visual-Words (ERMH-BoW), which exploits motion relativity for event detection. In ERMH-BoW, the "what" aspect of an event is represented with Bag-of-Visual-Words (BoW), and relative motion histograms between different visual words depict the objects' activities, i.e., the "how" aspect of the event. ERMH-BoW thus integrates both the what and how aspects for a complete event description. We show that, by employing motion relativity, ERMH-BoW is invariant to varying camera movement and faithfully describes the object activities in an event. Furthermore, compared with other motion features, ERMH-BoW encodes not only the motion of individual objects but also the interactions between different objects and scenes. Second, to address the high dimensionality of the ERMH-BoW feature, we propose a feature-selection approach based on information gain and informativeness weighting that yields a cleaner and more discriminative feature set. Experiments on several challenging datasets provided by TRECVID for the Multimedia Event Detection (MED) task demonstrate that the proposed approach outperforms state-of-the-art methods for video event detection.
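The core idea behind the relative-motion histogram can be illustrated with a toy sketch. The function below is an assumption-laden simplification, not the paper's actual ERMH-BoW implementation: it takes keypoints that have already been assigned visual-word indices and motion vectors (in practice these would come from local feature quantization and optical flow), and for every ordered pair of keypoints it bins the direction of their *relative* motion into a histogram indexed by the two visual words. Because a global camera translation adds the same offset to every keypoint's motion vector, it cancels in the pairwise differences, which illustrates the invariance property the abstract claims. All names (`relative_motion_histogram`, `n_words`, `n_bins`) are hypothetical.

```python
import numpy as np

def relative_motion_histogram(words, motions, n_words=4, n_bins=8):
    """Toy sketch of the relative-motion idea behind ERMH-BoW.

    words:   (N,) visual-word index of each local keypoint
    motions: (N, 2) motion vector (dx, dy) of each keypoint
    Returns a flattened histogram over (word_i, word_j, direction bin)
    that quantizes the direction of the relative motion between every
    ordered pair of distinct keypoints.
    """
    hist = np.zeros((n_words, n_words, n_bins))
    for a in range(len(words)):
        for b in range(len(words)):
            if a == b:
                continue
            # Relative motion: a constant camera motion added to both
            # keypoints cancels out here.
            rel = motions[a] - motions[b]
            if np.allclose(rel, 0.0):
                continue
            angle = np.arctan2(rel[1], rel[0]) % (2 * np.pi)
            bin_idx = int(angle / (2 * np.pi / n_bins)) % n_bins
            hist[words[a], words[b], bin_idx] += 1
    return hist.ravel()

# Demonstrate camera-motion invariance: shifting every motion vector by
# the same (camera) offset leaves the histogram unchanged.
rng = np.random.default_rng(0)
words = rng.integers(0, 4, size=20)
motions = rng.normal(size=(20, 2))
h_static = relative_motion_histogram(words, motions)
h_panned = relative_motion_histogram(words, motions + np.array([3.0, -1.5]))
assert np.allclose(h_static, h_panned)
```

Note the quadratic pairing over keypoints and the `n_words * n_words * n_bins` output size: this is exactly the high-dimensionality issue the abstract's second contribution (information-gain-based feature selection) is meant to address.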