Abnormal event detection in crowded scenes using sparse representation

We propose to detect abnormal events via a sparse reconstruction over the normal bases. Given a collection of normal training examples, e.g., an image sequence or a collection of local spatio-temporal patches, we propose the sparse reconstruction cost (SRC) over the normal dictionary to measure the...

Full description

Saved in:
Bibliographic Details
Main Authors: Cong, Yang, Yuan, Junsong, Liu, Ji
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language:English
Published: 2013
Subjects:
Online Access:https://hdl.handle.net/10356/107417
http://hdl.handle.net/10220/17702
http://dx.doi.org/10.1016/j.patcog.2012.11.021
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:We propose to detect abnormal events via a sparse reconstruction over the normal bases. Given a collection of normal training examples, e.g., an image sequence or a collection of local spatio-temporal patches, we propose the sparse reconstruction cost (SRC) over the normal dictionary to measure the normalness of the testing sample. By introducing the prior weight of each basis during sparse reconstruction, the proposed SRC is more robust compared to other outlier detection criteria. To condense the over-completed normal bases into a compact dictionary, a novel dictionary selection method with group sparsity constraint is designed, which can be solved by standard convex optimization. Observing that the group sparsity also implies a low rank structure, we reformulate the problem using matrix decomposition, which can handle large scale training samples by reducing the memory requirement at each iteration from O(k2) to O(k) where k is the number of samples. We use the columnwise coordinate descent to solve the matrix decomposition represented formulation, which empirically leads to a similar solution to the group sparsity formulation. By designing different types of spatio-temporal basis, our method can detect both local and global abnormal events. Meanwhile, as it does not rely on object detection and tracking, it can be applied to crowded video scenes. By updating the dictionary incrementally, our method can be easily extended to online event detection. Experiments on three benchmark datasets and the comparison to the state-of-the-art methods validate the advantages of our method.