Detecting and recognizing human action in videos

Detecting and recognizing human actions is of great importance to video analytics because of its numerous applications in video surveillance and human-computer interaction. Despite much previous work, fast and reliable action detection and recognition in unconstrained videos remains a challenging problem...

Bibliographic Details
Main Author: Yu, Gang
Other Authors: Yuan Junsong
Format: Theses and Dissertations
Language: English
Published: 2014
Subjects: DRNTU::Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/61739
Institution: Nanyang Technological University
id sg-ntu-dr.10356-61739
record_format dspace
spelling sg-ntu-dr.10356-61739 2023-07-04T16:06:16Z Detecting and recognizing human action in videos Yu, Gang Yuan Junsong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering DOCTOR OF PHILOSOPHY (EEE) 2014-09-12T02:01:39Z 2014-09-12T02:01:39Z 2014 2014 Thesis Yu, G. (2014). Detecting and recognizing human action in videos. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/61739 10.32657/10356/61739 en 140 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering
spellingShingle DRNTU::Engineering::Electrical and electronic engineering
Yu, Gang
Detecting and recognizing human action in videos
description Detecting and recognizing human actions is of great importance to video analytics because of its numerous applications in video surveillance and human-computer interaction. Despite much previous work, fast and reliable action detection and recognition in unconstrained videos remains a challenging problem. First, actions are spatio-temporal patterns characterized by both motion and appearance features. The same type of action may exhibit large variations due to changes in motion speed, scale, viewpoint, and clothing, not to mention partial occlusions. It is thus a challenge to perform robust action matching that is insensitive to such variations, especially if only a limited number of training examples are provided. Moreover, fast action detection and localization is another challenging issue in cluttered and dynamic environments. Compared with image-based object detection, which requires only spatial localization, action localization operates in the spatio-temporal video space and is thus much more time-consuming. This thesis presents a systematic study on detecting and recognizing human actions in cluttered and dynamic environments. The videos are characterized by spatio-temporal local features, and the proposed methods leverage the fast matching of local features to perform action recognition and detection. To capture the intra-class variations of action categories, randomized trees are developed to model the local feature distribution of the action categories. Such tree-based indexing enables fast local feature matching, and when limited training examples are available, it can be easily extended to index both labelled and unlabelled data samples and perform semi-supervised learning to improve the detection performance. Even with only one exemplar query action, the randomized tree indexing approach can still efficiently detect similar actions in a large video corpus with promising results. To perform fast spatio-temporal action localization, two different approaches are proposed: (1) coarse-to-fine branch-and-bound search and (2) propagative Hough voting. Both methods can significantly reduce the computational cost of action localization, and neither relies on human detection, tracking, or background subtraction. In addressing the fundamental challenges of action detection and recognition, this thesis also investigates action detection solutions for different application scenarios, such as multi-class action detection, action search with one query example, and online action prediction based on partial video observation. Extensive experiments on benchmark datasets show that the proposed methods can achieve promising results compared with the state of the art.
author2 Yuan Junsong
author_facet Yuan Junsong
Yu, Gang
format Theses and Dissertations
author Yu, Gang
author_sort Yu, Gang
title Detecting and recognizing human action in videos
title_short Detecting and recognizing human action in videos
title_full Detecting and recognizing human action in videos
title_fullStr Detecting and recognizing human action in videos
title_full_unstemmed Detecting and recognizing human action in videos
title_sort detecting and recognizing human action in videos
publishDate 2014
url https://hdl.handle.net/10356/61739
_version_ 1772825816274567168