Detecting and recognizing human action in videos
Detecting and recognizing human actions is of great importance to video analytics due to its numerous applications in video surveillance and human computer interaction. Despite much previous work, fast and reliable action detection and recognition in unconstrained videos remain a challenging problem...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: |
2014
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/61739 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-61739 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-617392023-07-04T16:06:16Z Detecting and recognizing human action in videos Yu, Gang Yuan Junsong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Detecting and recognizing human actions is of great importance to video analytics due to its numerous applications in video surveillance and human computer interaction. Despite much previous work, fast and reliable action detection and recognition in unconstrained videos remain a challenging problem. First of all, actions are spatio-temporal patterns characterized by both motion and appearance features. The same type of action may exhibit large variations due to the changes of motion speed, scale, view point, clothing, not to mention partial occlusions. It is thus a challenge to perform robust action matching that is insensitive to such variations, especially if only a limited number of training examples are provided. Moreover, fast action detection and localization is another challenging issue in cluttered and dynamic environment. Compared with image based object detection which only requires spatial localization, action localization is in spatio-temporal video space thus is much more time consuming. This thesis presents a systematic study on detecting and recognizing human actions in cluttered and dynamic environments. The videos are characterized by spatio-temporal local features, and the proposed methods leverage the fast matching of local features to perform action recognition and detection. To capture the intra-class variations of action categories, randomized trees are developed to capture the local feature distribution of the action categories. Such a tree-based indexing enables fast local feature matching, and when limited training examples are available, it can be easily extended to index both labelled and unlabelled data samples and perform semi-supervised learning to improve the detection performance. Even with only one exemplar query action, the randomized tree indexing approach can still achieve promising result to detect similar actions in the big video corpus efficiently. To perform fast spatio-temporal action localization, two different approaches have been proposed: (1) Coarse-to-fine branch-and-bound search and (2) Propagative Hough voting. Both methods can significantly reduce the computational cost of action localization, and do not rely on human detection, tracking, and background subtraction. By addressing the fundamental challenges of action detection and recognition, this thesis also investigated action detection solutions for different application scenarios, such as multi-class action detection, action search with one query example, and online action prediction based on partial video observation. Extensive experiments on benchmarked datasets show that the proposed methods can achieve promising results compared with the state of the arts. DOCTOR OF PHILOSOPHY (EEE) 2014-09-12T02:01:39Z 2014-09-12T02:01:39Z 2014 2014 Thesis Yu, G. (2014). Detecting and recognizing human action in videos. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/61739 10.32657/10356/61739 en 140 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering |
spellingShingle |
DRNTU::Engineering::Electrical and electronic engineering Yu, Gang Detecting and recognizing human action in videos |
description |
Detecting and recognizing human actions is of great importance to video analytics due to its numerous applications in video surveillance and human computer interaction. Despite much previous work, fast and reliable action detection and recognition in unconstrained videos remain a challenging problem. First of all, actions are spatio-temporal patterns characterized by both motion and appearance features. The same type of action may exhibit large variations due to the changes of motion speed, scale, view point, clothing, not to mention partial occlusions. It is thus a challenge to perform robust action matching that is insensitive to such variations, especially if only a limited number of training examples are provided. Moreover, fast action detection and localization is another challenging issue in cluttered and dynamic environment. Compared with image based object detection which only requires spatial localization, action localization is in spatio-temporal video space thus is much more time consuming.
This thesis presents a systematic study on detecting and recognizing human actions in cluttered and dynamic environments. The videos are characterized by spatio-temporal local features, and the proposed methods leverage the fast matching of local features to perform action recognition and detection. To capture the intra-class variations of action categories, randomized trees are developed to capture the local feature distribution of the action categories. Such a tree-based indexing enables fast local feature matching, and when limited training examples are available, it can be easily extended to index both labelled and unlabelled data samples and perform semi-supervised learning to improve the detection performance. Even with only one exemplar query action, the randomized tree indexing approach can still achieve promising result to detect similar actions in the big video corpus efficiently. To perform fast spatio-temporal action localization, two different approaches have been proposed: (1) Coarse-to-fine branch-and-bound search and (2) Propagative Hough voting. Both methods can significantly reduce the computational cost of action localization, and do not rely on human detection, tracking, and background subtraction.
By addressing the fundamental challenges of action detection and recognition, this thesis also investigated action detection solutions for different application scenarios, such as multi-class action detection, action search with one query example, and online action prediction based on partial video observation. Extensive experiments on benchmarked datasets show that the proposed methods can achieve promising results compared with the state of the arts. |
author2 |
Yuan Junsong |
author_facet |
Yuan Junsong Yu, Gang |
format |
Theses and Dissertations |
author |
Yu, Gang |
author_sort |
Yu, Gang |
title |
Detecting and recognizing human action in videos |
title_short |
Detecting and recognizing human action in videos |
title_full |
Detecting and recognizing human action in videos |
title_fullStr |
Detecting and recognizing human action in videos |
title_full_unstemmed |
Detecting and recognizing human action in videos |
title_sort |
detecting and recognizing human action in videos |
publishDate |
2014 |
url |
https://hdl.handle.net/10356/61739 |
_version_ |
1772825816274567168 |