Action classification by exploring directional co-occurrence of weighted STIPs

Bibliographic Details
Main Authors: LIU, Mengyuan, LIU, Hong, SUN, Qianru
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2014
Online Access: https://ink.library.smu.edu.sg/sis_research/4463
https://ink.library.smu.edu.sg/context/sis_research/article/5466/viewcontent/ICIP2014_liumengyuan.pdf
Institution: Singapore Management University
Description
Summary: Human action recognition is challenging mainly due to intra-class variety, inter-class ambiguity and cluttered backgrounds in real videos. The bag-of-visual-words model utilizes spatio-temporal interest points (STIPs) and represents an action by the distribution of points, which ignores the visual context among points. To add more contextual information, we propose a method that encodes the spatio-temporal distribution of weighted pairwise points. First, STIPs are extracted from an action sequence and clustered into visual words. Then, each word is weighted in both the temporal and spatial domains to capture its relationships with other words. Finally, the directional relationships between co-occurring pairwise words are used to encode visual contexts. We report state-of-the-art results on the Rochester and UT-Interaction datasets to validate that our method can classify human actions with high accuracy.
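
The summary describes a three-step pipeline: cluster STIPs into visual words, weight each point by its spatio-temporal context, and accumulate directional co-occurrences of word pairs. The Python sketch below is one plausible reading of that pipeline, not the authors' exact formulation: the Gaussian proximity weighting, the eight-bin direction quantization, and all names (encode_directional_cooccurrence, n_words, sigma) are illustrative assumptions.

import numpy as np
from sklearn.cluster import KMeans

def encode_directional_cooccurrence(points, descriptors,
                                    n_words=50, sigma=10.0, n_dirs=8):
    """Encode a clip as directional co-occurrences of weighted visual words.

    points:      (N, 3) array of STIP coordinates (x, y, t)
    descriptors: (N, D) array of local descriptors (e.g. HOG/HOF)
    """
    # Step 1: cluster STIP descriptors into visual words.
    labels = KMeans(n_clusters=n_words, n_init=10).fit_predict(descriptors)

    # Step 2: weight each point by its spatio-temporal proximity to the
    # others (an assumed Gaussian kernel stands in for the paper's
    # temporal/spatial weighting scheme).
    diff = points[:, None, :] - points[None, :, :]     # (N, N, 3) displacements
    dist2 = (diff ** 2).sum(axis=2)                    # squared distances
    weights = np.exp(-dist2 / (2.0 * sigma ** 2)).sum(axis=1)

    # Step 3: accumulate directional co-occurrences of word pairs, with the
    # displacement direction in the x-y plane quantized into n_dirs bins.
    hist = np.zeros((n_words, n_words, n_dirs))
    n = len(points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            angle = np.arctan2(diff[i, j, 1], diff[i, j, 0])   # in [-pi, pi]
            d = int((angle + np.pi) / (2 * np.pi) * n_dirs) % n_dirs
            hist[labels[i], labels[j], d] += weights[i] * weights[j]

    # L1-normalized feature vector for the whole action clip.
    return hist.ravel() / (hist.sum() + 1e-12)

# Usage on synthetic data (200 STIPs with 72-d descriptors, both made up):
rng = np.random.default_rng(0)
pts = rng.uniform(0, 100, size=(200, 3))
desc = rng.normal(size=(200, 72))
feature = encode_directional_cooccurrence(pts, desc)

Under these assumptions, the resulting fixed-length vector could be fed to any standard classifier (e.g. an SVM) for action classification.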