DEEP LEARNING BASED HUMAN ACTION RECOGNITION AND SPATIOTEMPORAL LOCALIZATION SYSTEM
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/56136 |
Institution: | Institut Teknologi Bandung |
Summary: | Spatiotemporal human action localization is a field of computer
vision that is of interest for real-world applications in smart
surveillance cameras, such as improving public security, monitoring patients'
activities, or detecting early symptoms of certain diseases.
The system presented in this thesis follows the YOWO machine learning
architecture proposed by Köpüklü et al. (2019). YOWO extracts both spatial
and temporal information and performs bounding box regression and action
classification end-to-end. This design aims to generate output faster than
other state-of-the-art approaches.
The system was trained and tested on the J-HMDB and NTU RGB+D datasets.
On the machine specification used in this work, the system processes video
at only 0.75 seconds per frame, with acceptable accuracy. However, the system
raises human action localization accuracy from the 41.6% of the YOWO
reference to 43.84%.
The experimental results show that the modified architecture improves on
the accuracy of YOWO. However, it slows down the frame rate of video
processing. |
---|