DEEP LEARNING BASED HUMAN ACTION RECOGNITION AND SPATIOTEMPORAL LOCALIZATION SYSTEM

Bibliographic Details
Main Author: Nathania, Jesslyn
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/56136
Institution: Institut Teknologi Bandung
Description
Summary: Spatiotemporal human action localization is a field of computer vision with real-world applications in smart surveillance cameras, such as improving public security, monitoring patients' activities, and detecting early symptoms of certain diseases. The system presented in this thesis follows the YOWO machine learning architecture proposed by Köpüklü et al. (2019). YOWO extracts both spatial and temporal information, so bounding box regression and action classification can be performed end-to-end. This design aims to produce output faster than other state-of-the-art approaches. The system was trained and tested on the J-HMDB and NTU RGB+D datasets. On the hardware specified in the thesis, the system can only process video at 0.75 seconds per frame while maintaining an acceptable accuracy. However, it raises human action localization accuracy from 41.6% for the YOWO reference to 43.84%. The experimental results show that the modified architecture improves on the accuracy of YOWO, at the cost of a lower video processing frame rate.
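
To put the reported figures on a common scale, the short sketch below converts the per-frame processing time into a frame rate and expresses the accuracy change both in percentage points and relative to the baseline. The function names are illustrative only and do not come from the thesis; the inputs are the values quoted in the summary.

```python
def frames_per_second(seconds_per_frame: float) -> float:
    """Convert a per-frame processing time into a frame rate."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


def accuracy_gain(baseline_pct: float, modified_pct: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = modified_pct - baseline_pct
    relative = absolute / baseline_pct * 100.0
    return absolute, relative


# Figures quoted in the summary above.
fps = frames_per_second(0.75)
gain_points, gain_relative = accuracy_gain(41.6, 43.84)

print(f"{fps:.2f} frames per second")
print(f"+{gain_points:.2f} percentage points ({gain_relative:.1f}% relative)")
```

Under these figures, 0.75 seconds per frame corresponds to roughly 1.33 frames per second, and the accuracy change is 2.24 percentage points, or about 5.4% relative to the YOWO reference.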