REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES

Various methods are employed in computer vision applications to identify individuals, including the utilization of facial recognition as a useful human visual feature for tracking or locating individuals. Nevertheless, there are limitations in tracking systems that solely rely on facial information, particularly when facing challenges such as occlusion, blurry images, or faces turned away from the camera.


Bibliographic Details
Main Author: Anggi Maharani, Devira
Format: Dissertations
Language: Indonesia
Online Access:https://digilib.itb.ac.id/gdl/view/79659
Institution: Institut Teknologi Bandung
Language: Indonesia
id id-itb.:79659
spelling id-itb.:79659 2024-01-15T07:58:13Z REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES Anggi Maharani, Devira Indonesia Dissertations Q-Learning, FOAPF, Deep-KCF, CNN, CNN+LSTM INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/79659 Various methods are employed in computer vision applications to identify individuals, including the utilization of facial recognition as a useful human visual feature for tracking or locating individuals. Nevertheless, there are limitations in tracking systems that solely rely on facial information, particularly when facing challenges such as occlusion, blurry images, or faces turned away from the camera. Under such circumstances, tracking systems struggle with accurate facial recognition. Hence, in this study, in addition to facial visual features, descriptions of other body visual features are incorporated to address this issue. In situations where the sought-after face cannot be found by the system, a hybrid method combining Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) can aid in multi-feature body visual recognition, narrowing down the search space and expediting the process. Research findings demonstrate that the CNN+LSTM method combination yields higher accuracy, recall, precision, and F1 score values (reaching 89.20%, 87.36%, 91.02%, and 88.43%, respectively) compared to the single CNN method (reaching 88.84%, 74.00%, 67.00%, and 69.00%, respectively). However, the computational demands of these two visual features combined are high, thus requiring a tracking system that can reduce computational load and predict target location. When the sought-after object has been identified by the multi-feature visual system, this tracker can sustain object position information from frame to frame, eliminating the need to identify the object in each frame and significantly saving computational time.
The tracking system employing the Firefly Optimization Algorithm-based Particle Filter (FOAPF) and the deep feature-based Deep Kernelized Correlation Filters (KCF) methods has exhibited improved object tracking accuracy across various image environments. The FOAPF method achieves an error rate of 8.80 pixels with a distribution of 50 particles, achieving the smallest error in low-resolution videos and simple background images. This method is suitable for use in environments with lower complexity. Meanwhile, the KCF method with deep feature-based transfer learning is more effective in dealing with complex background image environments and yields an error rate of 10.08 pixels. In this study, the choice of tracking method takes into account the confidence scores generated by the face and body detection systems so that it can operate in real time. Furthermore, the Q-Learning algorithm is utilized to make optimal decisions in automatically tracking objects in dynamic environments. The system considers multiple factors such as facial and body visual features, object location, and environmental conditions to make the best decisions. The goal is to enhance the efficiency and accuracy of tracking the target object. Based on the experiments conducted, it is concluded that the system can adapt its actions in response to environmental changes, resulting in better outcomes. This is evidenced by an accuracy rate of 91.5% and an average speed of 50 fps across 5 different videos, as well as a benchmark video dataset with an accuracy of 84% and an average error of 11.15 pixels. These results indicate a very high level of accuracy in real-time human movement tracking using the proposed method.
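The face-first, body-fallback idea described in the abstract can be sketched as a small dispatcher. This is a minimal illustration only: the function names, the callable interfaces, and the 0.8 confidence threshold are assumptions made here for the sketch, not the dissertation's actual code or values.

```python
# Minimal sketch of the multi-feature fallback: use face recognition when
# it is confident, otherwise fall back to the heavier CNN+LSTM body-feature
# recogniser. All names and the threshold are illustrative assumptions.

FACE_CONF_THRESHOLD = 0.8  # assumed confidence cut-off

def recognise_target(frame, face_detector, body_recogniser,
                     threshold=FACE_CONF_THRESHOLD):
    """Return (label, confidence, source) for the tracked person.

    face_detector and body_recogniser map a frame to (label, confidence);
    label is None when nothing usable is found (occlusion, blur, or a
    face turned away from the camera).
    """
    label, conf = face_detector(frame)
    if label is not None and conf >= threshold:
        return label, conf, "face"
    # Face unusable: switch to body visual features, which narrows the
    # search space instead of re-identifying from scratch each frame.
    label, conf = body_recogniser(frame)
    return label, conf, "body"

# Dummy callables standing in for the real networks:
def occluded_face(frame):
    return (None, 0.0)

def body_net(frame):
    return ("target-01", 0.87)

print(recognise_target("frame-42", occluded_face, body_net))
# -> ('target-01', 0.87, 'body')
```

Keeping the two recognisers behind plain callables also reflects the computational point made above: the expensive body pipeline runs only when the cheap face check fails.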
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Various methods are employed in computer vision applications to identify individuals, including the utilization of facial recognition as a useful human visual feature for tracking or locating individuals. Nevertheless, there are limitations in tracking systems that solely rely on facial information, particularly when facing challenges such as occlusion, blurry images, or faces turned away from the camera. Under such circumstances, tracking systems struggle with accurate facial recognition. Hence, in this study, in addition to facial visual features, descriptions of other body visual features are incorporated to address this issue. In situations where the sought-after face cannot be found by the system, a hybrid method combining Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) can aid in multi-feature body visual recognition, narrowing down the search space and expediting the process. Research findings demonstrate that the CNN+LSTM method combination yields higher accuracy, recall, precision, and F1 score values (reaching 89.20%, 87.36%, 91.02%, and 88.43%, respectively) compared to the single CNN method (reaching 88.84%, 74.00%, 67.00%, and 69.00%, respectively). However, the computational demands of these two visual features combined are high, thus requiring a tracking system that can reduce computational load and predict target location. When the sought-after object has been identified by the multi-feature visual system, this tracker can sustain object position information from frame to frame, eliminating the need to identify the object in each frame and significantly saving computational time. The tracking system employing the Firefly Optimization Algorithm-based Particle Filter (FOAPF) and the deep feature-based Deep Kernelized Correlation Filters (KCF) methods has exhibited improved object tracking accuracy across various image environments.
The FOAPF method achieves an error rate of 8.80 pixels with a distribution of 50 particles, achieving the smallest error in low-resolution videos and simple background images. This method is suitable for use in environments with lower complexity. Meanwhile, the KCF method with deep feature-based transfer learning is more effective in dealing with complex background image environments and yields an error rate of 10.08 pixels. In this study, the choice of tracking method takes into account the confidence scores generated by the face and body detection systems so that it can operate in real time. Furthermore, the Q-Learning algorithm is utilized to make optimal decisions in automatically tracking objects in dynamic environments. The system considers multiple factors such as facial and body visual features, object location, and environmental conditions to make the best decisions. The goal is to enhance the efficiency and accuracy of tracking the target object. Based on the experiments conducted, it is concluded that the system can adapt its actions in response to environmental changes, resulting in better outcomes. This is evidenced by an accuracy rate of 91.5% and an average speed of 50 fps across 5 different videos, as well as a benchmark video dataset with an accuracy of 84% and an average error of 11.15 pixels. These results indicate a very high level of accuracy in real-time human movement tracking using the proposed method.
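The Q-Learning-based choice between the two trackers can be sketched as a tiny tabular example. The states, actions, reward values, and hyperparameters below are assumptions chosen to mirror the reported behaviour (FOAPF better on simple backgrounds, Deep-KCF on complex ones); none of them are the dissertation's actual values.

```python
import random

# Tabular Q-Learning sketch of tracker selection: the agent learns which
# tracker (FOAPF vs Deep-KCF) to run for a given scene condition.
# States, rewards, and hyperparameters are illustrative assumptions only.

STATES = ["simple_background", "complex_background"]
ACTIONS = ["foapf", "deep_kcf"]
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration

def toy_reward(state, action):
    """Assumed reward shaping: FOAPF suits simple scenes (8.80 px error),
    Deep-KCF suits complex ones (10.08 px error on harder material)."""
    if state == "simple_background":
        return 1.0 if action == "foapf" else 0.2
    return 1.0 if action == "deep_kcf" else 0.2

def choose_action(q, state, rng):
    """Epsilon-greedy policy over the two trackers."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(q[state], key=q[state].get)

def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in STATES}
    for _ in range(episodes):
        state = rng.choice(STATES)             # scene condition this frame
        action = choose_action(q, state, rng)  # pick a tracker
        reward = toy_reward(state, action)
        next_state = rng.choice(STATES)        # conditions drift over time
        best_next = max(q[next_state].values())
        # Standard Q-Learning update rule.
        q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
    return q

q = train()
policy = {s: max(q[s], key=q[s].get) for s in STATES}
print(policy)
```

After training, the greedy policy pairs each assumed scene condition with the tracker that earns the higher reward there, which is the adaptation behaviour the abstract attributes to the Q-Learning layer.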
format Dissertations
author Anggi Maharani, Devira
spellingShingle Anggi Maharani, Devira
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
author_facet Anggi Maharani, Devira
author_sort Anggi Maharani, Devira
title REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
title_short REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
title_full REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
title_fullStr REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
title_full_unstemmed REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
title_sort real-time human movement tracking with visual multi-features
url https://digilib.itb.ac.id/gdl/view/79659
_version_ 1822996416873627648