REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES
Saved in:
Main Author: | Anggi Maharani, Devira |
Format: | Dissertations |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/79659 |
Institution: | Institut Teknologi Bandung |
Language: | Indonesia |
id |
id-itb.:79659 |
spelling |
id-itb.:79659 2024-01-15T07:58:13Z REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES Anggi Maharani, Devira Indonesia Dissertations Q-Learning, FOAPF, Deep-KCF, CNN, CNN+LSTM. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/79659 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Computer vision applications identify individuals with a variety of methods,
facial recognition being a particularly useful human visual feature for tracking
or locating people. Tracking systems that rely solely on facial information are
limited, however, particularly under challenges such as occlusion, blurry
images, or faces turned away from the camera. Under such circumstances, these
systems struggle to recognize faces accurately.
Hence, this study incorporates descriptions of other body visual features
alongside facial features to address this issue. When the system cannot find the
sought-after face, a hybrid method combining a Convolutional Neural Network
(CNN) with Long Short-Term Memory (LSTM) supports multi-feature body visual
recognition, narrowing the search space and speeding up the process.
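The abstract does not describe the hybrid architecture in detail; a common arrangement is to extract a CNN feature vector per frame and feed the sequence to an LSTM. The sketch below runs a single randomly initialised LSTM cell over precomputed per-frame features; the feature dimension, hidden size, and initialisation are all invented for illustration and are not the study's model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_over_frames(frame_features, hidden=32, seed=0):
    """Run one LSTM cell over a sequence of per-frame CNN feature
    vectors and return the final hidden state, which a classifier
    head would then score. Weights are random; shapes illustrative."""
    rng = np.random.default_rng(seed)
    d = frame_features.shape[1]
    # one weight matrix per gate: input, forget, output, candidate
    W = rng.normal(0.0, 0.1, size=(4, hidden, d + hidden))
    b = np.zeros((4, hidden))
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in frame_features:                 # iterate over time steps
        z = np.concatenate([x, h])           # input + previous hidden
        i = sigmoid(W[0] @ z + b[0])         # input gate
        f = sigmoid(W[1] @ z + b[1])         # forget gate
        o = sigmoid(W[2] @ z + b[2])         # output gate
        g = np.tanh(W[3] @ z + b[3])         # candidate cell state
        c = f * c + i * g                    # update cell memory
        h = o * np.tanh(c)                   # new hidden state
    return h

# e.g. 16 frames of 128-dim CNN features (synthetic stand-in data)
feats = np.random.default_rng(1).normal(size=(16, 128))
final_h = lstm_over_frames(feats)
```

The recurrence is what lets body appearance accumulate over frames instead of being judged from a single image.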
Research findings demonstrate that the CNN+LSTM combination yields higher
accuracy, recall, precision, and F1 scores (89.20%, 87.36%, 91.02%, and 88.43%,
respectively) than the CNN alone (88.84%, 74.00%, 67.00%, and 69.00%,
respectively). However, combining the two visual features is computationally
expensive, so a tracking system is needed that reduces the load and predicts
the target location. Once the multi-feature visual system has identified the
sought-after object, the tracker maintains its position from frame to frame,
eliminating the need to re-identify the object in every frame and saving
substantial computation time.
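For reference, the four reported scores follow the standard definitions over confusion counts; the sketch below computes them from binary counts. The example counts are invented, not the study's evaluation data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts
    (true positive, false positive, false negative, true negative)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# illustrative counts only
acc, prec, rec, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

In multi-class settings these are typically macro-averaged per class, which is why accuracy and F1 need not coincide.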
The tracking system employs two methods: the Firefly Optimization
Algorithm-based Particle Filter (FOAPF) and deep feature-based Kernelized
Correlation Filters (Deep KCF); both improve object tracking accuracy across
various image environments. The FOAPF method achieves an error of 8.80 pixels
with a distribution of 50 particles, attaining the smallest error on
low-resolution videos and simple backgrounds, which makes it suitable for
lower-complexity environments. The KCF method with transfer-learned deep
features is more effective in complex background environments, with an error of
10.08 pixels. The choice of tracking method takes into account the confidence
scores produced by the face and body detection systems so that the system
operates in real time.
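To show what a particle-filter tracker with 50 particles does per frame, here is a minimal bootstrap particle filter for a 1-D position. This is a generic baseline only: FOAPF additionally moves particles toward high-likelihood regions with firefly-style attraction, which is omitted here, and all noise parameters are invented.

```python
import math
import random

def particle_filter_track(observations, n_particles=50,
                          motion_std=2.0, obs_std=5.0, seed=0):
    """Bootstrap particle filter estimating a 1-D target position
    from noisy per-frame measurements (pixels). Illustrative only:
    the study's FOAPF adds firefly-style attraction moves."""
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0, obs_std)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # predict: diffuse particles under a random-walk motion model
        particles = [p + rng.gauss(0, motion_std) for p in particles]
        # weight: Gaussian likelihood of the measurement per particle
        weights = [math.exp(-((p - z) ** 2) / (2 * obs_std ** 2))
                   for p in particles]
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]
        # estimate: weighted mean of the particle cloud
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
        # resample: draw particles in proportion to their weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates

# toy demo: a target drifting one pixel per frame for 30 frames
path = [float(t) for t in range(30)]
est = particle_filter_track(path)
```

Once the detector has localized the person, a filter like this carries the position between frames, which is the computational saving the abstract describes.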
Furthermore, the Q-Learning algorithm is utilized to make optimal decisions
when automatically tracking objects in dynamic environments. The system weighs
multiple factors, such as facial and body visual features, object location, and
environmental conditions, with the goal of enhancing the efficiency and
accuracy of tracking the object. The experiments conducted show that the system
can adapt its actions in response to environmental changes, yielding better
outcomes: an accuracy of 91.5% at an average of 50 fps across 5 different
videos, and an accuracy of 84% with an average error of 11.15 pixels on a
benchmark video dataset. These results indicate a very high level of accuracy
in real-time human movement tracking using the proposed method. |
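The tracker-selection idea can be sketched as tabular Q-Learning over scene states. Everything below is a toy: the two-state scene description, the one-step episodes, and the reward table (negative tracking error, loosely mirroring the reported 8.80 px for FOAPF on simple scenes and 10.08 px for Deep KCF on complex ones) are illustrative assumptions, not the study's state or reward design.

```python
import random

def train_tracker_selector(episodes=2000, alpha=0.1, eps=0.2, seed=0):
    """Toy tabular Q-Learning that learns which tracker to run per
    scene type. Episodes are one-step (terminal), so the update has
    no bootstrapped next-state term."""
    rng = random.Random(seed)
    states, actions = ["simple", "complex"], ["FOAPF", "KCF"]
    # reward = negative tracking error (px) per (scene, tracker) pair;
    # numbers are illustrative stand-ins, not measured values
    reward = {("simple", "FOAPF"): -8.80, ("simple", "KCF"): -12.0,
              ("complex", "FOAPF"): -15.0, ("complex", "KCF"): -10.08}
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = rng.choice(states)
        # epsilon-greedy action selection
        if rng.random() < eps:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda x: Q[(s, x)])
        # Q <- Q + alpha * (r - Q)   (one-step episode)
        Q[(s, a)] += alpha * (reward[(s, a)] - Q[(s, a)])
    # greedy policy after training
    return {s: max(actions, key=lambda a: Q[(s, a)]) for s in states}

policy = train_tracker_selector()
```

With enough episodes the greedy policy settles on the lower-error tracker for each scene type, which is the kind of automatic method switching the abstract attributes to Q-Learning.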
format |
Dissertations |
author |
Anggi Maharani, Devira |
spellingShingle |
Anggi Maharani, Devira REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
author_facet |
Anggi Maharani, Devira |
author_sort |
Anggi Maharani, Devira |
title |
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
title_short |
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
title_full |
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
title_fullStr |
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
title_full_unstemmed |
REAL-TIME HUMAN MOVEMENT TRACKING WITH VISUAL MULTI-FEATURES |
title_sort |
real-time human movement tracking with visual multi-features |
url |
https://digilib.itb.ac.id/gdl/view/79659 |
_version_ |
1822996416873627648 |