Human motion tracking using deep learning
Tracking moving objects such as pedestrians has a wide range of applications for intelligent autonomous vehicles. While object detection using modern deep learning-based approaches such as Convolutional Neural Networks (CNNs) has advanced significantly, autonomous systems such as ground mobile robots still face problems in real-time implementation: CNN-based human tracking remains difficult because of occlusions, illumination changes, and camera-view variations. The aim of this project is to identify the optimal deep learning algorithm in terms of speed and accuracy, and to implement human tracking-by-detection using the selected real-time object detector. In terms of processing speed, measured in frames per second (FPS), the one-stage detectors Single Shot Detector (SSD) and You Only Look Once (YOLOv3) achieve 19 FPS and 22 FPS respectively, compared with the two-stage Mask Region-based Convolutional Neural Network (Mask R-CNN), which achieves 1 FPS; one-stage detectors are therefore more suitable for real-time implementation. In addition, Multiple Object Tracking Precision (MOTP) is used as a measure of detection accuracy: SSD achieves 68.4%, YOLOv3 achieves 72.9%, and Mask R-CNN achieves 74.0%. YOLOv3 offers the best trade-off between speed and accuracy and is therefore used as the object detector in this project. Next, the detector's output bounding-box coordinates (in pixels) are passed to the tracking algorithm, which associates detections across frames to create a trajectory for each tracked person. The association is computed by the Hungarian algorithm (Kuhn-Munkres), which takes motion affinity and appearance similarity as inputs. For motion affinity, Intersection over Union (IoU) measures the physical proximity between new detections and tracked persons; appearance similarity is measured by the Euclidean distance between the most recent CNN features of the detected persons. Promising results were obtained in the various tests carried out. Finally, the tracking algorithm is tested on a mobile robot running the Robot Operating System (ROS) framework, with images captured by an Orbbec Astra RGB-D camera passed into the object detection algorithm, and tracking performed in real time. Two major challenges observed during testing were ID switching and missed detections. While the latter is limited by the detection algorithm, ID switches occur when two similar-looking persons are very close together and the tracking algorithm cannot differentiate between them.
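The association step described in the abstract (IoU as motion affinity, Euclidean distance between CNN features as appearance similarity, solved with the Hungarian algorithm) can be illustrated with a minimal Python sketch. The cost weighting, the gating threshold, and the track/detection dictionary layout below are illustrative assumptions rather than details taken from the report; the Kuhn-Munkres step uses SciPy's `linear_sum_assignment`.

```python
# Illustrative sketch of tracking-by-detection association (not the report's code).
# Assumptions: boxes are [x1, y1, x2, y2] in pixels, appearance features are CNN
# embeddings, and the 0.5/0.5 cost weighting and 0.7 gate are arbitrary choices.
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian / Kuhn-Munkres


def iou(box_a, box_b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes (motion affinity)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(tracks, detections, w_motion=0.5, w_appearance=0.5, max_cost=0.7):
    """Match existing tracks to new detections with the Hungarian algorithm.

    tracks / detections: lists of dicts with keys 'box' and 'feature'.
    Returns (matches, unmatched_track_indices, unmatched_detection_indices).
    """
    if not tracks or not detections:
        return [], list(range(len(tracks))), list(range(len(detections)))

    cost = np.zeros((len(tracks), len(detections)))
    for i, trk in enumerate(tracks):
        for j, det in enumerate(detections):
            motion_cost = 1.0 - iou(trk["box"], det["box"])
            appearance_cost = np.linalg.norm(
                np.asarray(trk["feature"]) - np.asarray(det["feature"])
            )
            cost[i, j] = w_motion * motion_cost + w_appearance * appearance_cost

    rows, cols = linear_sum_assignment(cost)  # minimum-cost assignment
    matches, matched_t, matched_d = [], set(), set()
    for r, c in zip(rows, cols):
        if cost[r, c] <= max_cost:  # gate implausible pairings
            matches.append((r, c))
            matched_t.add(r)
            matched_d.add(c)
    unmatched_t = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_d = [j for j in range(len(detections)) if j not in matched_d]
    return matches, unmatched_t, unmatched_d
```

In a full tracker, unmatched detections would typically spawn new track IDs and unmatched tracks would be kept alive for a few frames before deletion; that bookkeeping is omitted here.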
| Main Author: | Peok, Qing Xiang |
|---|---|
| Other Authors: | Soong Boon Hee |
| Format: | Final Year Project (FYP) |
| Language: | English |
| Published: | Nanyang Technological University, 2020 |
| School: | School of Electrical and Electronic Engineering |
| Degree: | Bachelor of Engineering (Electrical and Electronic Engineering) |
| Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; Engineering::Electrical and electronic engineering |
| Online Access: | https://hdl.handle.net/10356/139214 |
| Institution: | Nanyang Technological University |
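As a closing illustration, here is a rough sketch of the ROS capture-and-track loop described in the abstract. The topic name `/camera/rgb/image_raw` is an assumption based on common Orbbec Astra driver defaults, and the `run_detector()` / `update_tracks()` placeholders merely stand in for the YOLOv3 detector and the association step sketched above; none of these names are taken from the report.

```python
#!/usr/bin/env python
# Rough sketch of a ROS node that feeds Astra RGB frames into a person tracker
# (illustrative only, not the report's code).
import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


def run_detector(frame):
    """Placeholder for the YOLOv3 person detector; returns a list of detections."""
    return []


def update_tracks(detections):
    """Placeholder for the Hungarian-association tracker; returns current tracks."""
    return []


class PersonTrackerNode(object):
    def __init__(self):
        self.bridge = CvBridge()
        # Topic name is a common Astra driver default; adjust to the actual setup.
        self.sub = rospy.Subscriber(
            "/camera/rgb/image_raw", Image, self.on_image, queue_size=1
        )

    def on_image(self, msg):
        # Convert the ROS image message to an OpenCV BGR array.
        frame = self.bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        detections = run_detector(frame)    # boxes + CNN appearance features
        tracks = update_tracks(detections)  # persistent per-person IDs
        rospy.loginfo("tracking %d person(s)", len(tracks))


if __name__ == "__main__":
    rospy.init_node("human_motion_tracker")
    PersonTrackerNode()
    rospy.spin()
```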