Human motion tracking using deep learning

Bibliographic Details
Main Author: Peok, Qing Xiang
Other Authors: Soong Boon Hee
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access:https://hdl.handle.net/10356/139214
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-139214
record_format dspace
spelling sg-ntu-dr.10356-139214 2023-07-07T18:53:28Z Human motion tracking using deep learning Peok, Qing Xiang Soong Boon Hee School of Electrical and Electronic Engineering EBHSOONG@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Electrical and electronic engineering Tracking moving objects such as pedestrians has a wide range of applications for intelligent autonomous vehicles. While object detection using modern deep learning-based approaches such as Convolutional Neural Networks (CNNs) has advanced significantly, autonomous systems such as ground mobile robots still face problems in real-time implementation. In particular, CNN-based human tracking is hindered by occlusions, illumination changes and camera view variations. The aim of this project is to identify the optimal deep learning algorithm in terms of speed and accuracy, and to implement human tracking-by-detection using the selected real-time object detector. Firstly, in terms of processing speed, measured in Frames per Second (FPS), the one-stage detectors Single Shot Detector (SSD) and You Only Look Once (YOLOv3) achieve 19 FPS and 22 FPS respectively, compared to two-stage detectors such as Mask Region-based Convolutional Neural Network (Mask R-CNN), which achieves 1 FPS; it is therefore concluded that one-stage detectors are more suitable for real-time implementation. In addition, Multiple Object Tracking Precision (MOTP) is used as a measure of detection accuracy: SSD achieved 68.4%, YOLOv3 achieved 72.9% and Mask R-CNN achieved 74.0%. YOLOv3 is thus used in this project as the optimal object detector. Next, the detection outputs from YOLOv3, expressed as bounding-box coordinates in pixels, are passed to the tracking algorithm, which associates detections across frames to create a trajectory for each tracked person. The association decision is computed by the Hungarian algorithm (Kuhn-Munkres), which takes motion affinity and appearance similarity as inputs. For motion affinity, Intersection over Union (IOU) measures the physical proximity between new detections and tracked persons. Appearance similarity is measured by computing the Euclidean distance between the latest CNN features of the detected persons. Promising results were obtained in the various tests carried out. Finally, the tracking algorithm is tested on a mobile robot running the Robot Operating System (ROS) framework, with images captured from an Orbbec Astra RGB-D camera and passed into the object detection algorithm; tracking is then performed in real time. Two major challenges faced during testing were ID switching and missed detections. While the latter is limited by the detection algorithm, ID switching occurs when two similar-looking persons are very close together and the tracking algorithm is unable to differentiate them. Bachelor of Engineering (Electrical and Electronic Engineering) 2020-05-18T05:29:52Z 2020-05-18T05:29:52Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/139214 en A3231-191 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Electrical and electronic engineering
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Electrical and electronic engineering
Peok, Qing Xiang
Human motion tracking using deep learning
description Tracking moving objects such as pedestrians has a wide range of applications for intelligent autonomous vehicles. While object detection using modern deep learning-based approaches such as Convolutional Neural Networks (CNNs) has advanced significantly, autonomous systems such as ground mobile robots still face problems in real-time implementation. In particular, CNN-based human tracking is hindered by occlusions, illumination changes and camera view variations. The aim of this project is to identify the optimal deep learning algorithm in terms of speed and accuracy, and to implement human tracking-by-detection using the selected real-time object detector. Firstly, in terms of processing speed, measured in Frames per Second (FPS), the one-stage detectors Single Shot Detector (SSD) and You Only Look Once (YOLOv3) achieve 19 FPS and 22 FPS respectively, compared to two-stage detectors such as Mask Region-based Convolutional Neural Network (Mask R-CNN), which achieves 1 FPS; it is therefore concluded that one-stage detectors are more suitable for real-time implementation. In addition, Multiple Object Tracking Precision (MOTP) is used as a measure of detection accuracy: SSD achieved 68.4%, YOLOv3 achieved 72.9% and Mask R-CNN achieved 74.0%. YOLOv3 is thus used in this project as the optimal object detector. Next, the detection outputs from YOLOv3, expressed as bounding-box coordinates in pixels, are passed to the tracking algorithm, which associates detections across frames to create a trajectory for each tracked person. The association decision is computed by the Hungarian algorithm (Kuhn-Munkres), which takes motion affinity and appearance similarity as inputs. For motion affinity, Intersection over Union (IOU) measures the physical proximity between new detections and tracked persons. Appearance similarity is measured by computing the Euclidean distance between the latest CNN features of the detected persons. Promising results were obtained in the various tests carried out. Finally, the tracking algorithm is tested on a mobile robot running the Robot Operating System (ROS) framework, with images captured from an Orbbec Astra RGB-D camera and passed into the object detection algorithm; tracking is then performed in real time. Two major challenges faced during testing were ID switching and missed detections. While the latter is limited by the detection algorithm, ID switching occurs when two similar-looking persons are very close together and the tracking algorithm is unable to differentiate them.
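
The frame-to-frame association step described in the abstract (motion affinity from Intersection over Union, appearance similarity as a Euclidean distance between CNN features, and assignment by the Hungarian algorithm) can be sketched as follows. This is a minimal illustration and not the project's actual code: the (x1, y1, x2, y2) pixel box format, the use of scipy's linear_sum_assignment as the Kuhn-Munkres solver, and the blending weight alpha and gating thresholds min_iou and max_feat_dist are all assumptions made for the example.

# Minimal sketch of the association step, under the assumptions stated above.
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm (Kuhn-Munkres)

def iou(box_a, box_b):
    # Intersection over Union between two (x1, y1, x2, y2) pixel boxes.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associate(track_boxes, track_feats, det_boxes, det_feats,
              alpha=0.5, min_iou=0.1, max_feat_dist=0.7):
    # Match existing tracks to new detections. The cost of a (track, detection)
    # pair blends motion affinity (1 - IOU) with appearance distance (Euclidean
    # distance between CNN feature vectors); alpha and both gates are illustrative.
    if len(track_boxes) == 0 or len(det_boxes) == 0:
        return []
    cost = np.zeros((len(track_boxes), len(det_boxes)))
    for i, (t_box, t_feat) in enumerate(zip(track_boxes, track_feats)):
        for j, (d_box, d_feat) in enumerate(zip(det_boxes, det_feats)):
            motion_cost = 1.0 - iou(t_box, d_box)
            appearance_cost = np.linalg.norm(t_feat - d_feat)
            cost[i, j] = alpha * motion_cost + (1.0 - alpha) * appearance_cost
    rows, cols = linear_sum_assignment(cost)  # minimises total assignment cost
    matches = []
    for i, j in zip(rows, cols):
        # Gate out implausible pairs so leftover detections can start new tracks.
        if iou(track_boxes[i], det_boxes[j]) >= min_iou and \
           np.linalg.norm(track_feats[i] - det_feats[j]) <= max_feat_dist:
            matches.append((i, j))
    return matches

Blending the motion and appearance costs into one matrix lets a single assignment call resolve all matches in a frame; pairs that fail either gate are left unmatched, so new detections can start fresh tracks and stale tracks can be dropped. As the abstract notes, ID switches remain possible when two similar-looking persons are very close, since neither the IOU nor the appearance term can then separate them.
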
author2 Soong Boon Hee
author_facet Soong Boon Hee
Peok, Qing Xiang
format Final Year Project
author Peok, Qing Xiang
author_sort Peok, Qing Xiang
title Human motion tracking using deep learning
title_short Human motion tracking using deep learning
title_full Human motion tracking using deep learning
title_fullStr Human motion tracking using deep learning
title_full_unstemmed Human motion tracking using deep learning
title_sort human motion tracking using deep learning
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/139214
_version_ 1772827283888799744