Deep learning based people detection using 3D point cloud

Bibliographic Details
Main Author: Tan, Kye Min
Other Authors: Teoh Eam Khwang
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74956
Institution: Nanyang Technological University
Description
Summary: With advances in computational devices and 3D sensor technology, it has become increasingly viable to develop a highly accurate detection system within the constraints of a mobile service robot. Because such robots must navigate unfamiliar environments under less than optimal conditions, the algorithms responsible for detection, tracking and guidance must be robust. Deep learning, a recent field of artificial intelligence, can potentially provide this robustness. By harnessing large amounts of computational power and large datasets, deep learning systems can achieve significantly better performance than previous methods in computer vision tasks such as classification and detection. Using 3D point cloud data allows spatial information to be obtained while overcoming adverse conditions such as poor illumination and complex texture information. This project combines the advantages of deep learning methods and 3D point cloud data to perform people detection, a task required of mobile service robots. Depth images from Microsoft Kinect sensors are converted into 3D point clouds before being used to train DenseNet, a densely connected convolutional network, to detect the presence of people. DenseNet was chosen because its very deep architecture allows high performance, while its dense connections mitigate the risk of overfitting on the limited data available. Training DenseNet in the Darknet framework shows qualitatively that it can outperform networks such as You Only Look Once (YOLO) on new data while being fast enough to process images in real time.
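The summary mentions converting Kinect depth images into 3D point clouds but does not say how. One common approach, shown here only as a minimal sketch and not as the project's actual method, is to back-project each pixel through a pinhole camera model. The function name, the millimetre depth unit, and the intrinsic values (fx, fy, cx, cy) below are all illustrative assumptions; real values would come from the sensor's calibration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(
    depth_mm: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth image into 3D points (in metres).

    Assumes a pinhole camera model, depth values in millimetres,
    and that a depth of 0 means "no reading" for that pixel.
    """
    points: List[Point] = []
    for v, row in enumerate(depth_mm):       # v: pixel row index
        for u, d in enumerate(row):          # u: pixel column index
            if d <= 0:                       # skip pixels with no depth reading
                continue
            z = d / 1000.0                   # millimetres -> metres
            x = (u - cx) * z / fx            # horizontal offset from the optical axis
            y = (v - cy) * z / fy            # vertical offset from the optical axis
            points.append((x, y, z))
    return points


# Hypothetical 2x3 depth image and intrinsics (illustrative, not Kinect calibration data).
depth = [
    [1000, 0, 2000],
    [1000, 1000, 0],
]
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=1.0, cy=0.5)
print(len(cloud))  # number of valid 3D points
```

Pixels without a valid reading are dropped rather than mapped to the origin, so the resulting cloud contains only measured surface points.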