Pseudo Vision: end-to-end autonomous driving with 2D LiDAR

Bibliographic Details
Main Author: Chau, Yuan Qi
Other Authors: Heng Kok Hui, John Gerard
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access:https://hdl.handle.net/10356/163470
Institution: Nanyang Technological University
Description
Summary: This project introduces Pseudo Vision, a novel data representation that enables end-to-end autonomous driving using a 2D LiDAR as the sole sensor for perceiving the vehicle's surroundings. Pseudo Vision is interoperable across 2D LiDARs of any Points Per Scan (PPS) and supports an explainable decision-making process by visualizing the features. The representation has two parameters: the position of the ego-vehicle in the image and the resolution of the image. Experiments were carried out to investigate how each parameter affects driving performance across three models: a 3-layer Fully-Connected Neural Network (FCNN), a 3-layer Convolutional Neural Network (CNN), and a state-of-the-art Convolutional Neural Network (SOTA-CNN). The SOTA-CNN model was selected by benchmarking 135 state-of-the-art CNN models; accuracy, inference time on both PyTorch and TensorRT, and the number of parameters for each model are made available to researchers and engineers who need an empirical basis for model selection. The experiments show that CNN-based models allow visualization and understanding of the mechanism by which they extract features from imagery inputs and make decisions, so they should be preferred over a brute-force model such as an FCNN in situations where understanding the context of the problem is required. They also show that learning can be transferred from a different task, such as classification on ImageNet1k, to an autonomous driving task: the pre-trained CNN and pre-trained SOTA-CNN models show a significant increase in driving performance over their respective non-pre-trained counterparts.
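The benchmark above reports inference time on both PyTorch and TensorRT. The timing protocol itself is framework-agnostic; the sketch below (a hypothetical helper, not the project's actual benchmark harness) shows the usual shape of such a measurement: warm-up iterations to amortize lazy initialization, then the median latency over repeated runs.

```python
import time
import statistics

def benchmark_latency(model, x, warmup=5, iters=50):
    """Measure median per-call latency of `model(x)` in milliseconds.

    `model` is any callable (a PyTorch module, a TensorRT execution
    wrapper, etc.). Warm-up runs are discarded so one-time costs such
    as kernel compilation do not skew the result; the median is more
    robust to scheduling jitter than the mean.
    """
    for _ in range(warmup):
        model(x)
    times_ms = []
    for _ in range(iters):
        t0 = time.perf_counter()
        model(x)
        times_ms.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times_ms)
```

For GPU frameworks, a real harness would also synchronize the device before reading the clock, since kernel launches are asynchronous.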
Furthermore, Pseudo Vision performs better when the ego-vehicle is positioned below the center of the image, because this reserves more of the image for the space in front of the vehicle than behind it, and the space ahead matters more during driving. Pseudo Vision also performs better at larger image resolutions, because higher resolution allows the models to learn to differentiate smaller distances.
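To make the two parameters concrete, here is a minimal sketch of how a 2D LiDAR scan could be rasterized into a Pseudo Vision image. The function name, the ego-frame conventions, and the `max_range` scaling are illustrative assumptions, not the project's actual implementation; the arguments `resolution` and `ego_row_frac` correspond to the image-resolution and ego-vehicle-position parameters discussed above, and the rasterization works for any PPS because each point is projected independently.

```python
import numpy as np

def pseudo_vision(ranges, angles, resolution=128, ego_row_frac=0.75,
                  max_range=10.0):
    """Rasterize one 2D LiDAR scan into a square Pseudo-Vision-style image.

    ranges, angles : per-point range readings (assumed finite) and beam
        angles in radians, in the ego frame (x forward, y left).
    resolution     : side length of the output image in pixels.
    ego_row_frac   : vertical position of the ego-vehicle; 0.75 places it
        below the image center, reserving more rows for the space ahead.
    """
    ranges = np.asarray(ranges, dtype=float)
    angles = np.asarray(angles, dtype=float)
    img = np.zeros((resolution, resolution), dtype=np.uint8)
    # Polar -> Cartesian in the ego frame.
    x = ranges * np.cos(angles)
    y = ranges * np.sin(angles)
    scale = resolution / (2.0 * max_range)   # pixels per metre
    ego_row = int(resolution * ego_row_frac)
    ego_col = resolution // 2
    rows = ego_row - np.round(x * scale).astype(int)  # forward = up
    cols = ego_col - np.round(y * scale).astype(int)  # left = image-left
    # Points projected outside the image (e.g. far behind the vehicle
    # when the ego-row sits low) are simply dropped.
    valid = (rows >= 0) & (rows < resolution) & (cols >= 0) & (cols < resolution)
    img[rows[valid], cols[valid]] = 255
    return img
```

With `ego_row_frac=0.75`, a point 5 m straight ahead lands well inside the image while the symmetric point 5 m behind falls off the bottom edge, which illustrates why placing the ego-vehicle below the center devotes more pixels to the region in front.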