Depth perception in challenging weathers
Main Author: | Zhang, Haoyuan |
---|---|
Other Authors: | Wang Dan Wei |
School: | School of Electrical and Electronic Engineering |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Engineering::Electrical and electronic engineering |
Online Access: | https://hdl.handle.net/10356/166568 |
DOI: | 10.32657/10356/166568 |
Citation: | Zhang, H. (2023). Depth perception in challenging weathers. Doctoral thesis, Nanyang Technological University, Singapore. |
License: | Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) |
Institution: | Nanyang Technological University |
Description:
Depth information is a basis for numerous applications in robotics and autonomous driving. With the sensors commonly used in robotics (i.e., the color camera and 3D Lidar), the mainstream approaches to acquiring dense depth maps are depth estimation based on a monocular camera, stereo matching based on dual cameras, and depth completion based on a single camera and a sparse Lidar. The reliability and robustness of these approaches naturally diverge according to the properties of the modalities involved. In recent years, although deep-learning-based solutions have boosted the performance of all these approaches, most methods target only ideal conditions with satisfactory illumination and a clear view. However, as outdoor environments are unavoidable for robots and autonomous vehicles, depth perception under varying light conditions and weather effects is a meaningful task and an open problem. In this thesis, various approaches to acquiring dense depth maps and different strategies to improve deep-neural-network (DNN) models are investigated. Empirical experiments with the novel frameworks proposed in the thesis show significant improvements across various experimental settings and ease the difficulty of acquiring training data under diverse conditions.
First, since the color image is the most mature modality in deep learning and stereo matching rests on a complete theory, the acquisition of depth maps in challenging conditions from stereo images is investigated. A supervised transfer learning strategy is applied to a carefully designed integration of a condition-specific perception enhancement network and a fast convolutional stereo matching algorithm. In addition, as the effectiveness of neural networks is strongly influenced by the quantity of data, and synthetic data are much easier to collect than real-world data under normal conditions, an unsupervised domain adaptation framework is proposed and validated in the synthetic-to-real-world setting. Moreover, a novel loss function, named the soft warping loss, is proposed to both speed up training and improve performance. The synthetic-to-real-world setting adopted in most existing works rests on the assumption that weather and light conditions in the synthetic domain are less challenging than those in real-world data. To further test the effectiveness of the proposed methods, experiments transferring from synthetic ideal-weather data to real-world adverse-weather data are conducted, which verify that the framework is independent of the domain.
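
The abstract does not give the formula for the soft warping loss; as a point of reference only, the sketch below shows a standard photometric warping loss for stereo in PyTorch, which reprojects the right image into the left view with the predicted disparity and penalizes the photometric error. The function name and the L1 penalty are illustrative assumptions; the thesis's soft variant presumably relaxes this hard warping, and its exact form may differ.

```python
import torch
import torch.nn.functional as F

def photometric_warping_loss(left, right, disparity):
    """Hypothetical baseline (not the thesis's soft warping loss):
    warp the right image into the left view using the predicted
    disparity and penalize the photometric error.
    left, right: (B, C, H, W) images; disparity: (B, 1, H, W), in pixels.
    """
    b, _, h, w = left.shape
    # Pixel-coordinate grid, later normalized to [-1, 1] for grid_sample.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=left.device, dtype=left.dtype),
        torch.arange(w, device=left.device, dtype=left.dtype),
        indexing="ij",
    )
    # Shift x-coordinates left by the disparity to sample the right image.
    xs = xs.unsqueeze(0) - disparity.squeeze(1)          # (B, H, W)
    ys = ys.unsqueeze(0).expand(b, -1, -1)               # (B, H, W)
    grid = torch.stack(
        (2.0 * xs / (w - 1) - 1.0, 2.0 * ys / (h - 1) - 1.0), dim=-1
    )                                                    # (B, H, W, 2)
    warped = F.grid_sample(right, grid, align_corners=True,
                           padding_mode="border")
    return F.l1_loss(warped, left)
```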
Although unsupervised domain adaptation for stereo matching requires no labels on the target domain, the natural vulnerability of color cameras can lead to poor robustness under adverse conditions. To leverage the inherent robustness of the Lidar modality across light conditions, a review of modality fusion approaches is conducted, which shows that existing fusion methods suffer from redundancy and a lack of geometry guidance. To reduce channel redundancy and embed geometry information in the spatial dimension, the Geometry Attention-based Lightweight Fusion (GALiF) backbone is proposed. In addition, an entropy-based loss function is proposed to fulfill the potential of weighted summation. Grounded in maximum-entropy theory, the loss drives the two branches to compete, making the different parts of the network more active. With the proposed methods, the fusion of the color and Lidar modalities achieves state-of-the-art performance on benchmarks in the literature.
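
The exact entropy-based loss is not reproduced in the abstract; the snippet below is a minimal sketch of the general idea, under the assumption that the per-branch fusion weights come from a softmax: a maximum-entropy regularizer penalizes degenerate weightings in which one branch dominates the weighted summation, so both the color and Lidar branches stay active. The names and the coefficient `lam` are hypothetical.

```python
import torch

def entropy_regularized_fusion_loss(task_loss, fusion_weights, lam=0.01):
    """Illustrative sketch: encourage high entropy over the per-branch
    fusion weights so neither branch is ignored in the weighted sum.
    fusion_weights: (B, 2, H, W) softmax weights for the two branches.
    """
    eps = 1e-8
    # Shannon entropy of the branch weights at each pixel.
    entropy = -(fusion_weights * (fusion_weights + eps).log()).sum(dim=1)
    # Subtracting the mean entropy maximizes it during minimization.
    return task_loss - lam * entropy.mean()
```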
Finally, a label-free training strategy is proposed for depth completion, based only on noisy Lidar data and color images. This problem setting is the most challenging one, owing to the highly complicated process of label acquisition. The whole pipeline relies on a self-supervised learning framework and pseudo-label generation: a statistical filter is developed, and a conventional domain-free method is used to generate dense depth maps as the final pseudo labels. Since the commonly used label-collection procedure condenses several Lidar frames into one dense depth map, it is infeasible in adverse weather due to the low quality of individual frames. Moreover, GAN-based style transfer is also infeasible because adverse-weather data are rarely collected. The proposed self-supervised learning framework explores a novel way to fine-tune without labels or expensive external sensors. Experiments on real-world adverse-weather data show a significant error reduction of approximately 30%-50%, depending on the degree of weather-induced noise.
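
The statistical filter is only named in the abstract; one plausible form (an assumption, not necessarily the thesis's design) is a local outlier filter that rejects sparse Lidar returns deviating strongly from the statistics of their spatial neighborhood, before a conventional densification method turns the result into pseudo labels. The function name and parameters below are hypothetical.

```python
import numpy as np

def filter_sparse_depth(depth, win=7, k=2.0):
    """Hypothetical statistical filter for a sparse depth map.
    Drops valid points deviating more than k standard deviations
    from the mean of valid neighbors in a win x win window.
    depth: (H, W) array, 0 where there is no Lidar return.
    """
    out = depth.copy()
    r = win // 2
    ys, xs = np.nonzero(depth)
    for y, x in zip(ys, xs):
        patch = depth[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        vals = patch[patch > 0]
        if vals.size < 3:
            continue  # too few neighbors to judge this point
        mu, sigma = vals.mean(), vals.std()
        if sigma > 0 and abs(depth[y, x] - mu) > k * sigma:
            out[y, x] = 0.0  # reject as a weather-induced outlier
    return out
```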