Depth perception in challenging weathers
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/166568 |
Institution: | Nanyang Technological University |
Summary:

Depth information is a basis for numerous applications in robotics and autonomous driving. With the sensors commonly considered in robotics (i.e., the color camera and 3D Lidar), the mainstream approaches to acquiring dense depth maps are depth estimation based on a monocular camera, stereo matching based on dual cameras, and depth completion based on a single camera and sparse Lidar. Owing to the properties of the different modalities involved, their reliability and robustness naturally diverge. In recent years, although deep-learning-based solutions have boosted the performance of all these approaches, most methods target only ideal conditions with satisfactory illumination and a clear view. However, since outdoor environments are unavoidable for robots and autonomous vehicles, depth perception under varied lighting conditions and weather effects is a meaningful task and an open problem. In this thesis, various approaches to acquiring dense depth maps and different strategies for improving deep-neural-network (DNN) models are investigated. Based on the novel frameworks proposed in the thesis, empirical experiments show significant improvement across various experimental settings, which eases the difficulty of acquiring training data under diverse conditions.
First, since the color image is the most mature modality in deep learning and stereo matching rests on a well-established theory, the acquisition of depth maps in challenging conditions from stereo images is investigated. A supervised transfer learning strategy is applied to a carefully designed integration of a condition-specific perception enhancement network and a fast convolutional stereo matching algorithm. In addition, because the effectiveness of neural networks depends heavily on the quantity of data, and synthetic data are far easier to collect than real-world data, an unsupervised domain adaptation framework is proposed and validated in the synthetic-to-real-world setting. Moreover, a novel loss function named the soft warping loss is proposed, which not only speeds up the training process but also yields better performance. The synthetic-to-real-world setting adopted in most existing work rests on the assumption that weather and lighting conditions in the synthetic domain are less challenging than in real-world data. To further test the effectiveness of the proposed methods, experiments transferring from synthetic ideal-weather data to real-world adverse-weather data are conducted, which verify that the framework is domain-independent.
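The abstract does not spell out the soft warping loss, so the following is only a minimal sketch of one plausible reading: the right image is warped into the left view via the predicted disparity and penalized with a smoothed photometric term. All function and parameter names here are illustrative, not from the thesis.

```python
import torch
import torch.nn.functional as F

def warp_right_to_left(right_img, disparity):
    """Warp the right image into the left view using predicted disparity.

    right_img: (B, C, H, W); disparity: (B, 1, H, W), in pixels.
    """
    b, _, h, w = right_img.shape
    # Build a sampling grid shifted horizontally by the disparity.
    xs = torch.linspace(0, w - 1, w, device=right_img.device)
    ys = torch.linspace(0, h - 1, h, device=right_img.device)
    grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
    grid_x = grid_x.unsqueeze(0) - disparity.squeeze(1)   # shift by disparity
    grid_x = 2.0 * grid_x / (w - 1) - 1.0                 # normalize to [-1, 1]
    grid_y = 2.0 * grid_y.unsqueeze(0).expand(b, -1, -1) / (h - 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)          # (B, H, W, 2)
    return F.grid_sample(right_img, grid, align_corners=True,
                         padding_mode="border")

def soft_warping_loss(left_img, right_img, disparity, beta=1.0):
    """Photometric loss with a smooth ("soft") penalty on the warped image."""
    warped = warp_right_to_left(right_img, disparity)
    return F.smooth_l1_loss(warped, left_img, beta=beta)
```

A smooth penalty of this kind keeps gradients well-behaved near zero residual, which is one way a warping-based loss could both stabilize early training and improve the final result, as the abstract claims.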
Although the unsupervised domain adaptation for stereo matching requires no labels on the target domain, the inherent vulnerability of color cameras can still lead to poor robustness under adverse conditions. To leverage the intrinsic robustness of the Lidar modality across lighting conditions, a review of modality fusion approaches is conducted, which shows that existing fusion methods suffer from channel redundancy and a lack of geometric guidance. To reduce the channel redundancy and embed geometric information along the spatial dimension, the Geometry Attention-based Lightweight Fusion (GALiF) backbone is proposed. In addition, an entropy-based loss function is proposed to realize the full potential of weighted summation. Based on the maximum entropy principle, the loss drives both branches to compete, keeping the different parts of the network more active. With the proposed methods, the fusion of the color and Lidar modalities achieves state-of-the-art performance on published benchmarks.
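The exact form of the entropy-based loss is not given in the abstract. A minimal sketch follows, under the assumption that per-pixel fusion weights for the two branches come from a softmax and the loss rewards high entropy of those weights so that neither branch collapses; all names are illustrative.

```python
import torch

def fusion_weight_entropy_loss(logits, eps=1e-8):
    """Negative entropy of per-pixel fusion weights.

    logits: (B, 2, H, W) raw scores for the color and Lidar branches.
    Minimizing this term maximizes the entropy of the softmax weights,
    pushing both branches to stay active (maximum entropy principle).
    """
    weights = torch.softmax(logits, dim=1)                      # (B, 2, H, W)
    entropy = -(weights * torch.log(weights + eps)).sum(dim=1)  # per-pixel
    return -entropy.mean()  # minimizing this maximizes entropy

def fuse(color_feat, lidar_feat, logits):
    """Weighted summation of the two branch features."""
    w = torch.softmax(logits, dim=1)
    return w[:, :1] * color_feat + w[:, 1:] * lidar_feat
```

Under this reading, a degenerate solution where one branch always receives weight 1 is penalized, which matches the abstract's claim that the loss makes the branches compete rather than letting one dominate.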
Finally, a label-free training strategy is proposed for depth completion, based only on noisy Lidar data and color images. This problem setting is the most challenging one because of the tremendously complicated process of label acquisition. The whole pipeline relies on a self-supervised learning framework and pseudo-label generation: a statistical filter is developed, and a conventional domain-free method is utilized to generate dense depth maps as the final pseudo labels. Since the commonly used label-collection procedure condenses several Lidar frames into one dense depth map, it is infeasible in adverse weather due to the low quality of the individual frames. Moreover, GAN-based style transfer is also infeasible due to the scarcity of collected adverse-weather data. The proposed self-supervised learning framework explores a novel way to realize fine-tuning without labels or expensive external sensors. Experiments on real-world adverse-weather data show a significant error reduction of approximately 30%-50%, depending on the degree of weather-induced noise.
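The abstract does not detail the statistical filter. The sketch below assumes a simple local-window outlier test on the sparse Lidar depth, where returns far from the local median of valid neighbors are treated as weather-induced noise and dropped; the window size and threshold are illustrative placeholders.

```python
import numpy as np

def filter_lidar_depth(depth, win=7, rel_thresh=0.1):
    """Drop sparse-depth outliers that deviate from the local median.

    depth: (H, W) sparse depth map, 0 where there is no Lidar return.
    Returns a copy with suspected weather-induced noise removed.
    """
    r = win // 2
    out = depth.copy()
    ys, xs = np.nonzero(depth)
    for y, x in zip(ys, xs):
        patch = depth[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
        valid = patch[patch > 0]
        if valid.size < 3:
            continue  # too few neighbors to judge this point
        med = np.median(valid)
        # Reject returns far from the local median (relative threshold).
        if abs(depth[y, x] - med) > rel_thresh * med:
            out[y, x] = 0.0
    return out
```

A filter of this shape exploits the fact that rain or snow returns tend to be isolated in depth relative to nearby surface points, which is consistent with the abstract's framing of weather-induced noise in single Lidar frames.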