Monocular BEV perception of road scenes via front-to-top view projection

HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To p...

Bibliographic Details
Main Authors:	LIU, Wenxi; LI, Qi; YANG, Weixiang; CAI, Jiaxin; YU, Yuanhong; MA, Yuexin; HE, Shengfeng; PAN, Jia
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Autonomous driving; BEV perception; Estimation; Feature extraction; Layout; Roads; segmentation; Task analysis; Three-dimensional displays; Transformers; Artificial Intelligence and Robotics; Numerical Analysis and Scientific Computing
Online Access:	https://ink.library.smu.edu.sg/sis_research/8727 https://ink.library.smu.edu.sg/context/sis_research/article/9730/viewcontent/mathematics_12_00916_pvoa_cc_by.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9730
record_format	dspace
spelling	sg-smu-ink.sis_research-9730 2024-04-18T07:32:57Z Monocular BEV perception of road scenes via front-to-top view projection LIU, Wenxi LI, Qi YANG, Weixiang CAI, Jiaxin YU, Yuanhong MA, Yuexin HE, Shengfeng PAN, Jia HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view given a front-view monocular image only. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen the view transformation and scene understanding. In addition, we apply multi-scale FTVP modules to propagate the rich spatial information of low-level features to mitigate spatial deviation of the predicted object location. Experiments on public benchmarks show that our method performs road layout estimation, vehicle occupancy estimation, and multi-class semantic estimation at a performance level comparable to the state of the art, while maintaining superior efficiency.
2024-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8727 info:doi/10.1109/TPAMI.2024.3377812 https://ink.library.smu.edu.sg/context/sis_research/article/9730/viewcontent/mathematics_12_00916_pvoa_cc_by.pdf http://creativecommons.org/licenses/by/3.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Autonomous driving BEV perception Estimation Feature extraction Layout Roads segmentation Task analysis Three-dimensional displays Transformers Artificial Intelligence and Robotics Numerical Analysis and Scientific Computing
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Autonomous driving; BEV perception; Estimation; Feature extraction; Layout; Roads; segmentation; Task analysis; Three-dimensional displays; Transformers; Artificial Intelligence and Robotics; Numerical Analysis and Scientific Computing
description	HD map reconstruction is crucial for autonomous driving. LiDAR-based methods are limited due to expensive sensors and time-consuming computation. Camera-based methods usually need to perform road segmentation and view transformation separately, which often causes distortion and missing content. To push the limits of the technology, we present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view given a front-view monocular image only. We propose a front-to-top view projection (FTVP) module, which takes the constraint of cycle consistency between views into account and makes full use of their correlation to strengthen the view transformation and scene understanding. In addition, we apply multi-scale FTVP modules to propagate the rich spatial information of low-level features to mitigate spatial deviation of the predicted object location. Experiments on public benchmarks show that our method performs road layout estimation, vehicle occupancy estimation, and multi-class semantic estimation at a performance level comparable to the state of the art, while maintaining superior efficiency.
format	text
author	LIU, Wenxi; LI, Qi; YANG, Weixiang; CAI, Jiaxin; YU, Yuanhong; MA, Yuexin; HE, Shengfeng; PAN, Jia
title	Monocular BEV perception of road scenes via front-to-top view projection
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/8727 https://ink.library.smu.edu.sg/context/sis_research/article/9730/viewcontent/mathematics_12_00916_pvoa_cc_by.pdf

Similar Items