Structure-aware fusion network for 3D scene understanding

In this paper, we propose a Structure-Aware Fusion Network (SAFNet) for 3D scene understanding. As 2D images present more detailed appearance information while 3D point clouds convey more geometric information, fusing these two complementary modalities can improve the discriminative ability of a model. Fusion is a very challenging task since 2D and 3D data are essentially different and have different formats. Existing methods first extract 2D multi-view image features, then aggregate them onto sparse 3D point clouds, and achieve superior performance. However, they ignore the structural relations between pixels and points and directly fuse the two modalities without adaptation. To address this, we propose a structural deep metric learning method on pixels and points to explore these relations and further utilize them to adaptively map the images and point clouds into a common canonical space for prediction. Extensive experiments on the widely used ScanNetV2 and S3DIS datasets verify the performance of the proposed SAFNet.
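The full method is in the paper itself; as a rough illustration of the idea the abstract describes (projecting pixel and point features into a common canonical space and aligning their pairwise structure), here is a minimal PyTorch sketch. All module names, feature dimensions, and the exact loss terms are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonSpaceProjector(nn.Module):
    """Hypothetical module: maps 2D pixel features and 3D point features
    into a shared embedding space, as the abstract describes."""
    def __init__(self, dim_2d: int = 128, dim_3d: int = 96, dim_common: int = 64):
        super().__init__()
        self.proj_2d = nn.Sequential(
            nn.Linear(dim_2d, dim_common), nn.ReLU(), nn.Linear(dim_common, dim_common))
        self.proj_3d = nn.Sequential(
            nn.Linear(dim_3d, dim_common), nn.ReLU(), nn.Linear(dim_common, dim_common))

    def forward(self, pixel_feats: torch.Tensor, point_feats: torch.Tensor):
        # pixel_feats: (N, dim_2d) multi-view image features aggregated onto N points
        # point_feats: (N, dim_3d) geometric features of the same N points
        z2d = F.normalize(self.proj_2d(pixel_feats), dim=-1)
        z3d = F.normalize(self.proj_3d(point_feats), dim=-1)
        return z2d, z3d

def structural_metric_loss(z2d: torch.Tensor, z3d: torch.Tensor) -> torch.Tensor:
    """Toy structure-aware metric loss: matched pixel/point pairs are pulled
    together, and the pairwise similarity structure of the two modalities
    is encouraged to agree."""
    sim_2d = z2d @ z2d.t()                              # (N, N) similarities among pixel embeddings
    sim_3d = z3d @ z3d.t()                              # (N, N) similarities among point embeddings
    structure = F.mse_loss(sim_2d, sim_3d)              # align relational structure across modalities
    pairwise = (1.0 - (z2d * z3d).sum(dim=-1)).mean()   # cosine alignment of matched pairs
    return structure + pairwise

# Usage on random data, just to show the shapes involved:
proj = CommonSpaceProjector()
z2d, z3d = proj(torch.randn(1024, 128), torch.randn(1024, 96))
loss = structural_metric_loss(z2d, z3d)
```

The key design point suggested by the abstract is that the loss constrains not only matched pixel/point pairs but also the relations among points, which is what distinguishes a structure-aware fusion from a plain feature concatenation.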

Bibliographic Details
Main Authors: Yan, Haibin, Lv, Yating, Liong, Venice Erin
Other Authors: Interdisciplinary Graduate School (IGS)
Format: Article
Language: English
Published: Chinese Journal of Aeronautics, 2022
Subjects: Engineering::Electrical and electronic engineering; 3D Point Clouds; Data Fusion
Online Access:https://hdl.handle.net/10356/161283
Institution: Nanyang Technological University
Citation: Yan, H., Lv, Y. & Liong, V. E. (2022). Structure-aware fusion network for 3D scene understanding. Chinese Journal of Aeronautics, 35(5), 194-203. https://dx.doi.org/10.1016/j.cja.2021.07.012
ISSN: 1000-9361
DOI: 10.1016/j.cja.2021.07.012
Funding: This study was supported by the National Natural Science Foundation of China (No. 61976023).
Rights: © 2021 Chinese Society of Aeronautics and Astronautics. Production and hosting by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).