FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing
RGB-D indoor scene parsing is a challenging task in computer vision. Conventional scene-parsing approaches based on manual feature extraction have proved inadequate in this area because indoor scenes are both unordered and complex. This study proposes a feature adaptive selection and fusion lightwe...
Main Authors: Qian, Xiaohong; Lin, Xingyang; Yu, Lu; Zhou, Wujie
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Lightweight Backbone Model; RGB-D
Online Access: https://hdl.handle.net/10356/171471
Institution: Nanyang Technological University
id |
sg-ntu-dr.10356-171471 |
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1714712023-10-27T15:36:11Z FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing Qian, Xiaohong; Lin, Xingyang; Yu, Lu; Zhou, Wujie School of Computer Science and Engineering Engineering::Computer science and engineering; Lightweight Backbone Model; RGB-D RGB-D indoor scene parsing is a challenging task in computer vision. Conventional scene-parsing approaches based on manual feature extraction have proved inadequate in this area because indoor scenes are both unordered and complex. This study proposes a feature adaptive selection and fusion lightweight network (FASFLNet) for RGB-D indoor scene parsing that is both efficient and accurate. The proposed FASFLNet uses a lightweight classification network (MobileNetV2) as the backbone for feature extraction. This lightweight backbone ensures that FASFLNet is not only highly efficient but also performs well in feature extraction. The additional information provided by depth images (specifically, spatial cues such as the shape and scale of objects) supplements the RGB stream through feature-level adaptive fusion between the RGB and depth streams. Furthermore, during decoding, the features of different layers are fused from top to bottom and integrated at different layers for the final pixel-level classification, producing an effect similar to that of pyramid supervision. Experimental results obtained on the NYU V2 and SUN RGB-D datasets indicate that the proposed FASFLNet outperforms existing state-of-the-art models while remaining highly efficient and accurate. Published version. This work was funded by the National Natural Science Foundation of China (61502429). 2023-10-26T01:06:57Z 2023-10-26T01:06:57Z 2023 Journal Article Qian, X., Lin, X., Yu, L. & Zhou, W. (2023). FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing. Optics Express, 31(5), 8029-8041.
https://dx.doi.org/10.1364/OE.480252 | ISSN: 1094-4087 | https://hdl.handle.net/10356/171471 | DOI: 10.1364/OE.480252 | PMID: 36859921 | Scopus: 2-s2.0-85149146081 | Vol. 31, Issue 5, pp. 8029-8041 | en | Optics Express | © 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement. | application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering; Lightweight Backbone Model; RGB-D |
spellingShingle |
Engineering::Computer science and engineering; Lightweight Backbone Model; RGB-D; Qian, Xiaohong; Lin, Xingyang; Yu, Lu; Zhou, Wujie; FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
description |
RGB-D indoor scene parsing is a challenging task in computer vision. Conventional scene-parsing approaches based on manual feature extraction have proved inadequate in this area because indoor scenes are both unordered and complex. This study proposes a feature adaptive selection and fusion lightweight network (FASFLNet) for RGB-D indoor scene parsing that is both efficient and accurate. The proposed FASFLNet uses a lightweight classification network (MobileNetV2) as the backbone for feature extraction. This lightweight backbone ensures that FASFLNet is not only highly efficient but also performs well in feature extraction. The additional information provided by depth images (specifically, spatial cues such as the shape and scale of objects) supplements the RGB stream through feature-level adaptive fusion between the RGB and depth streams. Furthermore, during decoding, the features of different layers are fused from top to bottom and integrated at different layers for the final pixel-level classification, producing an effect similar to that of pyramid supervision. Experimental results obtained on the NYU V2 and SUN RGB-D datasets indicate that the proposed FASFLNet outperforms existing state-of-the-art models while remaining highly efficient and accurate. |
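The feature-level adaptive fusion described in the abstract can be illustrated with a minimal sketch. The gating scheme below is a hypothetical stand-in (a per-channel sigmoid gate over paired RGB/depth responses), not the paper's actual fusion module:

```python
import math

def adaptive_fuse(rgb_feat, depth_feat):
    """Illustrative feature-level adaptive fusion: for each channel pair,
    a sigmoid gate decides how much the RGB versus the depth modality
    contributes to the fused feature. Hypothetical scheme for intuition only."""
    fused = []
    for r, d in zip(rgb_feat, depth_feat):
        # Stronger RGB response relative to depth -> larger RGB weight.
        gate = 1.0 / (1.0 + math.exp(-(r - d)))
        fused.append(gate * r + (1.0 - gate) * d)
    return fused

# Toy per-channel activations at one spatial location.
rgb = [0.9, 0.1, 0.5]
depth = [0.2, 0.8, 0.5]
print(adaptive_fuse(rgb, depth))
```

The fused value always lies between the two modality responses and leans toward the stronger one, which is the intuition behind letting depth cues (shape, scale) supplement the RGB stream adaptively rather than by fixed averaging.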
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Qian, Xiaohong; Lin, Xingyang; Yu, Lu; Zhou, Wujie |
format |
Article |
author |
Qian, Xiaohong; Lin, Xingyang; Yu, Lu; Zhou, Wujie |
author_sort |
Qian, Xiaohong |
title |
FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
title_short |
FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
title_full |
FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
title_fullStr |
FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
title_full_unstemmed |
FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing |
title_sort |
fasflnet: feature adaptive selection and fusion lightweight network for rgb-d indoor scene parsing |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/171471 |
_version_ |
1781793714692161536 |