HFENet: hybrid feature encoder network for detecting salient objects in RGB-thermal images

Deep convolutional neural networks (CNNs) have gained prominence in computer vision applications, including RGB salient object detection (SOD), owing to the advancements in deep learning. Nevertheless, the majority of deep CNNs employ either VGGNet or ResNet as their backbone architecture for extrac...

Bibliographic Details
Main Authors:	Sun, Fan, Zhou, Wujie, Yan, Weiqing, Zhang, Yulai
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2024
Subjects:	Computer and Information Science RGB image Thermal image
Online Access:	https://hdl.handle.net/10356/180263
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-180263
record_format	dspace
spelling	sg-ntu-dr.10356-180263 2024-09-25T07:16:26Z HFENet: hybrid feature encoder network for detecting salient objects in RGB-thermal images Sun, Fan Zhou, Wujie Yan, Weiqing Zhang, Yulai School of Computer Science and Engineering Computer and Information Science RGB image Thermal image Deep convolutional neural networks (CNNs) have gained prominence in computer vision applications, including RGB salient object detection (SOD), owing to the advancements in deep learning. Nevertheless, the majority of deep CNNs employ either VGGNet or ResNet as their backbone architecture for extracting image information. This approach may lead to the following problems. 1) Variations between imaging modalities during feature extraction across layers. Cross-modal features across layers are often fused in a single step, resulting in inadequate cross-modal feature extraction. 2) Feature long-range dependence problem in multilayer feature decoding. 3) Image boundary blurring. To address these issues, we initially leverage the advantages offered by the VGGNet and ResNet architectures. Additionally, we present a novel hybrid VGG–ResNet feature encoder for RGB-T SOD. Specifically, we introduce a geometry information aggregation module that effectively combines and enhances the VGGNet spatial features of the RGB-T modalities from the bottom to the top. Moreover, we propose an innovative global saliency perception module that progressively refines the ResNet semantic features from the top to the bottom by integrating both local and global information. Furthermore, we introduce a Pearson-gated module to tackle the challenge of long-range dependence between features. This module utilizes gating to merge features by calculating the Pearson correlation coefficients of the fused features at multiple levels. Lastly, we devise an edge-aware module to precisely learn the contours of salient objects, thereby enhancing the clarity of the object boundaries. Extensive experiments conducted on three RGB-T SOD benchmarks demonstrate that our proposed network surpasses the performance of state-of-the-art methods for SOD. This work was supported by the National Natural Science Foundation of China (grant no. 62371422). 2024-09-25T07:16:26Z 2024-09-25T07:16:26Z 2024 Journal Article Sun, F., Zhou, W., Yan, W. & Zhang, Y. (2024). HFENet: hybrid feature encoder network for detecting salient objects in RGB-thermal images. Digital Signal Processing, 148, 104439. https://dx.doi.org/10.1016/j.dsp.2024.104439 1051-2004 https://hdl.handle.net/10356/180263 10.1016/j.dsp.2024.104439 2-s2.0-85186511629 148 104439 en Digital Signal Processing © 2024 Elsevier Inc. All rights reserved.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science RGB image Thermal image
spellingShingle	Computer and Information Science RGB image Thermal image Sun, Fan Zhou, Wujie Yan, Weiqing Zhang, Yulai HFENet: hybrid feature encoder network for detecting salient objects in RGB-thermal images
description	Deep convolutional neural networks (CNNs) have gained prominence in computer vision applications, including RGB salient object detection (SOD), owing to the advancements in deep learning. Nevertheless, the majority of deep CNNs employ either VGGNet or ResNet as their backbone architecture for extracting image information. This approach may lead to the following problems. 1) Variations between imaging modalities during feature extraction across layers. Cross-modal features across layers are often fused in a single step, resulting in inadequate cross-modal feature extraction. 2) Feature long-range dependence problem in multilayer feature decoding. 3) Image boundary blurring. To address these issues, we initially leverage the advantages offered by the VGGNet and ResNet architectures. Additionally, we present a novel hybrid VGG–ResNet feature encoder for RGB-T SOD. Specifically, we introduce a geometry information aggregation module that effectively combines and enhances the VGGNet spatial features of the RGB-T modalities from the bottom to the top. Moreover, we propose an innovative global saliency perception module that progressively refines the ResNet semantic features from the top to the bottom by integrating both local and global information. Furthermore, we introduce a Pearson-gated module to tackle the challenge of long-range dependence between features. This module utilizes gating to merge features by calculating the Pearson correlation coefficients of the fused features at multiple levels. Lastly, we devise an edge-aware module to precisely learn the contours of salient objects, thereby enhancing the clarity of the object boundaries. Extensive experiments conducted on three RGB-T SOD benchmarks demonstrate that our proposed network surpasses the performance of state-of-the-art methods for SOD.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering Sun, Fan Zhou, Wujie Yan, Weiqing Zhang, Yulai
format	Article
author	Sun, Fan Zhou, Wujie Yan, Weiqing Zhang, Yulai
author_sort	Sun, Fan
title	HFENet: hybrid feature encoder network for detecting salient objects in RGB-thermal images
title_sort	hfenet: hybrid feature encoder network for detecting salient objects in rgb-thermal images
publishDate	2024
url	https://hdl.handle.net/10356/180263
_version_	1814047422967447552
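
The description above mentions a Pearson-gated module that merges features using Pearson correlation coefficients. The record does not give the module's actual design, so the following is only a minimal sketch of the general idea, under stated assumptions: features are flat lists of numbers, the gate is the correlation rescaled to [0, 1], and the merge rule (low + gate * high) is hypothetical. The function names `pearson` and `pearson_gated_merge` are illustrative and do not come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant feature has zero variance; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def pearson_gated_merge(low, high):
    """Merge two feature vectors with a correlation-derived gate.

    Assumption (not from the paper): the gate rescales r from [-1, 1]
    to [0, 1], and the merged feature is low + gate * high.
    """
    r = pearson(low, high)
    gate = (r + 1.0) / 2.0
    return [a + gate * b for a, b in zip(low, high)]


if __name__ == "__main__":
    low_level = [1.0, 2.0, 3.0]
    high_level = [2.0, 4.0, 6.0]
    print(pearson_gated_merge(low_level, high_level))
```

In this sketch, strongly correlated features pass through the gate almost unchanged, while anti-correlated features are suppressed. A real network would compute this per channel or per spatial location on tensors and learn additional parameters.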

