TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation

Automatic segmentation of medical images plays an important role in the diagnosis of diseases. On single-modal data, convolutional neural networks have demonstrated satisfactory performance. However, multi-modal data contains more information than single-modal data. Multi-mo...

Bibliographic Details
Main Authors:	LI, Xuejian, MA, Shiqiang, XU, Junhai, TANG, Jijun, HE, Shengfeng, GUO, Fei
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Feature-level fusion Local attention mechanism Medical image segmentation Multi-modal fusion Graphics and Human Computer Interfaces Health Information Technology
Online Access:	https://ink.library.smu.edu.sg/sis_research/8222 https://ink.library.smu.edu.sg/context/sis_research/article/9225/viewcontent/TranSiam_av.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9225
record_format	dspace
spelling	sg-smu-ink.sis_research-9225 2023-11-08T05:17:31Z TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation LI, Xuejian MA, Shiqiang XU, Junhai TANG, Jijun HE, Shengfeng GUO, Fei Automatic segmentation of medical images plays an important role in the diagnosis of diseases. On single-modal data, convolutional neural networks have demonstrated satisfactory performance. However, multi-modal data contains more information than single-modal data. Multi-modal data can be effectively used to improve the segmentation accuracy of regions of interest by analyzing both spatial and temporal information. In this study, we propose a dual-path segmentation model for multi-modal medical images, named TranSiam. Because the modalities differ significantly, TranSiam employs two parallel CNNs to extract modality-specific features. The two parallel CNNs extract detailed, local information in the low-level layer, and the Transformer layer extracts global information in the high-level layer. Finally, we fuse the features of the different modalities via a locality-aware aggregation block (LAA block) to establish associations between the features of the different modalities. The LAA block locates the region of interest and suppresses the influence of invalid regions on multi-modal feature fusion. TranSiam uses LAA blocks at each layer of the encoder to fully fuse multi-modal information at different scales. Extensive experiments on several multi-modal datasets show that TranSiam achieves satisfactory results. 
2024-03-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8222 info:doi/10.1016/j.eswa.2023.121574 https://ink.library.smu.edu.sg/context/sis_research/article/9225/viewcontent/TranSiam_av.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Feature-level fusion Local attention mechanism Medical image segmentation Multi-modal fusion Graphics and Human Computer Interfaces Health Information Technology
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Feature-level fusion Local attention mechanism Medical image segmentation Multi-modal fusion Graphics and Human Computer Interfaces Health Information Technology
spellingShingle	Feature-level fusion Local attention mechanism Medical image segmentation Multi-modal fusion Graphics and Human Computer Interfaces Health Information Technology LI, Xuejian MA, Shiqiang XU, Junhai TANG, Jijun HE, Shengfeng GUO, Fei TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
description	Automatic segmentation of medical images plays an important role in the diagnosis of diseases. On single-modal data, convolutional neural networks have demonstrated satisfactory performance. However, multi-modal data contains more information than single-modal data. Multi-modal data can be effectively used to improve the segmentation accuracy of regions of interest by analyzing both spatial and temporal information. In this study, we propose a dual-path segmentation model for multi-modal medical images, named TranSiam. Because the modalities differ significantly, TranSiam employs two parallel CNNs to extract modality-specific features. The two parallel CNNs extract detailed, local information in the low-level layer, and the Transformer layer extracts global information in the high-level layer. Finally, we fuse the features of the different modalities via a locality-aware aggregation block (LAA block) to establish associations between the features of the different modalities. The LAA block locates the region of interest and suppresses the influence of invalid regions on multi-modal feature fusion. TranSiam uses LAA blocks at each layer of the encoder to fully fuse multi-modal information at different scales. Extensive experiments on several multi-modal datasets show that TranSiam achieves satisfactory results.
format	text
author	LI, Xuejian MA, Shiqiang XU, Junhai TANG, Jijun HE, Shengfeng GUO, Fei
author_facet	LI, Xuejian MA, Shiqiang XU, Junhai TANG, Jijun HE, Shengfeng GUO, Fei
author_sort	LI, Xuejian
title	TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
title_short	TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
title_full	TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
title_fullStr	TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
title_full_unstemmed	TranSiam: Aggregating multi-modal visual features with locality for medical image segmentation
title_sort	transiam: aggregating multi-modal visual features with locality for medical image segmentation
publisher	Institutional Knowledge at Singapore Management University
publishDate	2024
url	https://ink.library.smu.edu.sg/sis_research/8222 https://ink.library.smu.edu.sg/context/sis_research/article/9225/viewcontent/TranSiam_av.pdf
_version_	1783955656518139904

Similar Items