ACCURACY ENHANCEMENT OF AUTOMATIC ROAD EXTRACTION FROM SATELLITE IMAGES THROUGH VISION TRANSFORMER-BASED SYSTEM DEVELOPMENT

Automated road extraction techniques employing deep learning offer a cost-effective and expeditious alternative to manual approaches, while surpassing semi-automated methods in terms of efficacy. Nevertheless, these methods still fall short of meeting the accuracy requirements for practica...

Bibliographic Details
Main Author:	Jogy Maratur Siburian, Arthur
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/77944
Institution:	Institut Teknologi Bandung

id	id-itb.:77944
timestamp	2023-09-15T10:36:29Z
topic	Very-high resolution imagery, vision transformer, conditional positional encoding, dilated window attention
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Automated road extraction techniques based on deep learning offer a cost-effective and fast alternative to manual approaches and outperform semi-automated methods. Nevertheless, these methods still fall short of the accuracy required for practical real-world applications. This study improves automated road extraction by introducing a novel multi-axis, multi-scale attention network with a primary focus on capturing long-range dependencies. The architecture is a hierarchical encoder-decoder structure featuring sequentially positioned sparse local attention and dilated global attention, with patch sizes adjusted at each stage. Distinct dilation rates for grid attention are introduced in the shallower network stage, strengthening long-range dependencies. An implicit inductive bias is integrated through conditional positional encoding in the feed-forward network and relative positional bias in the attention model, to make positional encoding more efficient and to introduce bias in local patches. In the decoding phase, a summation-based aggregation strategy, complemented by a more refined decoder, is used to recover fine spatial information. The proposed model was validated experimentally on the DeepGlobe dataset, yielding results comparable to those of several state-of-the-art networks. Additionally, a comprehensive ablation study was conducted, shedding light on the contributions of the modules embedded in the architecture and offering insights into tuning strategies.
format	Theses
author	Jogy Maratur Siburian, Arthur
title	ACCURACY ENHANCEMENT OF AUTOMATIC ROAD EXTRACTION FROM SATELLITE IMAGES THROUGH VISION TRANSFORMER-BASED SYSTEM DEVELOPMENT
url	https://digilib.itb.ac.id/gdl/view/77944
_version_	1822995565704642560
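
The abstract in this record mentions sequentially applied local attention and dilated global (grid) attention. The sketch below is only an illustration of how such token groupings are commonly formed in multi-axis attention designs; it is an assumption, not the thesis's implementation. The function names `window_groups` and `grid_groups` and the 4x4 feature map are hypothetical. Each returned group is the set of positions that would attend to one another.

```python
def window_groups(height, width, window):
    """Group positions into non-overlapping window x window blocks.

    Positions in the same block are neighbours, so attention inside
    a group is local. (Hypothetical helper for illustration.)
    """
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r // window, c // window), []).append((r, c))
    return list(groups.values())


def grid_groups(height, width, dilation):
    """Group positions that share the same offset modulo `dilation`.

    Positions in the same group are `dilation` steps apart and span the
    whole map, so attention inside a group is sparse but global.
    (Hypothetical helper for illustration.)
    """
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r % dilation, c % dilation), []).append((r, c))
    return list(groups.values())


if __name__ == "__main__":
    # A 4x4 feature map split both ways into four groups of four positions.
    print(window_groups(4, 4, window=2)[0])   # contiguous 2x2 block
    print(grid_groups(4, 4, dilation=2)[0])   # positions spread across the map
```

Under this assumption, a larger dilation in `grid_groups` spreads each group over a wider area at the same cost per group, which is one plausible reading of how distinct dilation rates could extend long-range dependencies.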

Similar Items