ACCURACY ENHANCEMENT OF AUTOMATIC ROAD EXTRACTION FROM SATELLITE IMAGES THROUGH VISION TRANSFORMER-BASED SYSTEM DEVELOPMENT

Bibliographic Details
Main Author: Jogy Maratur Siburian, Arthur
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/77944
Institution: Institut Teknologi Bandung
Description
Summary: Automated road extraction techniques based on deep learning offer a cost-effective and fast alternative to manual approaches and outperform semi-automated methods. Nevertheless, these methods still fall short of the accuracy required for practical real-world applications. This study improves automated road extraction by introducing a novel multi-axis, multi-scale attention network designed primarily to capture long-range dependencies. The architecture is a hierarchical encoder-decoder in which sparse local attention and dilated global attention are applied sequentially, with the patch size adjusted at each stage. Distinct dilation rates for grid attention are introduced in the shallower network stage to strengthen long-range dependencies. An implicit inductive bias is integrated through conditional positional encoding in the feed-forward network and relative positional bias in the attention model, which makes positional encoding more efficient and introduces a bias toward local patches. In the decoding phase, a summation-based aggregation strategy, complemented by a more refined decoder, helps recover fine spatial detail. The proposed model was validated experimentally on the DeepGlobe dataset, yielding results comparable to those of several state-of-the-art networks. In addition, a comprehensive ablation study was conducted, clarifying the contribution of each embedded module in the architecture and offering guidance on tuning strategies.
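
The pairing of sparse local attention with dilated global (grid) attention described in the summary can be illustrated with a minimal sketch. It assumes the multi-axis design groups tokens the way MaxViT-style block and grid partitioning does, which the summary does not confirm. The function names `block_partition` and `grid_partition` and the 4 x 4 feature map are hypothetical, and the sketch only shows which token positions would attend to one another, not the attention computation itself.

```python
def block_partition(height, width, window):
    """Group token coordinates into non-overlapping window x window blocks.

    Tokens in one group are spatial neighbours, so attention restricted
    to a group is local. (Assumed MaxViT-style partitioning.)
    """
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r // window, c // window), []).append((r, c))
    return list(groups.values())


def grid_partition(height, width, grid):
    """Group token coordinates into sets of grid x grid evenly strided tokens.

    Tokens in one group are spread across the whole map with a fixed
    stride (a dilation), so attention within a group is sparse but global.
    """
    stride_r, stride_c = height // grid, width // grid
    groups = {}
    for r in range(height):
        for c in range(width):
            groups.setdefault((r % stride_r, c % stride_c), []).append((r, c))
    return list(groups.values())


if __name__ == "__main__":
    # Hypothetical 4 x 4 feature map, window size 2 and grid size 2.
    print("local block :", block_partition(4, 4, 2)[0])
    print("dilated grid:", grid_partition(4, 4, 2)[0])
```

In this sketch a larger stride plays the role of a larger dilation rate: the same number of tokens per group covers a wider area, which is one way to read the summary's claim that dilation strengthens long-range dependencies.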