ACCURACY ENHANCEMENT OF AUTOMATIC ROAD EXTRACTION FROM SATELLITE IMAGES THROUGH VISION TRANSFORMER-BASED SYSTEM DEVELOPMENT
Main Author: | |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/77944 |
Institution: | Institut Teknologi Bandung |
Summary: | Automated road extraction techniques employing deep learning offer a
cost-effective and expeditious alternative to manual approaches, while surpassing
semi-automated methods in terms of efficacy. Nevertheless, these methods
still fall short of the accuracy required for practical real-world
applications. This study enhances the performance of the automated road
extraction process by introducing a novel multi-axis multi-scale attention
network for road extraction, with a primary focus on capturing extended
dependencies. The architecture comprises an encoder-decoder hierarchical
structure, featuring sequentially positioned sparse local attention and dilated
global attention, each accompanied by adjusted patch sizes at each stage.
Distinct dilation rates for grid attention are introduced in the shallower
network stage, effectively amplifying long-range dependencies. An implicit
inductive bias is integrated through conditional positional encoding in the
feed-forward network and relative positional bias in the attention model, to
enhance positional encoding efficiency and introduce a bias toward local patches. In
the decoding phase, a summation-based aggregation strategy is employed,
complemented by a more refined decoder, to facilitate intricate spatial
information recovery. The proposed model was validated experimentally
on the DeepGlobe dataset, yielding results comparable to those of
several state-of-the-art networks. Additionally, a comprehensive ablation
study was conducted, shedding light on the contributions of the modules embedded
in the architecture and offering insights into careful tuning strategies. |
---|
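
The summary pairs local attention with dilated grid attention to capture long-range dependencies. The sketch below is not the thesis's implementation. It is a minimal, pure-Python illustration, under assumed names (`block_groups`, `grid_groups`, `max_span` are hypothetical helpers) and an assumed 8 x 8 feature map, of how the two partitioning schemes decide which tokens attend to each other. With the same number of tokens per group, the dilated grid spans a larger distance than the local window.

```python
def block_groups(height, width, window):
    """Local attention: group tokens into non-overlapping window x window blocks."""
    groups = {}
    for i in range(height):
        for j in range(width):
            groups.setdefault((i // window, j // window), []).append((i, j))
    return list(groups.values())


def grid_groups(height, width, dilation):
    """Dilated grid attention: group tokens sampled every `dilation` positions."""
    groups = {}
    for i in range(height):
        for j in range(width):
            groups.setdefault((i % dilation, j % dilation), []).append((i, j))
    return list(groups.values())


def max_span(group):
    """Largest row or column distance between two tokens of one attention group."""
    rows = [i for i, _ in group]
    cols = [j for _, j in group]
    return max(max(rows) - min(rows), max(cols) - min(cols))


# Assumed toy setting: 8 x 8 feature map, 16 tokens per group in both schemes.
local = block_groups(8, 8, window=4)
dilated = grid_groups(8, 8, dilation=2)
print(len(local[0]), max_span(local[0]))      # tokens per group, spatial span
print(len(dilated[0]), max_span(dilated[0]))  # same group size, wider span
```

In this toy setting both schemes attend over 16 tokens per group, so the attention cost per group is the same, but the dilated groups reach across 6 positions instead of 3. Varying the dilation rate per stage, as the summary describes for the shallower stage, changes this trade-off between group size and spatial reach.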