Vision transformer as image fusion model
Vision transformers achieve state-of-the-art performance in vision tasks; the self-attention block is not limited to NLP tasks but also performs well in processing images. In this report, I investigated whether this performance can be extended to more detailed image tasks by combining...
Main Author: | Zhao, Fengye |
---|---|
Other Authors: | Zinovi Rabinovich |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Online Access: | https://hdl.handle.net/10356/166048 |
Institution: | Nanyang Technological University |
Similar Items
- Improving end-to-end transformer model architecture in ASR
  by: Zhao, Yingzhu
  Published: (2023)
- Vision language representation learning
  by: Yang, Xiaofeng
  Published: (2023)
- Car cabin surveillance using computer vision
  by: Soegeng, Andrew Ivan
  Published: (2022)
- Data efficient learning for 3D computer vision
  by: Wei, Jiacheng
  Published: (2023)
- Deep neural network compression for pixel-level vision tasks
  by: He, Wei
  Published: (2021)