Vision transformer as image fusion model

Vision transformers achieve state-of-the-art performance in vision tasks: the self-attention block is not limited to NLP tasks but also performs well in processing images. In this report, I investigated whether this performance can be further extended to more detailed tasks on images by combining...

Bibliographic Details
Main Author:	Zhao, Fengye
Other Authors:	Zinovi Rabinovich
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access:	https://hdl.handle.net/10356/166048
Institution:	Nanyang Technological University

Similar Items

Improving end-to-end transformer model architecture in ASR
by: Zhao, Yingzhu
Published: (2023)

Vision language representation learning
by: Yang, Xiaofeng
Published: (2023)

Car cabin surveillance using computer vision
by: Soegeng, Andrew Ivan
Published: (2022)

Data efficient learning for 3D computer vision
by: Wei, Jiacheng
Published: (2023)

Deep neural network compression for pixel-level vision tasks
by: He, Wei
Published: (2021)

Vision-based 3D human and hand pose analysis
by: Cai, Yujun
Published: (2021)

Robust and efficient deep learning methods for vision-based action recognition
by: Xu, Yuecong
Published: (2021)

Scene understanding based on heterogeneous data fusion
by: Ren, Haosu
Published: (2018)

Gold price prediction using transformers
by: Wong, Stanley Qi Ren
Published: (2023)

Face transformation using StyleGAN
by: Chua, Zhong En
Published: (2022)

Clustering and heterogeneous information fusion for social media theme discovery and associative mining
by: Meng, Lei
Published: (2015)

A robotic vision system with incorporation of artificial intelligence for industrial application
by: Yong, Inn Nam.
Published: (2009)

Sentiment detection with bidirectional encoder representations from transformers
by: Pyae, Hlian Moe
Published: (2021)

An empirical study on adaptation methods for large-scale vision-language models
by: Wang, Annan
Published: (2023)

Stock forecasting using transformers, an emerging machine learning technique
by: Seoh, Jun Yu
Published: (2022)

Multi-stream social-aware transformers for deterministic trajectory prediction
by: Chen, Xun
Published: (2024)

Multi-view fusion and machine learning in hand pose estimation from depth images
by: Ong, Bee Lee
Published: (2018)

Forgery localization in images
by: Nur Dilah Binte Zaini
Published: (2023)

When contrastive learning meets clustering : explore inter-image contrast for image representation learning
by: Li, Shenggui
Published: (2021)

Language models are domain-specific chart analysts
by: Zhao, Yinjie
Published: (2023)

Synthesizing data for multiclass image classification
by: Lee, Tian Fa
Published: (2020)

Neural logic vision language explainer
by: Yang, Xiaofeng, et al.
Published: (2023)

A survey on CNN transfer learning for image classification
by: Teo, Jia Sheng
Published: (2023)

Variational maximization-maximization of Bayesian mixture models and application to unsupervised image classification
by: Lim, Kart-Leong
Published: (2018)

HypLiLoc: towards effective LiDAR pose regression with hyperbolic fusion
by: Wang, Sijie, et al.
Published: (2023)

Transforming thermal comfort model and control in the tropics : a machine-learning approach
by: Hu, Weizheng
Published: (2020)

Advanced image understanding with deep learning in real-world applications
by: Shi, Yuxin
Published: (2020)

A comparative study of edge detection techniques for AI-based image recognition
by: Wang, Di
Published: (2020)

Reconstruction of 3D mesh from 2D image using deep learning
by: Lee, Wonn Jen
Published: (2022)

Image and multimedia processing using computational intelligence
by: Yap, Kim Hui.
Published: (2008)

ModelPS : an open-source and collaborative model edit platform with interactive transfer learning
by: Li, Yuanming
Published: (2021)

Deep CNN-LSTM supervised model and CNN self-supervised model for human activity recognition
by: Liao, Zixin
Published: (2023)

World model with PSR components
by: Tng, Jun Wei
Published: (2022)

Modelling and control of HVAC systems
by: Men, Bunnaroth
Published: (2022)

Modelling self-awareness in social robot
by: Zhang, Jiaheng
Published: (2020)

Explainable AI model for ECG signal assessment
by: Low, Stefanie Jing Ting
Published: (2023)

Laser beam attacks on lane detection models
by: Tay, Ryan Edward Siang An
Published: (2023)

Tiger detection using light and efficient models
by: Somdath Kshitij Agarwal
Published: (2023)

Time series prediction model for multiple applications
by: Yu, Ying Cheng
Published: (2023)

Distance metric learning for multi-modal image retrieval and annotation
by: Wu, Pengcheng
Published: (2014)