Transformers for computer vision
Transformer models, which are based on the self-attention mechanism, were initially introduced for natural language tasks. They require minimal inductive biases in their design and can be applied as individual processing layers in network design. In recent years, transformer models have been applied to pop...
Main Author: | Deng, Yaojun |
---|---|
Other Authors: | Wang Lipo |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University 2022 |
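Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |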
Online Access: | https://hdl.handle.net/10356/154659 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-154659 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-154659 2023-07-04T16:38:15Z Transformers for computer vision Deng, Yaojun Wang Lipo School of Electrical and Electronic Engineering ELPWang@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Transformer models, which are based on the self-attention mechanism, were initially introduced for natural language tasks. They require minimal inductive biases in their design and can be applied as individual processing layers in network design. In recent years, transformer models have been applied to popular computer vision (CV) tasks and have led to significant progress. Previous surveys introduced applications of transformers to different tasks (e.g., object detection, activity recognition, and image enhancement). In this dissertation, we focus on image classification and introduce several outstanding and representative improved vision transformer models. We conduct comparisons and simulations of transformer models and several representative convolutional neural network (CNN) models to illustrate the advantages and limitations of vision transformers in CV tasks. Master of Science (Signal Processing) 2022-01-03T07:35:26Z 2022-01-03T07:35:26Z 2021 Thesis-Master by Coursework Deng, Y. (2021). Transformers for computer vision. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/154659 https://hdl.handle.net/10356/154659 en ISM-DISS-02493 application/pdf Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
spellingShingle | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Deng, Yaojun Transformers for computer vision |
description | Transformer models, which are based on the self-attention mechanism, were initially introduced for natural language tasks. They require minimal inductive biases in their design and can be applied as individual processing layers in network design. In recent years, transformer models have been applied to popular computer vision (CV) tasks and have led to significant progress. Previous surveys introduced applications of transformers to different tasks (e.g., object detection, activity recognition, and image enhancement). In this dissertation, we focus on image classification and introduce several outstanding and representative improved vision transformer models. We conduct comparisons and simulations of transformer models and several representative convolutional neural network (CNN) models to illustrate the advantages and limitations of vision transformers in CV tasks. |
author2 | Wang Lipo |
author_facet | Wang Lipo Deng, Yaojun |
format | Thesis-Master by Coursework |
author | Deng, Yaojun |
author_sort | Deng, Yaojun |
title | Transformers for computer vision |
title_short | Transformers for computer vision |
title_full | Transformers for computer vision |
title_fullStr | Transformers for computer vision |
title_full_unstemmed | Transformers for computer vision |
title_sort | transformers for computer vision |
publisher | Nanyang Technological University |
publishDate | 2022 |
url | https://hdl.handle.net/10356/154659 |
_version_ | 1772827267067543552 |