SWIN transformer for diabetic retinopathy detection
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Subjects:
Online Access: https://hdl.handle.net/10356/165923
Institution: Nanyang Technological University
Summary: In the field of Machine Learning, Convolutional Neural Networks (CNNs) have long been dominant in image classification tasks. Transformer models were first introduced in 2017 for Natural Language Processing, and further development led to Vision Transformers (ViTs) for image classification. In the medical field, one of the many use cases of Artificial Intelligence is disease detection. Among eye diseases, Diabetic Retinopathy (DR) is a common condition for which CNNs have been used to aid detection and classification. While recent comparisons have shown that ViTs outperform CNNs on ImageNet, no such comparison has been done on a DR dataset. In this paper, we compare the performance of ViTs and their recent variants against CNNs on detecting DR using a single standardized dataset. The training dataset is obtained from Kaggle, and two separate external validation datasets are used. We demonstrate that the SWIN Transformer outperforms the other architectures on this problem.
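As a rough illustration of the kind of model the project evaluates, the sketch below fine-tunes an ImageNet-pretrained Swin Transformer for DR classification using the timm library. This is a minimal sketch, not the project's actual code: the specific model variant, hyperparameters, and the 5-class severity head (the grading scheme used by common Kaggle DR datasets) are assumptions.

```python
# Minimal sketch (assumed, not from the project): fine-tuning an
# ImageNet-pretrained Swin Transformer for DR grading with timm.
import timm
import torch
from torch import nn

model = timm.create_model(
    "swin_tiny_patch4_window7_224",  # assumed variant; any timm Swin model name works
    pretrained=True,                 # ImageNet-pretrained backbone
    num_classes=5,                   # 5 DR severity grades (0 = no DR ... 4 = proliferative)
)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# One illustrative training step on a dummy batch of 224x224 fundus images.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 5, (8,))
logits = model(images)
loss = criterion(logits, labels)
loss.backward()
optimizer.step()
```

In practice the same loop would be run over the Kaggle training set, with the two external datasets held out purely for validation, and the identical setup repeated for each CNN and ViT variant being compared.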