SWIN transformer for diabetic retinopathy detection


Full description

Bibliographic Details
Main Author: Ang, Elroy Wei Yong
Other Authors: Jagath C Rajapakse
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2023
Subjects:
Online Access: https://hdl.handle.net/10356/165923
Institution: Nanyang Technological University
Physical Description
Summary: In the field of Machine Learning, Convolutional Neural Networks (CNNs) have been dominant in image classification tasks. Transformer models were first introduced in 2017 for Natural Language Processing, and further development led to Vision Transformers (ViTs) for image classification. In the medical field, one of the many use cases of Artificial Intelligence is disease detection. Among eye diseases, Diabetic Retinopathy (DR) is a common condition for which CNNs have been used to aid detection and classification. While recent comparisons have shown that ViTs outperform CNNs on ImageNet, no such comparison has been performed on a DR dataset. In this paper, we compare the performance of ViTs and their recent variants against CNNs on detecting DR using a single standardized dataset. The training dataset is obtained from Kaggle, and two separate external validation datasets are used. We demonstrate that the SWIN Transformer outperforms the other architectures on this problem.
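The SWIN (Swin) Transformer named in the summary differs from a plain ViT chiefly in computing self-attention within local windows that are cyclically shifted between layers, rather than globally over all tokens. A minimal, illustrative sketch of that window partitioning and shift in pure Python (not the author's implementation; grid and window sizes are arbitrary examples):

```python
def window_partition(h, w, win):
    """Group the (row, col) token coordinates of an h x w feature map
    into non-overlapping win x win attention windows."""
    return [
        [(r, c) for r in range(top, top + win) for c in range(left, left + win)]
        for top in range(0, h, win)
        for left in range(0, w, win)
    ]

def cyclic_shift(coords, h, w, offset):
    """Cyclically shift coordinates, as in Swin's shifted-window step,
    so the next layer's windows straddle the previous windows' borders."""
    return [((r + offset) % h, (c + offset) % w) for (r, c) in coords]

# An 8x8 token grid with 4x4 windows -> 4 windows of 16 tokens each;
# attention is computed independently inside each window.
wins = window_partition(8, 8, 4)
print(len(wins), len(wins[0]))  # 4 16
```

Restricting attention to fixed-size windows makes the cost linear in image size instead of quadratic, while the shift restores cross-window information flow.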