Machine learning methods for classification of COVID-19 exploiting infrared spectroscopy
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/173801 |
Institution: | Nanyang Technological University |
Summary: | COVID-19 spreads through diverse transmission routes, has a long incubation period, and can reach a large area in a short time, so rapid testing is crucial. In this dissertation, we develop machine learning methods for classifying the infrared spectra of COVID-19 pharyngeal swab samples. Because infrared spectral data are high-dimensional, key features are difficult to extract and the computational cost is high. Reducing the dimensionality of the original data through feature selection and feature transformation is therefore a key step. We select and compare dimensionality reduction methods on two batches of data and build COVID-19 detection models with machine learning methods. For the first batch, the competitive adaptive reweighted sampling-principal component analysis-support vector machine (CARS-PCA-SVM) model reduces the original dataset to 74 dimensions and achieves the best classification performance, with an accuracy of 83.33%, a sensitivity of 86.75%, and a specificity of 82.29%. In contrast, the genetic algorithm-support vector machine (GA-SVM) model achieves an accuracy of only 71.93%. For the second batch, the PCA-SVM model reduces the original spectral data to 36 dimensions and achieves the best classification performance, with an accuracy of 96.68%, a sensitivity of 95.61%, and a specificity of 97.69%. In contrast, the successive projections algorithm-support vector machine (SPA-SVM) model achieves an accuracy of only 88.44%. |
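The summary reports accuracy, sensitivity, and specificity for each model. As a reading aid, the sketch below shows how these three figures are conventionally derived from binary predictions. It is not the dissertation's code: the function name `binary_metrics`, the label encoding (1 = COVID-19 positive, 0 = negative), and the example labels are all hypothetical and chosen only for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumption: `positive` marks the COVID-19 positive class; any other
    label value is treated as negative.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Fraction of positive samples detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of negative samples correctly rejected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels for illustration only; these are not the thesis data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Because accuracy pools both classes, it can fall below or between the sensitivity and specificity depending on how many positive and negative samples the test set contains, which is why the three figures are reported together.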