Self-supervised learning for early detection of neurodegenerative diseases with small data

Neurodegenerative diseases are one of the leading causes of disability in the world. They are chronic diseases where patients experience irreversible depletion of neurons in the brain. The most common neurodegenerative diseases are Alzheimer’s disease (AD) and Parkinson’s disease (PD), where patient...

Full description

Bibliographic Details
Main Author:	Jiang, Hongchao
Other Authors:	Miao Chun Yan
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/166402
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-166402
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering
description	Neurodegenerative diseases are among the leading causes of disability worldwide. They are chronic diseases in which patients experience irreversible depletion of neurons in the brain. The most common neurodegenerative diseases are Alzheimer’s disease (AD) and Parkinson’s disease (PD), in which patients suffer from cognitive and motor deficiencies, respectively. There is currently no cure, and the condition progressively deteriorates, affecting activities of daily living. Therefore, early intervention is crucial to alleviating symptoms and improving quality of life. Automated diagnosis tools have emerged as a viable option for diagnosing patients more frequently and objectively. However, data acquisition is particularly challenging for reasons such as the rarity of a disease or privacy legislation. A model trained from scratch on small data is prone to overfitting on trivial patterns. Pre-training is a promising approach in which the model learns general knowledge from readily available data before being fine-tuned on the small data to learn task-specific knowledge. Supervised pre-training requires labeled data, which is costly when expert annotators (e.g., clinicians) are involved. Self-supervised pre-training, on the other hand, uses unlabeled data. In this thesis, we explore how pre-training can be used to improve the early detection of neurodegenerative diseases in both clinical and non-clinical settings. Early detection in a non-clinical setting involves a variety of biomarkers collected from wearables or mobile devices. A large number of applications (e.g., mobile-based assessments, serious games) have been developed to encourage more frequent testing. Maximizing the utility of these applications requires their prompt deployment, but collecting sufficient data to train the underlying models is often challenging. We propose an approach that does not involve any data collection.
We developed a mobile-based clock drawing test that enables patients to perform the clinical cognitive assessment task at home in an automated manner. Specifically, synthetic clock drawings are generated to pre-train an object detection model for detecting hand-drawn clock components. A rule-based classifier then scores the quality of the drawn clock according to clinical assessment criteria. We also explore non-clinical applications for PD, such as a mobile-based gait assessment application. Instead of a rule-based approach, we propose a general data-driven framework that is more scalable. We leverage the fact that assessment tasks are designed to amplify differences from healthy subjects. Therefore, we can formulate the problem as an anomaly detection task and model how healthy a data sample is. We assume that large amounts of data can be collected from healthy subjects, which is feasible given the pervasiveness of smartphones and crowdsourcing techniques. We first pre-train the model on data samples from healthy subjects using self-supervised learning to obtain a good feature extractor. In the fine-tuning stage, we pull the pre-trained features close together in latent space to extract common patterns representative of healthiness. Our approach is able to discriminate between healthy and PD samples despite not seeing any PD samples during training. In a clinical setting, biomarkers are less varied (e.g., brain scans, neuropsychological tests). Both labeled and unlabeled data are available for pre-training or fine-tuning, as hospitals typically keep some form of clinical records. However, acquiring more data is not as easy as in the non-clinical setting. For example, acquiring large numbers of MRI scans is cost-prohibitive. Therefore, the challenge is to maximize performance with only a small pre-training dataset.
We study this problem in the context of differentiating 3D MRI scans from mild cognitive impairment and prodromal AD subjects. We propose a hybrid approach of self-supervised pre-training followed by multitask learning to make effective use of labeled and unlabeled MRI scans from the AD spectrum. Recent work has shown that self-supervised pre-training mainly learns low-level features and struggles to learn high-level features, which are crucial for identifying anatomical atrophy patterns in MRI data. To address this limitation, we propose an Anatomy-Aware Gating Network (AAGN) that directly encodes knowledge of brain anatomy as a form of inductive bias in the model. AAGN outperforms self-supervised learning methods when trained from scratch on small data.
author2	Miao Chun Yan
format	Thesis-Doctor of Philosophy
author	Jiang, Hongchao
title	Self-supervised learning for early detection of neurodegenerative diseases with small data
publisher	Nanyang Technological University
publishDate	2023
url	https://hdl.handle.net/10356/166402
_version_	1765213863005913088
spelling	sg-ntu-dr.10356-166402 2023-05-02T06:33:01Z Self-supervised learning for early detection of neurodegenerative diseases with small data Jiang, Hongchao Miao Chun Yan Interdisciplinary Graduate School (IGS) Alibaba-NTU Singapore Joint Research Institute (JRI) ASCYMiao@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2023-04-27T06:13:55Z 2023-04-27T06:13:55Z 2023 Thesis-Doctor of Philosophy Jiang, H. (2023). Self-supervised learning for early detection of neurodegenerative diseases with small data. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166402 10.32657/10356/166402 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University