Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles

Background and Objectives: Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage to dispense the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to a lot of human dise...

Bibliographic Details
Main Authors: Le, Nguyen Quoc Khanh, Huynh, Tuan-Tu, Yapp, Edward Kien Yee, Yeh, Hui-Yuan
Other Authors: School of Humanities
Format: Article
Language: English
Published: 2020
Subjects: Humanities::General; Clathrin Coated Pits; Convolutional Neural Network
Online Access: https://hdl.handle.net/10356/143870
Institution: Nanyang Technological University
id sg-ntu-dr.10356-143870
record_format dspace
spelling sg-ntu-dr.10356-143870 2020-09-28T06:58:06Z
Accepted version
2019
Journal Article
Le, N. Q. K., Huynh, T.-T., Yapp, E. K. Y., & Yeh, H.-Y. (2019). Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles. Computer Methods and Programs in Biomedicine, 177, 81–88. doi:10.1016/j.cmpb.2019.05.016
0169-2607
https://hdl.handle.net/10356/143870
10.1016/j.cmpb.2019.05.016
31319963
177
81
88
en
Computer methods and programs in biomedicine
© 2019 Elsevier B.V. All rights reserved. This paper was published in Computer Methods and Programs in Biomedicine and is made available with permission of Elsevier B.V.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Humanities::General
Clathrin Coated Pits
Convolutional Neural Network
spellingShingle Humanities::General
Clathrin Coated Pits
Convolutional Neural Network
Le, Nguyen Quoc Khanh
Huynh, Tuan-Tu
Yapp, Edward Kien Yee
Yeh, Hui-Yuan
Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles
description Background and Objectives: Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage that releases the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to many human diseases, including neurodegenerative disorders, cancer, and Alzheimer's disease. Therefore, creating a precise model to identify their functions is a crucial step towards understanding human diseases and designing drug targets. Methods: We present a deep learning model that uses a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high-throughput sequences. Because 2D CNNs traditionally take images as input, we treated each 20 × 20 PSSM profile as an image of 20 × 20 pixels. The PSSM profile was then fed into our 2D CNN, in which we varied a range of parameters to improve the performance of the model. A hyperparameter optimization process, based on 10-fold cross-validation results, was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the resulting model. Results: Our model identified clathrin proteins with a sensitivity of 92.2%, a specificity of 91.2%, an accuracy of 91.8%, and an MCC of 0.83 on the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all standard evaluation metrics. Conclusions: This study provides an effective tool for investigating clathrin proteins, and its results could promote the use of deep learning in biomedical research. The source code and dataset are freely available at https://www.github.com/khanhlee/deep-clathrin/.
author2 School of Humanities
author_facet School of Humanities
Le, Nguyen Quoc Khanh
Huynh, Tuan-Tu
Yapp, Edward Kien Yee
Yeh, Hui-Yuan
format Article
author Le, Nguyen Quoc Khanh
Huynh, Tuan-Tu
Yapp, Edward Kien Yee
Yeh, Hui-Yuan
author_sort Le, Nguyen Quoc Khanh
title Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles
publishDate 2020
url https://hdl.handle.net/10356/143870
_version_ 1681057981683204096
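
The abstract in this record describes two steps that a short sketch can make concrete: turning a PSSM profile into a 20 × 20 "image", and scoring predictions with sensitivity, specificity, accuracy, and MCC. The Python below is a minimal illustration only and is not taken from the authors' repository. The function names `pssm_to_20x20` and `binary_metrics` are hypothetical, the amino acid order is an assumed convention, and summing the PSSM rows of identical residues is one common way to obtain a fixed 20 × 20 matrix; the record does not state which reduction the authors used.

```python
import math

# Assumed residue order (the column order commonly seen in PSI-BLAST PSSM output).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm_to_20x20(sequence, pssm):
    """Collapse an L x 20 PSSM into a 20 x 20 matrix.

    Row i of the result is the sum of all PSSM rows whose residue is
    AMINO_ACIDS[i]. This reduction is an assumption made for illustration.
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    profile = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must have 20 scores")
        i = index.get(residue)
        if i is None:
            continue  # skip non-standard residues such as 'X'
        for j, score in enumerate(row):
            profile[i][j] += score
    return profile


def binary_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, accuracy, and MCC from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy three-residue example; the scores are made up, not real PSSM values.
    toy = pssm_to_20x20("AAC", [[1] * 20, [3] * 20, [5] * 20])
    print(toy[0][:3], toy[4][:3])
    # Hypothetical confusion counts, not the paper's results.
    print(binary_metrics(tp=90, fp=20, tn=80, fn=10))
```

The resulting 20 × 20 matrix has a fixed size regardless of sequence length, which is what lets it be passed to a 2D CNN as a single-channel image. In practice the summed scores are often scaled (for example by sequence length) before training; that step is omitted here.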