Topological analysis of protein structures with statistical learning
The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is...
Main Author: | Lee, Si Xian |
---|---|
Other Authors: | PUN Chi Seng |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2021 |
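Subjects: | Science::Biological sciences::Molecular biology; Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |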
Online Access: | https://hdl.handle.net/10356/146104 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-146104 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-146104 2023-02-28T23:13:20Z Topological analysis of protein structures with statistical learning Lee, Si Xian PUN Chi Seng Xia Kelin School of Physical and Mathematical Sciences cspun@ntu.edu.sg, xiakelin@ntu.edu.sg Science::Biological sciences::Molecular biology Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is a systematic binning approach that characterises topological features. These features are then fed into three types of statistical learning methods: SVM, tree-based methods and neural networks. The protein classification tasks include the classification of hemoglobin molecules in relaxed and taut forms (task 1) and the identification of all-alpha, all-beta and alpha-beta protein domains, carried out on 450 and 900 protein samples (tasks 2 and 3 respectively). A modified tree-based approach gave surprisingly stable results and attained the highest overall accuracy, 93.3% and 87.8% for tasks 2 and 3 respectively. Bachelor of Science in Mathematical Sciences 2021-01-26T08:40:07Z 2021-01-26T08:40:07Z 2018 Final Year Project (FYP) https://hdl.handle.net/10356/146104 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Science::Biological sciences::Molecular biology Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
description |
The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is a systematic binning approach that characterises topological features. These features are then fed into three types of statistical learning methods: SVM, tree-based methods and neural networks. The protein classification tasks include the classification of hemoglobin molecules in relaxed and taut forms (task 1) and the identification of all-alpha, all-beta and alpha-beta protein domains, carried out on 450 and 900 protein samples (tasks 2 and 3 respectively). A modified tree-based approach gave surprisingly stable results and attained the highest overall accuracy, 93.3% and 87.8% for tasks 2 and 3 respectively. |
author2 |
PUN Chi Seng |
author_facet |
PUN Chi Seng Lee, Si Xian |
format |
Final Year Project |
author |
Lee, Si Xian |
author_sort |
Lee, Si Xian |
title |
Topological analysis of protein structures with statistical learning |
publisher |
Nanyang Technological University |
publishDate |
2021 |
url |
https://hdl.handle.net/10356/146104 |
_version_ |
1759854509412057088 |
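
The abstract describes turning persistent homology intervals into features by binning, but this record does not give the exact scheme. As a rough illustration only, the sketch below assumes one common choice: split the filtration range into equal-width bins and count how many bars overlap each bin. The function name `bin_intervals`, the parameters `max_value` and `n_bins`, and the sample barcode are all hypothetical and are not taken from the project.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (birth, death) of one persistence bar


def bin_intervals(intervals: List[Interval], max_value: float, n_bins: int) -> List[int]:
    """Count, for each equal-width bin on [0, max_value], the bars that overlap it."""
    width = max_value / n_bins
    features = []
    for i in range(n_bins):
        lo, hi = i * width, (i + 1) * width
        # A bar overlaps the bin [lo, hi) if it is born before hi and dies after lo.
        count = sum(1 for birth, death in intervals if birth < hi and death > lo)
        features.append(count)
    return features


# Hypothetical barcode; the values are made up for illustration.
bars = [(0.0, 1.5), (0.5, 0.9), (2.0, 3.5)]
features = bin_intervals(bars, max_value=4.0, n_bins=4)
print(features)  # [2, 1, 1, 1]
```

A fixed-length vector like this could then be passed to any of the classifier families named in the abstract (SVM, tree-based methods, neural networks). The bin count and the filtration range are tuning choices that this record does not specify.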