Topological analysis of protein structures with statistical learning

The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from intervals generated is...

Bibliographic Details
Main Author: Lee, Si Xian
Other Authors: PUN Chi Seng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects:
Online Access: https://hdl.handle.net/10356/146104
Institution: Nanyang Technological University
id sg-ntu-dr.10356-146104
record_format dspace
spelling sg-ntu-dr.10356-146104 2023-02-28T23:13:20Z Topological analysis of protein structures with statistical learning Lee, Si Xian PUN Chi Seng Xia Kelin School of Physical and Mathematical Sciences cspun@ntu.edu.sg, xiakelin@ntu.edu.sg Science::Biological sciences::Molecular biology Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is a systematic binning approach that characterises topological features. These features are then fed into three types of statistical learning methods: SVM, tree-based methods and neural networks. The protein classification tasks are: classification of hemoglobin molecules in relaxed and taut forms (task 1), and identification of all-alpha, all-beta and alpha-beta protein domains, carried out on 450 and 900 protein samples (tasks 2 and 3 respectively). The modified tree-based approach showed surprisingly stable results and attained the highest overall accuracy, 93.3% and 87.8% for tasks 2 and 3 respectively. Bachelor of Science in Mathematical Sciences 2021-01-26T08:40:07Z 2021-01-26T08:40:07Z 2018 Final Year Project (FYP) https://hdl.handle.net/10356/146104 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Biological sciences::Molecular biology
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description The study of protein structure-function relationships has been a key focus in computational biology. A novel method of protein data analysis uses Persistent Homology Analysis (PHA) as a tool for protein classification. The main method of feature engineering from the generated intervals is a systematic binning approach that characterises topological features. These features are then fed into three types of statistical learning methods: SVM, tree-based methods and neural networks. The protein classification tasks are: classification of hemoglobin molecules in relaxed and taut forms (task 1), and identification of all-alpha, all-beta and alpha-beta protein domains, carried out on 450 and 900 protein samples (tasks 2 and 3 respectively). The modified tree-based approach showed surprisingly stable results and attained the highest overall accuracy, 93.3% and 87.8% for tasks 2 and 3 respectively.
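The binning step mentioned in the description can be illustrated with a minimal sketch. This record does not give the project's actual binning scheme, so everything here is an assumption: the function name `bin_intervals`, the choice of three statistics per bin (births, deaths, bars alive), the bin count and the toy intervals are all hypothetical, and the intervals are assumed to have finite death values.

```python
import numpy as np

def bin_intervals(intervals, max_value, n_bins):
    """Turn persistence intervals into a fixed-length feature vector.

    intervals: (birth, death) pairs for one homology dimension (assumed finite).
    Returns births per bin, deaths per bin and bars alive per bin, concatenated.
    This is an illustrative scheme, not necessarily the one used in the project.
    """
    bars = np.asarray(intervals, dtype=float).reshape(-1, 2)
    # Equal-width bins over the filtration range [0, max_value].
    edges = np.linspace(0.0, max_value, n_bins + 1)
    births, _ = np.histogram(bars[:, 0], bins=edges)
    deaths, _ = np.histogram(bars[:, 1], bins=edges)
    lo, hi = edges[:-1], edges[1:]
    # A bar counts as alive in a bin if it is born before the bin ends
    # and dies after the bin starts.
    alive = ((bars[:, [0]] < hi) & (bars[:, [1]] > lo)).sum(axis=0)
    return np.concatenate([births, deaths, alive])

# Toy barcode, invented for illustration only.
bars = [(0.0, 1.5), (0.5, 3.5), (2.2, 2.8)]
features = bin_intervals(bars, max_value=4.0, n_bins=4)
print(features)
```

Each protein then yields a vector of the same length regardless of how many intervals it has, which is what allows the vectors to be passed to a classifier such as an SVM, a tree-based model or a neural network.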
author2 PUN Chi Seng
author_facet PUN Chi Seng
Lee, Si Xian
format Final Year Project
author Lee, Si Xian
author_sort Lee, Si Xian
title Topological analysis of protein structures with statistical learning
publisher Nanyang Technological University
publishDate 2021
url https://hdl.handle.net/10356/146104
_version_ 1759854509412057088