Persistent-homology-based machine learning: a survey and a comparative study
A suitable feature representation that can both preserve the data intrinsic information and reduce data complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplific...
Main Authors: | Pun, Chi Seng; Lee, Si Xian; Xia, Kelin |
---|---|
Other Authors: | School of Physical and Mathematical Sciences |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | Science::Mathematics; Persistent Homology; Machine Learning |
Online Access: | https://hdl.handle.net/10356/161923 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-161923 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-161923 2022-09-26T06:46:14Z Persistent-homology-based machine learning: a survey and a comparative study Pun, Chi Seng Lee, Si Xian Xia, Kelin School of Physical and Mathematical Sciences Science::Mathematics Persistent Homology Machine Learning A suitable feature representation that can both preserve the data intrinsic information and reduce data complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplification and intrinsic structure characterization, and has been applied to various areas successfully. However, the combination of PH and machine learning has been hindered greatly by three challenges, namely topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progress has been made on all three problems, but the results are widely scattered across the literature. In this paper, we provide a systematic review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasis is on the recent development of mathematical models and tools, including PH software and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can work as a roadmap for the practical application of PH-based machine learning tools. Further, we compare two types of simplicial complexes (alpha and Vietoris-Rips complexes), two types of feature extractions (barcode statistics and binned features), and three types of machine learning models (support vector machines, tree-based models, and neural networks), and investigate their impact on protein secondary structure classification. 
Ministry of Education (MOE) Nanyang Technological University This research is partially supported by Nanyang Technological University Startup Grants M4081840 and M4081842, Data Science and Artificial Intelligence Research Centre@NTU M4082115, and Singapore Ministry of Education Academic Research Fund Tier 1 RG109/19, Tier 2 MOE2018-T2-1-033 and MOE-T2EP20120-0013. 2022-09-26T06:46:14Z 2022-09-26T06:46:14Z 2022 Journal Article Pun, C. S., Lee, S. X. & Xia, K. (2022). Persistent-homology-based machine learning: a survey and a comparative study. Artificial Intelligence Review, 55(7), 5169-5213. https://dx.doi.org/10.1007/s10462-022-10146-z 0269-2821 https://hdl.handle.net/10356/161923 10.1007/s10462-022-10146-z 2-s2.0-85124822510 7 55 5169 5213 en M4081840 M4081842 M4082115 RG109/19 MOE2018-T2-1-033 MOE-T2EP20120-0013 Artificial Intelligence Review © 2022 The Author(s), under exclusive licence to Springer Nature B.V. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Science::Mathematics Persistent Homology Machine Learning |
spellingShingle |
Science::Mathematics Persistent Homology Machine Learning Pun, Chi Seng Lee, Si Xian Xia, Kelin Persistent-homology-based machine learning: a survey and a comparative study |
description |
A suitable feature representation that can both preserve the data intrinsic information and reduce data complexity and dimensionality is key to the performance of machine learning models. Deeply rooted in algebraic topology, persistent homology (PH) provides a delicate balance between data simplification and intrinsic structure characterization, and has been applied to various areas successfully. However, the combination of PH and machine learning has been hindered greatly by three challenges, namely topological representation of data, PH-based distance measurements or metrics, and PH-based feature representation. With the development of topological data analysis, progress has been made on all three problems, but the results are widely scattered across the literature. In this paper, we provide a systematic review of PH and PH-based supervised and unsupervised models from a computational perspective. Our emphasis is on the recent development of mathematical models and tools, including PH software and PH-based functions, feature representations, kernels, and similarity models. Essentially, this paper can work as a roadmap for the practical application of PH-based machine learning tools. Further, we compare two types of simplicial complexes (alpha and Vietoris-Rips complexes), two types of feature extractions (barcode statistics and binned features), and three types of machine learning models (support vector machines, tree-based models, and neural networks), and investigate their impact on protein secondary structure classification. |
author2 |
School of Physical and Mathematical Sciences |
author_facet |
School of Physical and Mathematical Sciences Pun, Chi Seng Lee, Si Xian Xia, Kelin |
format |
Article |
author |
Pun, Chi Seng Lee, Si Xian Xia, Kelin |
author_sort |
Pun, Chi Seng |
title |
Persistent-homology-based machine learning: a survey and a comparative study |
title_short |
Persistent-homology-based machine learning: a survey and a comparative study |
title_full |
Persistent-homology-based machine learning: a survey and a comparative study |
title_fullStr |
Persistent-homology-based machine learning: a survey and a comparative study |
title_full_unstemmed |
Persistent-homology-based machine learning: a survey and a comparative study |
title_sort |
persistent-homology-based machine learning: a survey and a comparative study |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/161923 |
_version_ |
1745574638619983872 |