Multiresolution persistent homology for excessively large biomolecular datasets
Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of inte...
Main Authors: | Xia, Kelin; Zhao, Zhixiong; Wei, Guo-Wei |
---|---|
Other Authors: | School of Physical and Mathematical Sciences |
Format: | Article |
Language: | English |
Published: | 2016 |
Subjects: | Proteins; Multiscale methods |
Online Access: | https://hdl.handle.net/10356/82117 http://hdl.handle.net/10220/41115 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-82117 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-82117 2023-02-28T19:32:26Z Published version 2016-08-10T05:54:20Z 2019-12-06T14:46:59Z 2015 Journal Article Xia, K., Zhao, Z., & Wei, G.-W. (2015). Multiresolution persistent homology for excessively large biomolecular datasets. The Journal of Chemical Physics, 143(13), 134103-. ISSN 0021-9606 https://hdl.handle.net/10356/82117 http://hdl.handle.net/10220/41115 doi:10.1063/1.4931733 PMID 26450288 en The Journal of Chemical Physics © 2015 American Institute of Physics. This paper was published in The Journal of Chemical Physics and is made available as an electronic reprint (preprint) with permission of American Institute of Physics. The published version is available at: [http://dx.doi.org/10.1063/1.4931733]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law. 12 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Proteins; Multiscale methods |
description |
Although persistent homology has emerged as a promising tool for the topological simplification of complex data, it is computationally intractable for large datasets. We introduce multiresolution persistent homology to handle excessively large datasets. We match the resolution with the scale of interest so as to represent large scale datasets with appropriate resolution. We utilize flexibility-rigidity index to access the topological connectivity of the data set and define a rigidity density for the filtration analysis. By appropriately tuning the resolution of the rigidity density, we are able to focus the topological lens on the scale of interest. The proposed multiresolution topological analysis is validated by a hexagonal fractal image which has three distinct scales. We further demonstrate the proposed method for extracting topological fingerprints from DNA molecules. In particular, the topological persistence of a virus capsid with 273 780 atoms is successfully analyzed which would otherwise be inaccessible to the normal point cloud method and unreliable by using coarse-grained multiscale persistent homology. The proposed method has also been successfully applied to the protein domain classification, which is the first time that persistent homology is used for practical protein domain analysis, to our knowledge. The proposed multiresolution topological method has potential applications in arbitrary data sets, such as social networks, biological networks, and graphs. |
author2 |
School of Physical and Mathematical Sciences |
format |
Article |
author |
Xia, Kelin; Zhao, Zhixiong; Wei, Guo-Wei |
author_sort |
Xia, Kelin |
title |
Multiresolution persistent homology for excessively large biomolecular datasets |
publishDate |
2016 |
url |
https://hdl.handle.net/10356/82117 http://hdl.handle.net/10220/41115 |
_version_ |
1759854180322770944 |