Structome: a tool for the rapid assembly of datasets for structural phylogenetics

Bibliographic Details
Main Authors: Malik, Ashar J., Langer, Desiree, Verma, Chandra Shekhar, Poole, Anthony M., Allison, Jane R.
Other Authors: School of Biological Sciences
Format: Article
Language: English
Published: 2024
Subjects: Medicine, Health and Life Sciences; Protein structures; Datasets
Online Access: https://hdl.handle.net/10356/173974
Institution: Nanyang Technological University
id sg-ntu-dr.10356-173974
record_format dspace
spelling sg-ntu-dr.10356-173974 2024-03-11T15:32:19Z
School of Biological Sciences
Bioinformatics Institute, A*STAR
National University of Singapore
Published version
2024-03-08T05:41:02Z
2023
Journal Article
Malik, A. J., Langer, D., Verma, C. S., Poole, A. M. & Allison, J. R. (2023). Structome: a tool for the rapid assembly of datasets for structural phylogenetics. Bioinformatics Advances, 3(1), vbad134.
https://dx.doi.org/10.1093/bioadv/vbad134
2635-0041
https://hdl.handle.net/10356/173974
38046099
2-s2.0-85180350270
en
Bioinformatics Advances
© 2023 The Author(s). Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Medicine, Health and Life Sciences
Protein structures
Datasets
description Protein structures carry signal of common ancestry and can therefore aid in reconstructing their evolutionary histories. To expedite the structure-informed inference process, a web server, Structome, has been developed that allows users to rapidly identify protein structures similar to a query protein and to assemble datasets useful for structure-based phylogenetics. Structome was created by clustering ∼94% of the structures in RCSB PDB using 90% sequence identity and representing each cluster by a centroid structure. Structure similarity between centroid proteins was calculated, and annotations from PDB, SCOP, and CATH were integrated. To illustrate utility, an H3 histone was used as a query, and results show that the protein structures returned by Structome span both sequence and structural diversity of the histone fold. Additionally, the pre-computed nexus-formatted distance matrix, provided by Structome, enables analysis of evolutionary relationships between proteins not identifiable using searches based on sequence similarity alone. Our results demonstrate that, beginning with a single structure, Structome can be used to rapidly generate a dataset of structural neighbours and allows deep evolutionary history of proteins to be studied.
author2 School of Biological Sciences
format Article
author Malik, Ashar J.
Langer, Desiree
Verma, Chandra Shekhar
Poole, Anthony M.
Allison, Jane R.
title Structome: a tool for the rapid assembly of datasets for structural phylogenetics