On parsimony and clustering

This work is motivated by applications of parsimonious cladograms to the analysis of non-biological data. Parsimonious cladograms were introduced as a means to help understand the tree of life, and are now used in fields related to the biological sciences at large, e.g., to analyze viruses...

Bibliographic Details
Main Authors: Oggier, Frederique, Datta, Anwitaman
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Parsimony; Clustering
Online Access: https://hdl.handle.net/10356/169292
Institution: Nanyang Technological University
id sg-ntu-dr.10356-169292
record_format dspace
spelling sg-ntu-dr.10356-169292
2023-07-14T15:35:54Z
On parsimony and clustering
Oggier, Frederique
Datta, Anwitaman
School of Computer Science and Engineering
School of Physical and Mathematical Sciences
Engineering::Computer science and engineering
Parsimony
Clustering
Published version
2023-07-11T06:16:50Z
2023-07-11T06:16:50Z
2023
Journal Article
Oggier, F. & Datta, A. (2023). On parsimony and clustering. PeerJ Computer Science, 9, e1339-. https://dx.doi.org/10.7717/peerj-cs.1339
2376-5992
https://hdl.handle.net/10356/169292
10.7717/peerj-cs.1339
37346541
9
e1339
en
PeerJ Computer science
© 2023 Oggier and Datta. Distributed under Creative Commons CC-BY 4.0.
application/pdf
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic Engineering::Computer science and engineering
Parsimony
Clustering
description This work is motivated by applications of parsimonious cladograms to the analysis of non-biological data. Parsimonious cladograms were introduced as a means to help understand the tree of life, and are now used in fields related to the biological sciences at large, e.g., to analyze viruses or to predict the structure of proteins. We revisit parsimonious cladograms through the lens of clustering and compare cladograms optimized for parsimony with dendrograms obtained from single linkage hierarchical clustering. We show that, despite similarities between the two approaches, there exist datasets whose clustering dendrogram is incompatible with parsimony optimization. Furthermore, we provide numerical examples that use F-scores to compare the clusterings obtained from parsimonious cladograms with those obtained from single linkage hierarchical dendrograms.
_version_ 1772828336381231104