Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing

Among molecular-based techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in clinical routine laboratories due to its simplicity, universality, efficacy, and affordabili...

Bibliographic Details
Main Author:	Langsiri N.
Other Authors:	Mahidol University
Format:	Article
Published:	2023
Subjects:	Agricultural and Biological Sciences
Online Access:	https://repository.li.mahidol.ac.th/handle/123456789/90009
Institution:	Mahidol University

id	th-mahidol.90009
record_format	dspace
spelling	th-mahidol.90009 2023-09-16T01:00:52Z Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing Langsiri N. Mahidol University Agricultural and Biological Sciences 2023-09-15T18:00:52Z 2023-09-15T18:00:52Z 2023-12-01 Article IMA Fungus Vol.14 No.1 (2023) 10.1186/s43008-023-00125-6 22106359 22106340 2-s2.0-85170059053 https://repository.li.mahidol.ac.th/handle/123456789/90009 SCOPUS
institution	Mahidol University
building	Mahidol University Library
continent	Asia
country	Thailand
content_provider	Mahidol University Library
collection	Mahidol University Institutional Repository
topic	Agricultural and Biological Sciences
spellingShingle	Agricultural and Biological Sciences Langsiri N. Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
description	Among molecular techniques for fungal identification, Sanger sequencing of the primary universal fungal DNA barcode, the internal transcribed spacer (ITS) region (ITS1, 5.8S, ITS2), is commonly used in routine clinical laboratories because of its simplicity, universality, efficacy, and affordability. However, Sanger sequencing fails to identify mixed ITS sequences in cases of mixed infection. To overcome this limitation, different high-throughput sequencing technologies have been explored. Nanopore-based technology is now one of the most promising long-read sequencing technologies on the market, as it has the potential to sequence the full-length ITS region in a single read. In this study, we established a workflow for species identification using sequences of the entire ITS region generated by nanopore sequencing of both pure yeast isolates and mock mixed-species read sets generated under different scenarios. The species used in this study included Candida albicans (n = 2), Candida tropicalis (n = 1), Nakaseomyces glabratus (formerly Candida glabrata) (n = 1), Trichosporon asahii (n = 2), Pichia kudriavzevii (formerly Candida krusei) (n = 1), and Cryptococcus neoformans (n = 1). Comparing various methods for generating the consensus sequence for fungal species identification, the results indicate that read clustering using a modified version of the NanoCLUST pipeline is more sensitive than Canu or VSEARCH, as it classified species accurately from a lower-abundance cluster of reads (3% abundance compared to 10% with VSEARCH). The modified NanoCLUST also reduced the number of classified clusters compared to VSEARCH, making the subsequent BLAST+ analysis faster. Subsampling, which reduced the size of the datasets approximately tenfold, did not significantly affect the identification results in terms of the identified species name, percent identity, query coverage, percentage of reads in the classified cluster, or number of clusters. The ability of the method to distinguish mixed species within sub-populations of large datasets has the potential to aid computational analysis by reducing the required processing power. The new sequence analysis pipeline presented here will facilitate better interpretation of fungal sequence data for species identification.
author2	Mahidol University
author_facet	Mahidol University Langsiri N.
format	Article
author	Langsiri N.
author_sort	Langsiri N.
title	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
title_short	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
title_full	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
title_fullStr	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
title_full_unstemmed	Targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
title_sort	targeted sequencing analysis pipeline for species identification of human pathogenic fungi using long-read nanopore sequencing
publishDate	2023
url	https://repository.li.mahidol.ac.th/handle/123456789/90009
_version_	1781414498672836608
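
The description above mentions two ideas that a small example may make more concrete: subsampling a read set to roughly one tenth of its size, and checking which read clusters remain above a given abundance. The following Python sketch illustrates only that arithmetic on toy data. It is not the modified NanoCLUST pipeline, Canu, or VSEARCH; the function names, the cluster labels, and the read counts are hypothetical, and the 3% and 10% values are borrowed from the description purely as illustrative cut-offs.

```python
import random
from collections import Counter


def subsample_reads(read_ids, fraction=0.1, seed=0):
    """Return a random subset of about `fraction` of the reads (no replacement)."""
    rng = random.Random(seed)
    reads = list(read_ids)
    k = max(1, round(len(reads) * fraction))
    return rng.sample(reads, k)


def cluster_abundance(cluster_of_read, read_ids):
    """Percentage of the given reads that fall into each cluster."""
    counts = Counter(cluster_of_read[r] for r in read_ids)
    total = sum(counts.values())
    return {cluster: 100.0 * n / total for cluster, n in counts.items()}


def clusters_above(abundance, min_percent):
    """Sorted labels of clusters whose abundance is at least `min_percent`."""
    return sorted(c for c, p in abundance.items() if p >= min_percent)


# Toy data (hypothetical): 1000 reads assigned to three clusters.
cluster_of_read = {}
for i in range(1000):
    if i < 900:
        cluster_of_read[f"read{i}"] = "A"   # 90% of reads
    elif i < 970:
        cluster_of_read[f"read{i}"] = "B"   # 7% of reads
    else:
        cluster_of_read[f"read{i}"] = "C"   # 3% of reads

all_reads = list(cluster_of_read)
full = cluster_abundance(cluster_of_read, all_reads)

# With a 3% cut-off all three clusters are kept; with 10% only the dominant one.
kept_at_3 = clusters_above(full, 3.0)
kept_at_10 = clusters_above(full, 10.0)

# Roughly tenfold subsampling, then the same abundance calculation.
subset = subsample_reads(all_reads, fraction=0.1, seed=42)
sub = cluster_abundance(cluster_of_read, subset)
```

On the full toy dataset a 3% cut-off keeps clusters A, B, and C, while a 10% cut-off keeps only A. Whether a low-abundance cluster survives subsampling depends on the random draw, which is why the sketch does not assume any particular result for `sub`.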

Similar Items