Unraveling high-throughput demultiplexing techniques across multiple plant species

RNA sequencing (RNA-seq) is essential for understanding biological mechanisms in plant biology. RNA-seq samples are pooled together (multiplexed) for simultaneous sequencing. Traditional demultiplexing methods often rely on expensive barcode matching, leading to collisions—misidentifications of samp...

Bibliographic Details
Main Author: Maitra, Ishani
Other Authors: Marek Mutwil
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Medicine, Health and Life Sciences; Demultiplexing
Online Access: https://hdl.handle.net/10356/176353
Institution: Nanyang Technological University
id sg-ntu-dr.10356-176353
record_format dspace
spelling sg-ntu-dr.10356-176353; 2024-05-20T15:33:06Z; Maitra, Ishani; Marek Mutwil; School of Biological Sciences; mutwil@ntu.edu.sg; Medicine, Health and Life Sciences; Demultiplexing; Bachelor's degree; 2024-05-17T13:31:29Z; 2024; Final Year Project (FYP); Maitra, I. (2024). Unraveling high-throughput demultiplexing techniques across multiple plant species. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176353; en; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Medicine, Health and Life Sciences
Demultiplexing
description RNA sequencing (RNA-seq) is essential for understanding biological mechanisms in plant biology. RNA-seq samples are often pooled (multiplexed) for simultaneous sequencing. Traditional demultiplexing methods often rely on expensive barcode matching, which is prone to collisions: misidentification of samples caused by sequencing noise or inaccurate barcode assignment, especially in complex data. We therefore propose a cost-efficient demultiplexing method that can accommodate complex datasets. The method was tested on Arabidopsis thaliana, Brachypodium distachyon, and Oldenlandia corymbosa, with A. thaliana and B. distachyon subjected to dark stress treatment. Samples were pooled in various multiplex combinations. RNA-seq reads were aligned to reference coding sequences (CDS) using HISAT2, and a multiplex CDS reference was built by concatenating the reference sequences of the three species. A strong correlation was observed, suggesting that the multiplex CDS can be used for subsequent comparative analysis. Control read counts were scaled according to the linear relationship observed between O. corymbosa gene read counts in the control and treatment groups within the multiplex ABO samples. Differentially expressed genes (DEGs) were precisely identified using DESeq2 and a proposed differential gene expression analysis on the scaled control read counts. The results demonstrate a promising, cost-efficient demultiplexing method capable of handling large and complex datasets without the need for barcoding.
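The barcode-free idea in the description can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline. It makes three assumptions that the record does not state: species prefixes (`Ath`, `Bdi`, `Oco`) are added to each CDS identifier before concatenation, read counts are split by that prefix after alignment, and the "linear relationship" is a least-squares slope through the origin fitted on O. corymbosa genes shared by the control and treatment pools. All function names are hypothetical.

```python
from collections import defaultdict


def tag_fasta(fasta_text, prefix):
    """Prefix every FASTA header so the species label survives concatenation."""
    tagged = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tagged.append(f">{prefix}|{line[1:]}")
        else:
            tagged.append(line)
    return "\n".join(tagged)


def build_multiplex_reference(fastas):
    """Concatenate per-species CDS FASTA texts (prefix -> text) into one reference."""
    return "\n".join(tag_fasta(text, prefix) for prefix, text in fastas.items())


def split_counts_by_species(counts):
    """Demultiplex per-gene read counts by the species prefix in each gene ID."""
    by_species = defaultdict(dict)
    for gene_id, n in counts.items():
        prefix, gene = gene_id.split("|", 1)
        by_species[prefix][gene] = n
    return dict(by_species)


def scaling_factor(control, treatment):
    """Slope of treatment counts on control counts, fitted through the origin.

    Assumed form of the linear relationship: treatment ~= k * control,
    estimated on genes present in both dictionaries.
    """
    shared = control.keys() & treatment.keys()
    sxx = sum(control[g] ** 2 for g in shared)
    sxy = sum(control[g] * treatment[g] for g in shared)
    if sxx == 0:
        raise ValueError("no shared genes with non-zero control counts")
    return sxy / sxx


def scale_counts(counts, k):
    """Multiply every read count by the scaling factor k."""
    return {gene: n * k for gene, n in counts.items()}
```

In use, the factor would be fitted on the `Oco` counts from a control pool and a treatment pool, then applied with `scale_counts` to the control counts of the other species before differential expression testing. HISAT2 alignment and DESeq2 themselves are outside this sketch.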
author2 Marek Mutwil
author_facet Marek Mutwil
Maitra, Ishani
format Final Year Project
author Maitra, Ishani
author_sort Maitra, Ishani
title Unraveling high-throughput demultiplexing techniques across multiple plant species
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/176353