Unraveling high-throughput demultiplexing techniques across multiple plant species
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/176353 |
Institution: | Nanyang Technological University |
Summary: | RNA sequencing (RNA-seq) is essential for understanding biological mechanisms in plants. RNA-seq samples are commonly pooled (multiplexed) for simultaneous sequencing. Traditional demultiplexing methods often rely on expensive barcode matching and are prone to collisions, that is, misidentification of samples caused by sequencing noise or inaccurate barcode assignment, especially in complex data. We therefore propose a cost-efficient demultiplexing method that can accommodate complex datasets. The method was tested on Arabidopsis thaliana, Brachypodium distachyon, and Oldenlandia corymbosa, with A. thaliana and B. distachyon subjected to dark stress treatment. Samples were pooled in various multiplex combinations. RNA-seq reads were aligned to a reference coding sequence (CDS) using HISAT2, and a multiplex CDS was built by concatenating the references of the three species. A strong correlation was observed, suggesting that the multiplex CDS can be used for subsequent comparative analysis. Control read counts were scaled according to the linear relationship observed between O. corymbosa gene read counts in the control and treatment groups within the multiplex ABO samples. Differentially expressed genes (DEGs) were identified using DESeq2 and a proposed differential gene expression analysis on the scaled control read counts. These results demonstrate a promising cost-efficient demultiplexing method capable of handling large and complex datasets without the need for barcoding. |
---|
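
The scaling step described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that gene IDs in the concatenated reference carry a species prefix (the `Ath_`, `Bdi_`, and `Oco_` prefixes are hypothetical), and that the linear relationship is fitted as a least-squares line through the origin. The summary does not state the exact regression form, so that choice is an assumption too.

```python
from typing import Dict

# Hypothetical ID prefixes marking the species each CDS record came from.
SPECIES_PREFIXES = ("Ath_", "Bdi_", "Oco_")


def split_by_species(counts: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Group per-gene read counts by the species prefix on the gene ID."""
    by_species: Dict[str, Dict[str, float]] = {p: {} for p in SPECIES_PREFIXES}
    for gene_id, n in counts.items():
        for prefix in SPECIES_PREFIXES:
            if gene_id.startswith(prefix):
                by_species[prefix][gene_id] = n
                break
    return by_species


def fit_scale(control: Dict[str, float], treatment: Dict[str, float]) -> float:
    """Least-squares slope through the origin: treatment ~ slope * control.

    Only genes present in both samples are used. Assumed fitting form,
    not confirmed by the source.
    """
    shared = control.keys() & treatment.keys()
    numerator = sum(control[g] * treatment[g] for g in shared)
    denominator = sum(control[g] ** 2 for g in shared)
    if denominator == 0:
        raise ValueError("no non-zero shared reference genes to fit on")
    return numerator / denominator


def scale_control(control: Dict[str, float], factor: float) -> Dict[str, float]:
    """Multiply every control read count by the fitted factor."""
    return {gene_id: n * factor for gene_id, n in control.items()}


# Toy counts, invented purely for illustration.
control_counts = {"Oco_g1": 10, "Oco_g2": 20, "Ath_g1": 5}
treatment_counts = {"Oco_g1": 20, "Oco_g2": 40, "Ath_g1": 50}

# Fit the factor on O. corymbosa genes only, then apply it to all control counts.
oco_control = split_by_species(control_counts)["Oco_"]
oco_treatment = split_by_species(treatment_counts)["Oco_"]
factor = fit_scale(oco_control, oco_treatment)
scaled_control = scale_control(control_counts, factor)
```

In this sketch the O. corymbosa genes act as the shared reference for the fit. The scaled control counts for the other species could then be compared with the treatment counts in a downstream differential expression step.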