EXPRESSION PROFILE AND DIFFERENTIAL ANALYSIS OF SOYBEANS (GLYCINE MAX) RNA-SEQ USING K-MEANS CLUSTERING APPROACH ON VARIOUS SEED DEVELOPMENTAL STAGES
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Subjects: | |
Online Access: | https://digilib.itb.ac.id/gdl/view/62347 |
Institution: | Institut Teknologi Bandung |
Summary: | Soybean (Glycine max) is a legume of high commercial value because of its
high protein and lipid content. Besides being an important staple food, soybean is used to
produce oil. It is therefore important to study its gene expression during seed
development and how that knowledge can benefit the uses of soybean. This study
aims to determine the expression profiles and pathways that change across the
developmental stages, based on k-means clustering analysis. The analysis used RNA-Seq
samples from 5 developmental stages, with 2 replicates each: globular
stage (GS), heart stage (HS), cotyledon stage (CS), maturation stage (MS), and dry seed (DS).
RNA-Seq analysis began with quality control of the reads using fastp (v0.20.1), alignment using
HISAT2 (v2.2.1), and assessment of alignment quality using Picard (v2.18.2.2). Reads
were counted using HTSeq-count (v0.9.1); k-means clustering, enrichment,
and differential expression analyses were then performed using DESeq2 on the iDEP website (v0.93). The results
showed 4 gene clusters (A, B, C, D) with distinct expression
dynamics. Cluster A included the protein processing, galactose metabolism, and glutathione
metabolism pathways, whose expression increased gradually across the developmental stages (GS to DS). Cluster
B consisted of the diterpenoid biosynthesis pathway, whose expression peaked in MS. Cluster C
comprised 10 pathways whose expression decreased, most notably flavonoid biosynthesis,
which decreased significantly in DS. Cluster D, which included linoleic acid
metabolism, also decreased in expression gradually across the seed developmental stages.
|
---|
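
The clustering step described in the summary can be illustrated with a minimal sketch. It is not the iDEP implementation, whose normalization and k-means settings are not stated in this record. The gene names and counts below are invented for illustration, each gene is z-scored across the five stages so that clusters reflect the shape of the profile rather than its magnitude, and a deterministic farthest-first initialization is assumed in place of the random starts that k-means usually uses.

```python
import math

STAGES = ["GS", "HS", "CS", "MS", "DS"]

# Hypothetical per-stage mean counts; gene names and values are invented.
genes = {
    "rise_1": [1, 2, 3, 4, 5],
    "rise_2": [10, 22, 30, 41, 50],
    "peakMS_1": [2, 2, 3, 9, 3],
    "peakMS_2": [20, 21, 30, 88, 31],
    "dropDS_1": [8, 8, 8, 8, 1],
    "dropDS_2": [80, 79, 81, 80, 12],
    "decline_1": [5, 4, 3, 2, 1],
    "decline_2": [50, 41, 30, 19, 10],
}


def standardize(profile):
    """Z-score one gene across stages (population standard deviation)."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / len(profile))
    if sd == 0:
        return [0.0] * len(profile)
    return [(x - mean) / sd for x in profile]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when labels are stable."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


names = list(genes)
points = [standardize(genes[n]) for n in names]
labels, centroids = kmeans(points, k=4)

for name, lab in zip(names, labels):
    print(name, "-> cluster", lab)
```

With k set to 4, as in the study, the toy genes that share a trajectory (gradual rise, peak at MS, sharp drop at DS, gradual decline) end up in the same cluster. On real count data the number of clusters and the normalization would need to be chosen and checked rather than assumed.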