GENOMIC SELECTION USING MULTIPLE LINEAR REGRESSION WITH REGULARIZATION METHODS: RIDGE REGRESSION, LASSO, AND ELASTIC NET
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/76254 |
Institution: | Institut Teknologi Bandung |
Summary: | Natural resources play a crucial role in sustaining human life, but their
quality varies, and not all of them have the qualities that humans desire.
Advances in technology have produced breeding methods that yield biological
resources with desired qualities. One method used to predict the results of
breeding these resources is genomic selection. This method seeks the relationship
between an individual's observable phenotypes (characteristics) and its genetic values.
Genetic values are obtained by examining Single Nucleotide Polymorphism (SNP)
values, that is, the genetic values of an individual at specific positions.
Once the relationship between phenotype and genetics is known, the quality of a
desired phenotype can be predicted by manipulating an individual's genetic
values. Prediction uses multiple linear regression. To improve predictive
accuracy, the regularization techniques Ridge Regression, LASSO, and
Elastic Net are applied. In this Final Project, a model is built using an
open dataset from a journal publication on cotton quality in Australia.
The experiments show that the regularized models give the best
results for genomic selection. Predictions from data whose predictors and
observations were preprocessed are more accurate than predictions from data
that were not preprocessed. |
---|
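
The regularized regressions named in the summary can be illustrated with a minimal sketch. This is not the author's implementation and does not use the cotton dataset: the SNP matrix is synthetic, the function names (`soft_threshold`, `elastic_net`), the penalty parameterization, and the penalty strength are assumptions made for illustration, and centering stands in for the predictor and observation preprocessing, which the summary does not specify. A single coordinate-descent routine covers all three methods: `l1_ratio=0` gives Ridge Regression, `l1_ratio=1` gives LASSO, and values in between give Elastic Net.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink a scalar toward zero; values within the threshold become 0."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def elastic_net(X, y, lam, l1_ratio, n_iter=100):
    """Coordinate descent for the (assumed) objective
    (1/2n)||y - Xb||^2 + lam * (l1_ratio*||b||_1 + (1 - l1_ratio)/2 * ||b||_2^2).
    """
    n, p = X.shape
    beta = np.zeros(p)
    residual = y.copy()                    # equals y - X @ beta for beta = 0
    col_sq = (X ** 2).sum(axis=0) / n      # per-marker mean squared value
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:           # skip markers with no variation
                continue
            residual += X[:, j] * beta[j]  # remove marker j's contribution
            rho = X[:, j] @ residual / n
            beta[j] = soft_threshold(rho, lam * l1_ratio) / (
                col_sq[j] + lam * (1.0 - l1_ratio)
            )
            residual -= X[:, j] * beta[j]  # add the updated contribution back
    return beta


# Synthetic example: more SNP markers (p) than individuals (n).
rng = np.random.default_rng(0)
n, p = 80, 200
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # genotypes coded 0/1/2
true_beta = np.zeros(p)
true_beta[:10] = 1.0                                  # only 10 markers matter
y = X @ true_beta + rng.normal(0.0, 1.0, size=n)

# Assumed preprocessing: center predictors and observations.
X = X - X.mean(axis=0)
y = y - y.mean()

lam = 0.1  # hypothetical penalty strength; in practice chosen by cross-validation
beta_ridge = elastic_net(X, y, lam, l1_ratio=0.0)
beta_lasso = elastic_net(X, y, lam, l1_ratio=1.0)
beta_enet = elastic_net(X, y, lam, l1_ratio=0.5)

for name, beta in [("ridge", beta_ridge), ("lasso", beta_lasso), ("elastic net", beta_enet)]:
    print(name, "nonzero coefficients:", np.count_nonzero(beta))
```

Ridge shrinks every marker effect but keeps them all in the model, whereas LASSO sets many effects exactly to zero. Elastic Net sits between the two, which is one reason it is often tried when SNP markers are numerous and correlated.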