Analyzing DNA Sequences Using Clustering Algorithm

Data mining offers promising prospects for DNA sequence analysis through its concepts and techniques. This study applies an exploratory data analysis method to cluster DNA sequences. Feature vectors have been developed to map each DNA sequence to a twelve-dimensional vector. Lysozyme, M...

Bibliographic Details
Main Author:	Alhersh, Taha Talib Ragheb
Format:	Thesis
Language:	English
Published:	2009
Subjects:	QA76 Computer software
Online Access:	https://etd.uum.edu.my/1913/1/Taha_Taleb_Ragheb_Alhersh.pdf https://etd.uum.edu.my/1913/2/1.Taha_Taleb_Ragheb_Alhersh.pdf https://etd.uum.edu.my/1913/
Institution:	Universiti Utara Malaysia

id	my.uum.etd.1913
record_format	eprints
spelling	my.uum.etd.1913 2022-04-21T03:28:29Z https://etd.uum.edu.my/1913/ Alhersh, Taha Talib Ragheb (2009) Analyzing DNA Sequences Using Clustering Algorithm. Masters thesis, Universiti Utara Malaysia. NonPeerReviewed.
institution	Universiti Utara Malaysia
building	UUM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Utara Malaysia
content_source	UUM Electronic Theses
url_provider	http://etd.uum.edu.my/
language	English
topic	QA76 Computer software
description	Data mining offers promising prospects for DNA sequence analysis through its concepts and techniques. This study applies an exploratory data analysis method to cluster DNA sequences. Feature vectors have been developed to map each DNA sequence to a twelve-dimensional vector. The Lysozyme, Myoglobin and Rhodopsin protein families have been tested in this space. Comparisons among homologous sequences give close distances between their characterization vectors, which are easily distinguishable from those of non-homologous sequences, when the experiment uses a fixed DNA sequence size that does not exceed the length of the shortest DNA sequence. The main advantage of this work is the global comparison of multiple DNA sequences simultaneously in the genomic space, carried out by directly comparing the distances between the corresponding characteristic vectors. The novelty of this work is that a new DNA sequence need not be compared against the full length of the other DNA sequences; the comparison is restricted to a fixed-length portion of every sequence, chosen so that it does not exceed the length of the new DNA sequence. In other words, part of a DNA sequence can identify the functionality of the sequence and cluster it with its family members.
format	Thesis
author	Alhersh, Taha Talib Ragheb
title	Analyzing DNA Sequences Using Clustering Algorithm
publishDate	2009
url	https://etd.uum.edu.my/1913/1/Taha_Taleb_Ragheb_Alhersh.pdf https://etd.uum.edu.my/1913/2/1.Taha_Taleb_Ragheb_Alhersh.pdf https://etd.uum.edu.my/1913/
_version_	1731228102433964032
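
The fixed-length comparison described in the abstract can be illustrated with a minimal sketch. The record does not say which twelve features the thesis uses, so the choice below (count, mean position, and position variance for each of the four nucleotides) is an assumption made purely for illustration, as are the function names and the Euclidean distance. It is not the author's implementation.

```python
import math

NUCLEOTIDES = "ACGT"


def feature_vector(seq, length):
    """Map the first `length` bases of `seq` to a 12-dimensional vector.

    Assumed features (not taken from the thesis): for each nucleotide,
    its count, mean position, and position variance within the prefix.
    """
    prefix = seq[:length].upper()
    vec = []
    for base in NUCLEOTIDES:
        # 1-based positions at which this nucleotide occurs
        positions = [i + 1 for i, b in enumerate(prefix) if b == base]
        n = len(positions)
        mean = sum(positions) / n if n else 0.0
        var = sum((p - mean) ** 2 for p in positions) / n if n else 0.0
        vec.extend([float(n), mean, var])
    return vec


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(seqs):
    """Truncate every sequence to the shortest length, then compare vectors.

    `seqs` maps a sequence name to its DNA string. Returns a dict keyed by
    (name_a, name_b) pairs in sorted name order.
    """
    length = min(len(s) for s in seqs.values())
    vectors = {name: feature_vector(s, length) for name, s in seqs.items()}
    names = sorted(vectors)
    return {
        (a, b): euclidean(vectors[a], vectors[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    # Toy strings for demonstration only; these are not real gene sequences.
    toy = {"seq_a": "ACGTACGTAC", "seq_b": "ACGTACGA", "seq_c": "TTTTGGGG"}
    for pair, dist in pairwise_distances(toy).items():
        print(pair, round(dist, 3))
```

Under these assumptions, sequences whose truncated prefixes have similar composition and nucleotide placement end up close together, and the resulting distances could be passed to any standard clustering routine.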
