Filtering of Background DNA Sequences Improves DNA Motif Prediction Using Clustering Techniques
Main Authors: | , |
---|---|
Format: | E-Article |
Language: | English |
Published: | Elsevier, 2013 |
Subjects: | |
Online Access: | http://ir.unimas.my/id/eprint/11945/1/Filtering%20of%20background%20DNA_abstract.pdf http://ir.unimas.my/id/eprint/11945/ http://ac.els-cdn.com/S1877042813037245/1-s2.0-S1877042813037245-main.pdf?_tid=9ff50ec4-135b-11e6-b07e-00000aab0f26&acdnat=1462519672_d9a1dd367fa2434926676d8ad2649fd1 |
Institution: | Universiti Malaysia Sarawak |
Summary: | Noisy objects are known to degrade the performance of clustering algorithms. This paper addresses the high false positive rates that arise when a self-organizing map (SOM) is used for DNA motif prediction on input datasets containing noisy background sequences. We propose a sequence filter in the pre-processing step that removes a portion of the noisy background before the data are passed to the SOM. The method is motivated by the evolutionary conservation of binding sites, in contrast to the randomness of background sequences. Our contributions are: (a) the use of string mismatch as a filtering threshold function; and (b) two filtering methods, namely sequence-driven and gapped consensus pattern filtering. We evaluated the performance of SOM for DNA motif prediction on real datasets after the filtering process. The results show promising improvements in precision and in data reduction. We conclude that filtering background sequences is a feasible way to improve the accuracy of SOM-based DNA motif prediction. |
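The summary names string mismatch as the filtering threshold but does not give the function itself. The sketch below is one plausible reading, not the authors' sequence-driven or gapped consensus pattern method: a k-mer is kept only if a near match (Hamming distance at most `max_mismatch`) occurs in at least `min_support` other sequences, on the assumption that conserved binding sites recur across sequences while random background does not. The names `hamming`, `kmers`, `filter_kmers` and all parameters are hypothetical.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def filter_kmers(
    sequences: List[str], k: int, max_mismatch: int, min_support: int
) -> List[Tuple[int, str]]:
    """Keep (sequence index, k-mer) pairs that look conserved.

    Assumption (not from the paper): a k-mer is 'conserved' if at least
    min_support OTHER sequences contain a k-mer within max_mismatch
    mismatches of it. Everything else is treated as background.
    """
    all_kmers = [kmers(s, k) for s in sequences]
    kept = []
    for i, own in enumerate(all_kmers):
        for kmer in own:
            # Number of other sequences containing a near match.
            support = sum(
                any(hamming(kmer, other) <= max_mismatch for other in others)
                for j, others in enumerate(all_kmers)
                if j != i
            )
            if support >= min_support:
                kept.append((i, kmer))
    return kept


if __name__ == "__main__":
    seqs = ["ACGTACGT", "TTACGTTT", "GGGGGGGG"]
    for idx, kmer in filter_kmers(seqs, k=4, max_mismatch=0, min_support=1):
        print(idx, kmer)
```

Only the surviving k-mers would then be passed to the clustering step. This brute-force comparison is quadratic in the number of k-mers, so it is meant to illustrate the idea rather than to scale to real promoter datasets.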