Improved ENSPART for DNA Motif Prediction
In our previous work we proposed ENSPART, an ensemble method for DNA motif discovery that partitions the input dataset into several equal-size subsets, each of which is run through several distinct tools for candidate motif prediction. The candidate motifs obtained from the different data subsets are merged to obtain the final m...
Main Authors: | Choong, Allen Chieng Hoon; Lee, Nung Kion; Bong, Chih How; Norshafarina, Omar |
---|---|
Format: | E-Article |
Language: | English |
Published: | Universiti Malaysia Sarawak (UNIMAS), 2017 |
Subjects: | Q Science (General); T Technology (General) |
Online Access: | http://ir.unimas.my/id/eprint/19016/1/SCT-073-revised-deposit%20%28abstrak%29.pdf http://ir.unimas.my/id/eprint/19016/ http://www.ijbs.unimas.my/ |
Institution: | Universiti Malaysia Sarawak |
id |
my.unimas.ir.19016 |
---|---|
record_format |
eprints |
spelling |
my.unimas.ir.19016 2018-01-03T06:06:16Z http://ir.unimas.my/id/eprint/19016/ Improved ENSPART for DNA Motif Prediction. Choong, Allen Chieng Hoon; Lee, Nung Kion; Bong, Chih How; Norshafarina, Omar. Q Science (General); T Technology (General). In our previous work we proposed ENSPART, an ensemble method for DNA motif discovery that partitions the input dataset into several equal-size subsets, each of which is run through several distinct tools for candidate motif prediction. The candidate motifs obtained from the different data subsets are merged to obtain the final motifs. Nevertheless, the original ENSPART has several limitations: (1) the same background sequences are used to calculate the Receiver Operating Characteristic (ROC) of motifs obtained from different datasets, which causes bias because different datasets might have different background distributions; (2) it does not account for the duplication of a motif and its reverse complement, which leaves many redundant motifs in the result set that require filtering. In this work, we extend the original ENSPART to address these two issues. For the first issue, we employ background sequences based on the distribution of bases in the input sequences. For the second, we employ a "triple" merging strategy to reduce redundant motifs. Our evaluation results indicate that the two improvements obtain better AUC values than the original implementation. Universiti Malaysia Sarawak (UNIMAS) 2017-12 E-Article PeerReviewed text en http://ir.unimas.my/id/eprint/19016/1/SCT-073-revised-deposit%20%28abstrak%29.pdf Choong, Allen Chieng Hoon and Lee, Nung Kion and Bong, Chih How and Norshafarina, Omar (2017) Improved ENSPART for DNA Motif Prediction. International Journal of Business and Society, 18 (S4). pp. 1-6. ISSN 15116670 http://www.ijbs.unimas.my/ |
institution |
Universiti Malaysia Sarawak |
building |
Centre for Academic Information Services (CAIS) |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Sarawak |
content_source |
UNIMAS Institutional Repository |
url_provider |
http://ir.unimas.my/ |
language |
English |
topic |
Q Science (General); T Technology (General) |
description |
In our previous work we proposed ENSPART, an ensemble method for DNA motif discovery that partitions the input dataset into several equal-size subsets, each of which is run through several distinct tools for candidate motif prediction. The candidate motifs obtained from the different data subsets are merged to obtain the final motifs. Nevertheless, the original ENSPART has several limitations: (1) the same background sequences are used to calculate the Receiver Operating Characteristic (ROC) of motifs obtained from different datasets, which causes bias because different datasets might have different background distributions; (2) it does not account for the duplication of a motif and its reverse complement, which leaves many redundant motifs in the result set that require filtering. In this work, we extend the original ENSPART to address these two issues. For the first issue, we employ background sequences based on the distribution of bases in the input sequences. For the second, we employ a "triple" merging strategy to reduce redundant motifs. Our evaluation results indicate that the two improvements obtain better AUC values than the original implementation. |
format |
E-Article |
author |
Choong, Allen Chieng Hoon; Lee, Nung Kion; Bong, Chih How; Norshafarina, Omar |
author_sort |
Choong, Allen Chieng Hoon |
title |
Improved ENSPART for DNA Motif Prediction |
title_sort |
improved enspart for dna motif prediction |
publisher |
Universiti Malaysia Sarawak (UNIMAS) |
publishDate |
2017 |
url |
http://ir.unimas.my/id/eprint/19016/1/SCT-073-revised-deposit%20%28abstrak%29.pdf http://ir.unimas.my/id/eprint/19016/ http://www.ijbs.unimas.my/ |
_version_ |
1644512979187662848 |
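
The description above mentions three ideas that are easy to illustrate: splitting the input into equal-size subsets, drawing background sequences from the base distribution of the input, and treating a motif and its reverse complement as the same thing. The Python sketch below is only an illustration under stated assumptions. It is not the ENSPART implementation, and the record does not describe how the "triple" merging strategy works, so that step is not reproduced here. All function names (`partition`, `base_frequencies`, `sample_background`, `reverse_complement`, `canonical`) are hypothetical, and the background model assumes independent bases, which may be simpler than what the authors used.

```python
import random
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def partition(sequences, k):
    """Split sequences into k subsets whose sizes differ by at most one
    (round-robin assignment; an assumption, not the authors' scheme)."""
    return [sequences[i::k] for i in range(k)]


def base_frequencies(sequences):
    """Estimate the A/C/G/T frequencies of the input sequences."""
    counts = Counter(b for seq in sequences for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found in the input")
    return {b: counts[b] / total for b in "ACGT"}


def sample_background(sequences, n, length, seed=0):
    """Draw n background sequences whose bases follow the input distribution.
    Assumes independent positions (a zero-order model)."""
    freqs = base_frequencies(sequences)
    rng = random.Random(seed)
    bases = list(freqs)
    weights = [freqs[b] for b in bases]
    return ["".join(rng.choices(bases, weights=weights, k=length)) for _ in range(n)]


def reverse_complement(site):
    """Return the reverse complement of a DNA string."""
    return site.upper().translate(_COMPLEMENT)[::-1]


def canonical(site):
    """Pick one representative for a site and its reverse complement,
    so that both strands map to the same key."""
    s = site.upper()
    return min(s, reverse_complement(s))
```

With helpers like these, candidate sites reported on opposite strands collapse to one key, for example `canonical("CGTT") == canonical("AACG")`, which is one simple way to spot the redundancy the description refers to.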