Motif search using Gibbs sampling: Notes on effectiveness in a distributed environment

© 2019 IEEE. Motif search is a common problem in bioinformatics where unique DNA sequences (motifs) of a specific length inscribed in long strands signify binding sites for transcription factors. In this paper, we present some important notes on the implementation of motif search using Gibbs samplin...

Bibliographic Details
Main Authors: Imperial, Joseph Marvin R.; Gail Ya-On, Czeritonnie; Cu, Gregory G.
Format: text
Published: Animo Repository 2019
Online Access: https://animorepository.dlsu.edu.ph/faculty_research/947
https://animorepository.dlsu.edu.ph/context/faculty_research/article/1946/type/native/viewcontent
Institution: De La Salle University
id oai:animorepository.dlsu.edu.ph:faculty_research-1946
record_format eprints
spelling oai:animorepository.dlsu.edu.ph:faculty_research-1946 2022-07-21T00:34:42Z Motif search using Gibbs sampling: Notes on effectiveness in a distributed environment Imperial, Joseph Marvin R. Gail Ya-On, Czeritonnie Cu, Gregory G. 2019-11-01T07:00:00Z text text/html https://animorepository.dlsu.edu.ph/faculty_research/947 https://animorepository.dlsu.edu.ph/context/faculty_research/article/1946/type/native/viewcontent Faculty Research Work Animo Repository
institution De La Salle University
building De La Salle University Library
continent Asia
country Philippines
content_provider De La Salle University Library
collection DLSU Institutional Repository
description © 2019 IEEE. Motif search is a common problem in bioinformatics in which unique DNA sequences (motifs) of a specific length, embedded in long strands, signify binding sites for transcription factors. In this paper, we present some important notes on implementing motif search with the Gibbs sampling algorithm in a distributed computing environment, based on an analysis of visualizations of the speed and motif scoring of various distributed implementations. For the DNA sequence data, we used open-source mouse genome fragments with lengths of 250, 500, and 1000. We built upon previous studies (Perera and Ragel, 2013; Chen and Jiang, 2006) by distributing the motif search workloads (jobs) across 16 CPU cores on 2 computer nodes instead of the traditional way of parallelizing on a single computing device with multicore CPUs. Results show that saving the DNA sequences in a list and passing that list as a parameter argument obtained the fastest execution time compared with implementations that send file dependencies or use in-memory processing.
format text
author Imperial, Joseph Marvin R.
Gail Ya-On, Czeritonnie
Cu, Gregory G.
title Motif search using Gibbs sampling: Notes on effectiveness in a distributed environment
publisher Animo Repository
publishDate 2019
url https://animorepository.dlsu.edu.ph/faculty_research/947
https://animorepository.dlsu.edu.ph/context/faculty_research/article/1946/type/native/viewcontent
_version_ 1740844621904216064
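
Illustrative sketch (not part of the record). The description above refers to motif search with Gibbs sampling and to passing the DNA sequences to each job as a list argument. This record does not include the authors' code, so the following is only a generic, textbook-style Gibbs sampler written under stated assumptions: sequences contain only the uppercase letters A, C, G, and T, motifs are scored by mismatches to the column-wise consensus, and all function names (gibbs_motif_search, profile_from, sample_kmer, score) are hypothetical. The sequences arrive as a plain list parameter, which is one way a job function could receive its data; no particular distributed framework is assumed or shown.

```python
import random
from collections import Counter

BASES = "ACGT"  # assumption: input uses only these uppercase letters


def profile_from(motifs, k):
    """Column-wise base probabilities with +1 pseudocounts."""
    profile = []
    for col in range(k):
        counts = Counter(m[col] for m in motifs)
        total = len(motifs) + len(BASES)
        profile.append({b: (counts[b] + 1) / total for b in BASES})
    return profile


def score(motifs):
    """Total mismatches to the column-wise consensus; lower is better."""
    total = 0
    for col in range(len(motifs[0])):
        counts = Counter(m[col] for m in motifs)
        total += len(motifs) - max(counts.values())
    return total


def sample_kmer(seq, k, profile, rng):
    """Draw one k-mer from seq, weighted by its likelihood under the profile."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    weights = []
    for kmer in kmers:
        w = 1.0
        for col, base in enumerate(kmer):
            w *= profile[col][base]
        weights.append(w)
    return rng.choices(kmers, weights=weights, k=1)[0]


def gibbs_motif_search(sequences, k, iterations=200, seed=0):
    """Return (best_motifs, best_score) for a list of DNA strings."""
    rng = random.Random(seed)
    # Start from one randomly placed k-mer per sequence.
    motifs = []
    for seq in sequences:
        start = rng.randrange(len(seq) - k + 1)
        motifs.append(seq[start:start + k])
    best, best_score = list(motifs), score(motifs)
    for _ in range(iterations):
        # Hold out one sequence, build a profile from the rest, resample it.
        i = rng.randrange(len(sequences))
        profile = profile_from(motifs[:i] + motifs[i + 1:], k)
        motifs[i] = sample_kmer(sequences[i], k, profile, rng)
        current = score(motifs)
        if current < best_score:
            best, best_score = list(motifs), current
    return best, best_score


# Toy input (made up for illustration, not the mouse genome data).
toy_sequences = ["TTACCTTAAC", "GATGTCTGTC", "ACGGCGTTAG", "CCCTAACGAG", "CGTCAGAGGT"]
toy_motifs, toy_score = gibbs_motif_search(toy_sequences, k=4, iterations=100, seed=1)
```

Because each call depends only on its arguments and a seed, independent restarts with different seeds could in principle be dispatched as separate jobs, each receiving the same sequence list; how the paper's implementations actually did this is not stated in the record.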