Visualization and sharing of genomics data via a cloud based system
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2015 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/62704 |
Institution: | Nanyang Technological University |
Summary: | This Final Year Project, Visualization and Sharing of Genomics Data via a Cloud Based System, documents the relationships between Cloud Computing, Next-Generation Sequencing (NGS), Galaxy, Integrated Genome Browser (IGB) and UCSC Genome Browser. Because of the vast amount of genomics data produced by NGS, Galaxy (an open source, web-based platform for data-intensive biomedical research) adopted Cloud Computing as a potential way to address the storage, processing and sharing of data. The report includes a detailed guide covering the depositing of data, the installation of Galaxy and the hosting of Galaxy, with configurations and recommendations attached. Galaxy no longer supports a distribution for the Windows platform, so Ubuntu (a community-developed, GNU/Linux-based free and open source operating system) was adopted as the development platform instead. Development on Galaxy was also made possible by the API key generated by Galaxy, which lets users perform analysis from a terminal. Galaxy was then migrated to the existing Cloud infrastructure of the School of Computer Engineering (SCE), Nanyang Technological University (NTU), where users could take advantage of its high availability, performance and scalable computing resources. Benchmarking was performed on a single workstation and on the NTU-SCE Cloud services, and the results show that the latter significantly outperformed the former. External web applications such as UCSC Genome Browser and Integrated Genome Browser (IGB) were also introduced to enhance users’ experience in performing data analysis. Three recommendations for hosting Galaxy on the Cloud lead to the conclusion that higher performance and availability come at a great financial cost. |
---|