Visualization and sharing of genomics data via a cloud based system

Bibliographic Details
Main Author: Chen, Guohao
Other Authors: Zheng Jie
Format: Final Year Project
Language: English
Published: 2015
Subjects:
Online Access: http://hdl.handle.net/10356/62704
Description
Summary: This Final Year Project, Visualization and Sharing of Genomics Data via a Cloud Based System, documents the relationships between cloud computing, next-generation sequencing (NGS), Galaxy, the Integrated Genome Browser (IGB) and the UCSC Genome Browser. Because NGS produces vast amounts of genomics data, Galaxy (an open-source, web-based platform for data-intensive biomedical research) adopted cloud computing as a potential way to address the storage, processing and sharing of that data. The report includes a detailed guide covering depositing data, installing Galaxy and hosting Galaxy, with configurations and recommendations attached. Because Galaxy no longer supported a Windows distribution, Ubuntu (a community-developed, GNU/Linux-based free and open-source operating system) was adopted instead as the Linux development platform. Development against Galaxy was also made possible by the API key that Galaxy generates, which lets users run analyses from a terminal instead of the web interface. Galaxy was then migrated to the existing cloud infrastructure of the School of Computer Engineering, Nanyang Technological University (NTU-SCE), where users could take advantage of its high availability, performance and scalable computing resources. Benchmarks were run on both a single workstation and the NTU-SCE cloud services, and the results show that the latter significantly outperformed the former. External web applications such as the UCSC Genome Browser and IGB were also introduced to enhance users' experience in performing data analysis. The report closes with three recommendations for hosting Galaxy on the cloud and concludes that higher performance and availability come at a substantial financial cost.
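
The summary mentions using a Galaxy-generated API key to work from a terminal. This record does not include the project's own scripts, so the following is only a minimal sketch of what such a call might look like. It assumes Galaxy's REST API accepts the key as a `key` query parameter on the `/api/histories` endpoint; the function name `build_histories_request`, the host `http://localhost:8080` and the key value are hypothetical placeholders. The request is built but not sent, so the sketch runs without a Galaxy server.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_histories_request(base_url: str, api_key: str) -> Request:
    """Build (but do not send) a GET request listing a user's Galaxy histories.

    Assumption: the server exposes /api/histories and reads the API key
    from the 'key' query parameter.
    """
    # Normalise the base URL so urljoin appends the API path instead of
    # replacing the last path segment.
    endpoint = urljoin(base_url.rstrip("/") + "/", "api/histories")
    query = urlencode({"key": api_key})
    return Request(f"{endpoint}?{query}", method="GET")


if __name__ == "__main__":
    # Placeholder host and key; substitute the values for a real Galaxy instance.
    req = build_histories_request("http://localhost:8080", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

To actually send the request against a running server, the returned object could be passed to `urllib.request.urlopen` and the JSON response decoded with the `json` module.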