Building a database of cancer genomics data

Cancer is known to be a genetic disease. Because of its inherent complexity, large-scale genomics datasets are required to study it. The purpose of this project is to download publicly available cancer data (including DNA sequences, gene expression, and clinical tr...

Bibliographic Details
Main Author:	Chern, Shan Ni
Other Authors:	Zheng Jie
Format:	Final Year Project
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Computer science and engineering::Information systems
Online Access:	http://hdl.handle.net/10356/62848
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-62848
record_format	dspace
spelling	sg-ntu-dr.10356-62848 2023-03-03T20:32:09Z Building a database of cancer genomics data Chern, Shan Ni Zheng Jie School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems Bachelor of Engineering (Computer Science) 2015-04-30T02:38:38Z 2015-04-30T02:38:38Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/62848 en Nanyang Technological University 62 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Information systems
description	Cancer is known to be a genetic disease. Because of its inherent complexity, large-scale genomics datasets are required to study it. The purpose of this project is to download publicly available cancer data (including DNA sequences, gene expression, and clinical trial data) from TCGA (The Cancer Genome Atlas), with RNASeqV2 data used to determine gene expression levels, and to integrate the data into a NoSQL database. The motivation for this project was to provide a centralized database that speeds up cancer research and data analysis, and improves their accuracy, by making large-scale genomics datasets usable. Programming interfaces were developed to ease access to and maintenance of the database. Along with the interfaces, different functions were developed for the database administrator and the system user. The database administrator can insert, edit, and delete datasets in the database. To insert a dataset, the administrator selects a file to upload. To edit or delete data, the administrator selects or searches for a particular record in the database. The system user can run queries such as searching for a particular gene’s information or for the average scaled estimate of a particular gene. The development tasks of this project closely follow the iterative Software Development Lifecycle (SDLC), which is presented in this report. The developed database can be further developed and enhanced to better serve its users. Future work could include building a larger database and enabling it to draw out relationships between the genes found in different cancers.
author2	Zheng Jie
author_facet	Zheng Jie Chern, Shan Ni
format	Final Year Project
author	Chern, Shan Ni
author_sort	Chern, Shan Ni
title	Building a database of cancer genomics data
publishDate	2015
url	http://hdl.handle.net/10356/62848
_version_	1759853762172682240

Similar Items