Analysis of aneuploidy spectrum from whole-genome sequencing provides rapid assessment of clonal variation within established cancer cell lines

BACKGROUND: The revolution in next-generation sequencing (NGS) technology has allowed easy access and sharing of high-throughput sequencing datasets of cancer cell lines and their integrative analyses. However, long-term passaging and culture conditions introduce high levels of genomic and phenotypi...

Bibliographic Details
Main Authors: Ahmed Ibrahim Samir Khalil, Chattopadhyay, Anupam, Sanyal, Amartya
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
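Subjects: Engineering::Computer science and engineering; Science::Biological sciences; Aneuploidy Spectrum; Cancer Cell Lines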
Online Access: https://hdl.handle.net/10356/153768
Institution: Nanyang Technological University
id sg-ntu-dr.10356-153768
record_format dspace
spelling sg-ntu-dr.10356-153768 2023-02-28T17:11:36Z
department School of Computer Science and Engineering
School of Biological Sciences
funder Ministry of Education (MOE)
Nanyang Technological University
version Published version
funding This work was supported by the Nanyang Technological University’s Nanyang Assistant Professorship grant and Singapore Ministry of Education Academic Research Fund Tier 1 grant (RG39/18) to AS. AC is supported by the Nanyang Technological University start-up grant.
date 2022-06-01T05:29:28Z
date_issued 2021
type Journal Article
citation Ahmed Ibrahim Samir Khalil, Chattopadhyay, A. & Sanyal, A. (2021). Analysis of aneuploidy spectrum from whole-genome sequencing provides rapid assessment of clonal variation within established cancer cell lines. Cancer Informatics, 20, 1-9. https://dx.doi.org/10.1177/11769351211049236
issn 1176-9351
handle https://hdl.handle.net/10356/153768
doi 10.1177/11769351211049236
volume 20
pages 1-9
language_iso en
grant RG39/18
journal Cancer Informatics
rights © 2021 The Author(s). This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
mime_type application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Science::Biological sciences
Aneuploidy Spectrum
Cancer Cell Lines
description BACKGROUND: The revolution in next-generation sequencing (NGS) technology has allowed easy access to, and sharing of, high-throughput sequencing datasets of cancer cell lines and their integrative analyses. However, long-term passaging and culture conditions introduce high levels of genomic and phenotypic diversity in established cell lines, resulting in strain differences. Thus, clonal variation in cultured cell lines with respect to the reference standard is a major barrier in systems biology data analyses. Therefore, there is a pressing need for a fast, entry-level assessment of clonal variations within cell lines using their high-throughput sequencing data.
RESULTS: We developed AStra, a Python-based tool for de novo estimation of genome-wide segmental aneuploidy, to measure and visually interpret strain-level similarities or differences of cancer cell lines from whole-genome sequencing (WGS). We demonstrated that the aneuploidy spectrum can capture the genetic variations in 27 strains of the MCF7 breast cancer cell line collected from different laboratories. Performance evaluation of AStra using several cancer sequencing datasets revealed that cancer cell lines exhibit distinct aneuploidy spectra which reflect their previously reported karyotypic observations. Similarly, AStra successfully identified large-scale DNA copy number variations (CNVs) artificially introduced in simulated WGS datasets.
CONCLUSIONS: AStra provides an analytical and visualization platform for rapid and easy comparison between different strains or between cell lines based on their aneuploidy spectra, using only the raw BAM files representing mapped reads. We recommend AStra for rapid first-pass quality assessment of cancer cell lines before integrating scientific datasets that employ deep sequencing. AStra is open-source software and is available at https://github.com/AISKhalil/AStra.
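The idea of comparing strains by their aneuploidy spectra can be illustrated with a minimal sketch. This is not AStra's algorithm, which is documented in the paper and its repository. It assumes that per-bin read depths have already been extracted from a BAM file, that the median bin sits at a baseline ploidy of 2, and that two spectra are compared by total variation distance. The function names, the toy depths, and the distance measure are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import median


def copy_number_states(bin_depths, baseline_ploidy=2):
    """Scale depths so the median bin equals the baseline ploidy, then round.

    Assumption for illustration: the median bin represents the baseline ploidy.
    """
    med = median(bin_depths)
    if med <= 0:
        raise ValueError("median depth must be positive")
    return [round(baseline_ploidy * d / med) for d in bin_depths]


def aneuploidy_spectrum(states):
    """Return the fraction of bins assigned to each integer copy-number state."""
    counts = Counter(states)
    total = len(states)
    return {cn: n / total for cn, n in sorted(counts.items())}


def spectrum_distance(spec_a, spec_b):
    """Total variation distance between two spectra (0 = identical, 1 = disjoint)."""
    keys = set(spec_a) | set(spec_b)
    return 0.5 * sum(abs(spec_a.get(k, 0.0) - spec_b.get(k, 0.0)) for k in keys)


# Toy per-bin read depths for two hypothetical strains (not real data).
strain_a = [30, 30, 30, 30, 45, 45, 30, 15]
strain_b = [30, 30, 30, 30, 30, 30, 30, 15]

spec_a = aneuploidy_spectrum(copy_number_states(strain_a))
spec_b = aneuploidy_spectrum(copy_number_states(strain_b))

print("strain A spectrum:", spec_a)
print("strain B spectrum:", spec_b)
print("distance:", spectrum_distance(spec_a, spec_b))
```

In this toy case, strain A carries a gained segment that strain B lacks, so the two spectra differ in the fraction of bins at copy numbers 2 and 3. A real analysis would also need segmentation and correction for coverage biases, which this sketch omits.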
author2 School of Computer Science and Engineering
format Article
author Ahmed Ibrahim Samir Khalil
Chattopadhyay, Anupam
Sanyal, Amartya
author_sort Ahmed Ibrahim Samir Khalil
title Analysis of aneuploidy spectrum from whole-genome sequencing provides rapid assessment of clonal variation within established cancer cell lines
publishDate 2022
url https://hdl.handle.net/10356/153768
_version_ 1759855890017550336