Stable bulged G-quadruplexes in the human genome: identification, experimental validation and functionalization

DNA sequence composition determines the topology and stability of G-quadruplexes (G4s). Bulged G-quadruplex structures (G4-Bs) are a subset of G4s characterized by 3D conformations with bulges. Current search algorithms fail to capture stable G4-B, making their genome-wide study infeasible. Here, we...

Bibliographic Details
Main Authors: Papp, Csaba; Mukundan, Vineeth Thachappilly; Jenjaroenpun, Piroon; Winnerdy, Fernaldo Richtia; Ow, Ghim Siong; Phan, Anh Tuân; Kuznetsov, Vladimir A.
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2023
Subjects: Science::Biological sciences; Human Genome; Nucleotide Sequence
Online Access: https://hdl.handle.net/10356/169159
Institution: Nanyang Technological University
id sg-ntu-dr.10356-169159
record_format dspace
spelling sg-ntu-dr.10356-169159 2023-07-10T15:34:32Z
Affiliations: School of Physical and Mathematical Sciences; NTU Institute of Structural Biology
Sponsors: Agency for Science, Technology and Research (A*STAR); Nanyang Technological University
Version: Published version
Funding: Bioinformatics Institute, Biomedical Institutes/A-STAR, Singapore (in part); V.A.K. was supported by a SUNY EMPIRE innovation program scholar grant; Upstate Medical University Cancer Center grant; Upstate Foundation Turn4ACure Fund; research in A.T.P. lab was supported by Nanyang Technological University Singapore; P.J. was supported by the Office of the Permanent Secretary, Ministry of Higher Education, Science, Research and Innovation (OPS MHESI); Thailand Science Research and Innovation (TSRI) [RGNS 64-161]. Funding for open access charge: SUNY EMPIRE innovation program scholar grant, the Upstate Medical University Cancer Center grant; Upstate Foundation Turn4ACure Fund.
Record dates: 2023-07-04T03:06:32Z; 2023-07-04T03:06:32Z; 2023
Type: Journal Article
Citation: Papp, C., Mukundan, V. T., Jenjaroenpun, P., Winnerdy, F. R., Ow, G. S., Phan, A. T. & Kuznetsov, V. A. (2023). Stable bulged G-quadruplexes in the human genome: identification, experimental validation and functionalization. Nucleic Acids Research, 51(9), 4148-4177. https://dx.doi.org/10.1093/nar/gkad252
Journal: Nucleic Acids Research, volume 51, issue 9, pages 4148-4177 (ISSN 0305-1048)
Identifiers: https://hdl.handle.net/10356/169159; 10.1093/nar/gkad252; 37094040; 2-s2.0-85159779532
Language code: en
Rights: © The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Biological sciences
Human Genome
Nucleotide Sequence
description DNA sequence composition determines the topology and stability of G-quadruplexes (G4s). Bulged G-quadruplex structures (G4-Bs) are a subset of G4s characterized by 3D conformations with bulges. Current search algorithms fail to capture stable G4-B, making their genome-wide study infeasible. Here, we introduced a large family of computationally defined and experimentally verified potential G4-B forming sequences (pG4-BS). We found 478 263 pG4-BS regions that do not overlap 'canonical' G4-forming sequences in the human genome and are preferentially localized in transcription regulatory regions including R-loops and open chromatin. Over 90% of protein-coding genes contain pG4-BS in their promoter or gene body. We observed generally higher pG4-BS content in R-loops and their flanks, and in longer genes that are associated with brain tissue, immune and developmental processes. Also, the presence of pG4-BS on both template and non-template strands in promoters is associated with oncogenesis, cardiovascular disease and stemness. Our G4-BS models predicted G4-forming ability in vitro with 91.5% accuracy. Analysis of G4-seq and CUT&Tag data strongly supports the existence of G4-BS conformations genome-wide. We reconstructed a novel G4-B 3D structure located in the E2F8 promoter. This study defines a large family of G4-like sequences, offering new insights into the essential biological functions and potential future therapeutic uses of G4-B.
author2 School of Physical and Mathematical Sciences
format Article
author Papp, Csaba
Mukundan, Vineeth Thachappilly
Jenjaroenpun, Piroon
Winnerdy, Fernaldo Richtia
Ow, Ghim Siong
Phan, Anh Tuân
Kuznetsov, Vladimir A.
author_sort Papp, Csaba
title Stable bulged G-quadruplexes in the human genome: identification, experimental validation and functionalization
publishDate 2023
url https://hdl.handle.net/10356/169159
_version_ 1772827925757820928