Noncanonical registers and base pairs in human 5′ splice-site selection

Bibliographic Details
Main Authors: Tan, Jiazi, Ho, Jessie Jia Xin, Zhong, Zhensheng, Luo, Shufang, Chen, Gang, Roca, Xavier
Other Authors: School of Biological Sciences
Format: Article
Language: English
Published: 2018
Subjects:
RNA
Online Access: https://hdl.handle.net/10356/89052
http://hdl.handle.net/10220/46040
Institution: Nanyang Technological University
Description
Summary: Accurate recognition of splice sites is essential for pre-messenger RNA splicing. Mammalian 5′ splice sites are mainly recognized by canonical base-pairing to the 5′ end of U1 small nuclear RNA, yet we have described multiple noncanonical base-pairing registers formed by shifting base-pair positions or allowing one-nucleotide bulges. By systematic mutational and suppressor U1 analyses, we prove three registers involving asymmetric loops and show that bulges of two nucleotides, but not longer ones, can form in this context. Importantly, we established that a noncanonical uridine-pseudouridine interaction in the 5′ splice site/U1 helix contributes to the recognition of certain 5′ splice sites. Thermal melting experiments support the formation of noncanonical registers and uridine-pseudouridine interactions. Overall, we experimentally validated or discarded the majority of predicted noncanonical registers, to derive a list of 5′ splice sites using such alternative mechanisms that differs substantially from the original. This study enables not only a mechanistic understanding of the recognition of a wide diversity of mammalian 5′ splice sites, but also the future development of better splice-site scoring methods that reliably predict the effects of disease-causing mutations at these sequences.
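
As a rough illustration of what a base-pairing "register" means here, the sketch below counts Watson-Crick pairs between a 5′ splice site and the 5′ end of U1 small nuclear RNA, first in the canonical alignment and then with the alignment shifted by whole nucleotides. It is not the authors' method and makes several simplifying assumptions: pseudouridines are written as U, only A-U and G-C pairs are counted, and bulges, asymmetric loops, wobble pairs, and thermodynamics are ignored. The function names and the `shift` parameter are hypothetical.

```python
# Assumption: 5' end of human U1 snRNA, written 5'->3', with the two
# pseudouridines represented as plain U for simplicity.
U1_5P_END = "AUACUUACCUG"

# Assumption: only Watson-Crick pairs count; G-U wobble and U-pseudouridine
# interactions are deliberately left out of this toy model.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_pairs(splice_site: str, shift: int = 0) -> int:
    """Count Watson-Crick pairs between a 5' splice site (5'->3', positions
    -3 to +8) and U1 in antiparallel orientation.

    shift=0 is the canonical register; other values slide the splice site
    along U1 by that many nucleotides (hypothetical convention).
    """
    u1_antiparallel = U1_5P_END[::-1]  # read 3'->5' so index i faces index i
    pairs = 0
    for i, base in enumerate(splice_site):
        j = i + shift
        if 0 <= j < len(u1_antiparallel) and (base, u1_antiparallel[j]) in WATSON_CRICK:
            pairs += 1
    return pairs


def best_register(splice_site: str, shifts=range(-2, 3)):
    """Return (shift, pair_count) for the register with the most pairs."""
    return max(((s, count_pairs(splice_site, s)) for s in shifts),
               key=lambda item: item[1])


if __name__ == "__main__":
    consensus = "CAGGUAAGUAU"  # fully complementary 5' splice site
    print(count_pairs(consensus, 0))  # canonical register
    print(count_pairs(consensus, 1))  # shifted by one nucleotide
    print(best_register(consensus))
```

A pair count of this kind is only a crude proxy for helix stability; the study summarized above relied on mutational analyses, suppressor U1 experiments, and thermal melting rather than on counting alone.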