Characterization and prediction of B-cell epitopes

A B-cell epitope is a set of antigen surface residues which can be recognized by an antibody. Identifying epitopes facilitates the understanding of the basic recognition mechanism of immune responses, which in turn guides disease diagnosis, vaccine design and drug development. However, the identific...

Bibliographic Details
Main Author: Liang, Zhao
Other Authors: Hoi Chu Hong
Format: Theses and Dissertations
Language: English
Published: 2013
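Subjects: DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences; DRNTU::Science::Biological sciences::Biomathematics
Online Access: https://hdl.handle.net/10356/54752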
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54752
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
DRNTU::Science::Biological sciences::Biomathematics
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
DRNTU::Science::Biological sciences::Biomathematics
Liang, Zhao
Characterization and prediction of B-cell epitopes
description A B-cell epitope is a set of antigen surface residues that can be recognized by an antibody. Identifying epitopes facilitates the understanding of the basic recognition mechanism of immune responses, which in turn guides disease diagnosis, vaccine design and drug development. However, the identification of epitopes is challenging because of the complicated nature of antigen-antibody interactions and the context-awareness principle behind them. Context-awareness highlights the existence of multiple epitopes in an antigen and the reconfiguration of epitope residues when the antigen interacts with a different antibody. A coarse binary classification of antigen regions into epitopes and non-epitopes, without specifying antibodies, may not accurately reflect this biological reality. To detect the epitopes of an antigen accurately and rationally, a comprehensive analysis of multiple epitopes is carried out first. This is followed by antibody-specific epitope prediction in line with the principle of context-awareness, and by antibody-agnostic epitope prediction, which can predict one or multiple epitopes consistent with context-dependence. A multi-interface domain is one that can form multiple distinctive binding sites to contact many other domains, acting as a hub in domain-domain interaction networks. Graph theory and algorithms are applied to multi-interface domains retrieved from the PDB to discover fingerprints of interfaces, explore relations between interfaces, and establish associations between interfaces and their functions. Experimental results show that about 40% of proteins have multiple interfaces; however, the multi-interface domains involved account for only a tiny fraction (1.8%) of the total number of domains. The interfaces of these multi-interface domains are distinguishable by their fingerprints, indicating the functional specificity of the multiple interfaces in a domain.
Furthermore, both cooperative and distinctive structural patterns are observed in the interfaces of multi-interface domains. Because multiple interfaces exist in antigens and these interfaces associate with different antibodies, a two-dimensional association-based model is established to predict antibody-specific epitopes. The two kinds of associations that reveal context-awareness are: (i) residue-residue pairing preference, and (ii) the dependence between sets of contacting residue pairs. Preference plays a bridging role in linking interacting paratope and epitope residues, while dependence is used to infer new interacting residue pairs. Experiments conducted on a non-redundant data set of 80 antibody-antigen structural complexes show that the proposed model yields good performance in antibody-specific epitope prediction. In addition, although the model is trained on antigen-antibody structural complexes, it predicts antibody-specific epitopes from antigen-antibody sequences, which indicates its broad applicability in epitope prediction. The two-dimensional association can capture the context-awareness of interacting paratope-epitope complexes, but it cannot cover the contacts within a paratope or within an epitope. Thus, a new concept, the coupling graph, is introduced to include both inter-protein contacts between a paratope and an epitope and intra-protein contacts within a paratope or an epitope. The coupling graph is a two-layered graph in which each node in one layer connects with nodes in the other layer. The coupling graph represents the context-awareness principle well; however, it is very challenging to mine the frequent coupling subgraphs that are used to reveal context-awareness. Therefore, a new algorithm for coupling graph mining, based on graph transformation, has been designed.
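The two ideas introduced above, residue pairing preference and the coupling graph, can be illustrated with a minimal Python sketch. It is not the thesis implementation. The log-odds scoring, the class layout, and all names (`pairing_preference`, `CouplingGraph`, the residue labels) are assumptions chosen for illustration only.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import log


def pairing_preference(contacts):
    """Score each (paratope type, epitope type) pair by log-odds.

    `contacts` is a list of (paratope_residue_type, epitope_residue_type)
    tuples. The log-odds form is an illustrative assumption, not the
    definition used in the thesis.
    """
    pair_counts = Counter(contacts)
    para_counts = Counter(p for p, _ in contacts)
    epi_counts = Counter(e for _, e in contacts)
    total = len(contacts)
    scores = {}
    for (p, e), observed in pair_counts.items():
        # Count expected if paratope and epitope types paired independently.
        expected = para_counts[p] * epi_counts[e] / total
        scores[(p, e)] = log(observed / expected)
    return scores


@dataclass
class CouplingGraph:
    """Two-layer graph: intra-layer contacts plus paratope-epitope couplings."""

    paratope_edges: set = field(default_factory=set)  # contacts within the paratope
    epitope_edges: set = field(default_factory=set)   # contacts within the epitope
    coupling_edges: set = field(default_factory=set)  # paratope-epitope contacts

    def add_intra(self, layer, u, v):
        # Intra-layer edges are undirected, so store them as frozensets.
        edges = self.paratope_edges if layer == "paratope" else self.epitope_edges
        edges.add(frozenset((u, v)))

    def add_coupling(self, paratope_node, epitope_node):
        self.coupling_edges.add((paratope_node, epitope_node))

    def partners(self, paratope_node):
        """Epitope residues in contact with the given paratope residue."""
        return {e for p, e in self.coupling_edges if p == paratope_node}


# Hypothetical contact list and residue labels, for illustration only.
contacts = [("TYR", "ASP"), ("TYR", "ASP"), ("SER", "LYS"), ("TYR", "LYS")]
scores = pairing_preference(contacts)

graph = CouplingGraph()
graph.add_intra("paratope", "H:TYR33", "H:SER52")
graph.add_coupling("H:TYR33", "A:ASP101")
graph.add_coupling("H:TYR33", "A:LYS96")
```

A positive score marks a residue-type pair seen more often than independent pairing would suggest. Keeping the three edge sets separate makes the distinction between intra-protein and inter-protein contacts explicit, which is the property the coupling graph adds over the association-based model.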
Experiments show that the new algorithm significantly reduces the time cost and memory consumption of coupling graph mining, and that its application to antibody-specific epitope prediction outperforms the association-based model. A novel graph-based, antibody-agnostic epitope prediction model is built to predict one or multiple epitopes of an antigen. It overcomes the problem that existing models predict all the antigenic residues of an antigen as a single epitope, although these residues may belong to entirely different epitopes. The model divides an antigen surface graph into subgraphs using the Markov Clustering algorithm, and a classifier is then constructed to distinguish these subgraphs as epitopes or non-epitopes. The classifier is then used to predict epitopes for a test antigen. On a large data set of 92 antigen-antibody PDB complexes, the proposed method significantly outperforms state-of-the-art epitope prediction methods, achieving a 24.7% higher averaged F-score than the best existing model. In particular, the model performs equally well on protrusive epitopes and planar epitopes, which existing models hardly address. Furthermore, it can detect multiple epitopes whenever they exist. In summary, we have comprehensively analyzed the properties of multi-interface domains from both structural and functional perspectives, and have built both antibody-specific and antibody-agnostic epitope prediction models that are consistent with the principle of context-awareness. We observe that multi-interface proteins are ubiquitous, which supports the principle of context-awareness. Both the association-based and the coupling graph-based antibody-specific epitope prediction models are effective, and the graph-based antibody-agnostic epitope prediction model significantly improves prediction performance by identifying one or multiple epitopes.
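The partitioning step and the evaluation metric mentioned in the abstract can be sketched as follows. This is a minimal pure-Python version of the standard Markov Clustering loop (expansion followed by inflation) and of the F-score, written under stated assumptions: the toy graph, the inflation value of 2.0, and the function names are hypothetical, the abstract does not say how the antigen surface graph is built, and the classifier stage is not shown.

```python
def normalize_columns(m):
    """Rescale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iter=50, tol=1e-9):
    """Partition a graph, given as a 0/1 adjacency matrix, with basic MCL."""
    n = len(adjacency)
    # Self-loops keep every column sum positive and stabilize the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: two-step random walk
        inflated = normalize_columns(
            [[x ** inflation for x in row] for row in expanded]
        )  # inflation: strengthen strong flows, weaken weak ones
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Simplification: assign each node to the row holding most of its flow,
    # which yields a hard partition (overlapping clusters are not reported).
    clusters = {}
    for j in range(n):
        attractor = max(range(n), key=lambda i: m[i][j])
        clusters.setdefault(attractor, []).append(j)
    return sorted(clusters.values())


def f_score(predicted, actual):
    """Harmonic mean of precision and recall over two residue sets."""
    predicted, actual = set(predicted), set(actual)
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy surface graph (hypothetical): two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adjacency = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adjacency[u][v] = adjacency[v][u] = 1

patches = markov_cluster(adjacency)
score = f_score({1, 2, 3}, {2, 3, 4})
```

On the toy graph the two triangles end up as separate patches, which mirrors how a surface graph could be divided into candidate subgraphs before classification. A higher inflation value generally produces smaller, more numerous clusters.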
author2 Hoi Chu Hong
author_facet Hoi Chu Hong
Liang, Zhao
format Theses and Dissertations
author Liang, Zhao
author_sort Liang, Zhao
title Characterization and prediction of B-cell epitopes
title_short Characterization and prediction of B-cell epitopes
title_full Characterization and prediction of B-cell epitopes
title_fullStr Characterization and prediction of B-cell epitopes
title_full_unstemmed Characterization and prediction of B-cell epitopes
title_sort characterization and prediction of b-cell epitopes
publishDate 2013
url https://hdl.handle.net/10356/54752
_version_ 1759855807664488448
spelling sg-ntu-dr.10356-547522023-03-04T00:48:39Z Characterization and prediction of B-cell epitopes Liang, Zhao Hoi Chu Hong School of Computer Engineering Bioinformatics Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences DRNTU::Science::Biological sciences::Biomathematics DOCTOR OF PHILOSOPHY (SCE) 2013-08-02T06:25:54Z 2013-08-02T06:25:54Z 2013 2013 Thesis Zhao, L. (2013). Characterization and prediction of B-cell epitopes. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/54752 10.32657/10356/54752 en 185 p. application/pdf