Meta-analysis on the lethality of influenza A viruses using machine learning approaches

Influenza viruses are persistently threatening public health, causing annual epidemics, and sporadic pandemics. The majority of influenza viruses reside among the avian species due to host range restriction. However, some avian strains do acquire the capability to overcome host species barrier to ca...

Bibliographic Details
Main Author: Yin, Rui
Other Authors: Kwoh Chee Keong
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/140723
Institution: Nanyang Technological University
id sg-ntu-dr.10356-140723
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computer applications::Life and medical sciences
description Influenza viruses persistently threaten public health, causing annual epidemics and sporadic pandemics. The majority of influenza viruses reside among avian species due to host range restriction. However, some avian strains acquire the capability to overcome the host species barrier and cause human infections through mutations and reassortments. These novel influenza strains may cause high mortality and morbidity. The main aim of this thesis is to analyze the lethality of influenza A virus by profiling its virulence and antigenicity using machine learning approaches. Firstly, potential critical virulent sites are investigated based on the hemagglutinin of the influenza A virus using past pandemic strains. Three rule-based algorithms are utilized to classify the pandemic and non-pandemic strains and extract the rules. These rules consist of mutations that occurred on the potential critical virulent sites. Fourteen of the sixteen sites detected by our experiments are validated as receptor binding sites or antigenic sites. Secondly, a host tropism framework is developed to predict avian, human and swine strains. Seven physicochemical properties are used to generate features through Amino Acid Composition (AAC) and the global descriptor CTD (Composition, Transition, Distribution). A novel computational method named HopPER is then developed based on the host tropism prediction system through random forest. In addition to the accurate prediction of reassortment for complete genomes, HopPER also demonstrates its effectiveness on incomplete genomes. The analysis of the evolutionary patterns of avian, human and swine strains by HopPER has further revealed the reassortment history of the influenza viruses. Thirdly, an integrative model is created to predict influenza virulence, incorporating prior mutation and reassortment information of influenza viruses. Using the mouse lethal dose 50, the virulence of the infections is classified as avirulent or virulent. The prior information on mutation and reassortment of input genomes is obtained from the previous computational models. By integrating this prior knowledge into all the predictive models using the posterior regularization technique, the proposed framework can improve the performance of virulence prediction. The experimental results validate the effectiveness of our proposed framework for virulence prediction. Moreover, the importance weights of the prior viral information will help biologists gain a better understanding of how the mutations influence the degree of virulence. Lastly, it is shown that antigenicity is another crucial factor reflecting viral lethality. A novel algorithm is proposed to predict antigenic variants of influenza A viruses via a 2D convolutional neural network. Specifically, the introduction of a new distributed representation makes it possible to deal with sequence and antigenic data of influenza strains. Squeeze-and-excitation mechanisms are integrated into the convolutional neural networks (CNNs), which enables the networks to focus on informative residue features. Experimental results on three different influenza datasets have demonstrated superior performance over the existing state-of-the-art computational models. In summary, this thesis elaborates novel methodologies for the analysis of the lethality of influenza A virus through predicting its virulence and antigenicity using machine learning approaches. This offers an improvement on existing influenza virologic surveillance and provides an early warning of an impending outbreak.
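The abstract names Amino Acid Composition (AAC) as one source of features. As a rough illustration only, AAC in its plain form is the fraction of each of the 20 standard amino acids in a protein sequence. The sketch below assumes that plain form; the function name, the toy sequence, and the handling of non-standard characters are all hypothetical and are not taken from the thesis, which additionally uses seven physicochemical properties and CTD descriptors.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in a fixed (alphabetical) order, so every
# sequence maps to a feature vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Assumption (not from the thesis): characters outside the 20 standard
    residues, such as gaps or 'X', are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy fragment, not a real hemagglutinin sequence.
features = amino_acid_composition("MKTIIALSYIFCLVFA")
print(len(features))  # one value per standard amino acid
```

A vector like this could be fed to a classifier such as a random forest; how the thesis combines AAC with the physicochemical properties is not specified in this record.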
author2 Kwoh Chee Keong
author_facet Kwoh Chee Keong
Yin, Rui
format Thesis-Doctor of Philosophy
author Yin, Rui
author_sort Yin, Rui
title Meta-analysis on the lethality of influenza A viruses using machine learning approaches
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/140723
_version_ 1683493515965759488
spelling sg-ntu-dr.10356-140723 2020-10-28T08:40:51Z Meta-analysis on the lethality of influenza A viruses using machine learning approaches Yin, Rui Kwoh Chee Keong School of Computer Science and Engineering Bioinformatics Research Centre ASCKKWOH@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Engineering::Computer science and engineering::Computer applications::Life and medical sciences Doctor of Philosophy 2020-06-01T10:14:23Z 2020-06-01T10:14:23Z 2020 Thesis-Doctor of Philosophy Yin, R. (2020). Meta-analysis on the lethality of influenza A viruses using machine learning approaches. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/140723 10.32657/10356/140723 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University