Meta-analysis on the lethality of influenza A viruses using machine learning approaches

Bibliographic Details
Main Author: Yin, Rui
Other Authors: Kwoh Chee Keong
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Subjects:
Online Access: https://hdl.handle.net/10356/140723
Institution: Nanyang Technological University
Description
Summary: Influenza viruses persistently threaten public health, causing annual epidemics and sporadic pandemics. Most influenza viruses reside in avian species because of host range restriction. However, through mutation and reassortment, some avian strains acquire the ability to cross the host species barrier and infect humans. These novel influenza strains may cause high mortality and morbidity. The main aim of this thesis is to analyze the lethality of influenza A virus by profiling its virulence and antigenicity using machine learning approaches. Firstly, potential critical virulent sites are investigated based on the hemagglutinin of the influenza A virus using past pandemic strains. Three rule-based algorithms are used to classify pandemic and non-pandemic strains and to extract the rules. These rules consist of mutations that occurred at the potential critical virulent sites. Fourteen of the sixteen sites detected by our experiments are validated as receptor binding sites or antigenic sites. Secondly, a host tropism framework is proposed to predict avian, human and swine strains. Seven physicochemical properties are used to generate features through Amino Acid Composition (AAC) and the global descriptor CTD (Composition, Transition, Distribution). A novel computational method named HopPER is then developed on top of the host tropism prediction system using random forest. In addition to accurately predicting reassortment for complete genomes, HopPER also demonstrates its effectiveness on incomplete genomes. The analysis of the evolutionary patterns of avian, human and swine strains by HopPER further reveals the reassortment history of the influenza viruses. Thirdly, an integrative model is created to predict influenza virulence, incorporating prior mutation and reassortment information of influenza viruses. Using the mouse lethal dose 50, the virulence of infections is classified as either avirulent or virulent.
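As a rough illustration of the Amino Acid Composition (AAC) features mentioned in the summary, the following minimal Python sketch computes the fraction of each of the 20 standard amino acids in a protein sequence. The function name and the toy sequence are hypothetical, and the sketch does not cover the CTD descriptors or the seven physicochemical properties used in the thesis.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in a fixed order, so every feature vector
# has the same layout regardless of the input sequence.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in a protein sequence."""
    seq = sequence.upper()
    # Ignore gaps and ambiguous symbols (for example '-' or 'X').
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy fragment, not a real hemagglutinin sequence.
features = amino_acid_composition("MKTIIALSYIFCLVFA")
```

Each sequence is mapped to a fixed-length vector of 20 fractions that sum to 1, which is the kind of input a classifier such as a random forest can consume.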
The prior information on mutation and reassortment of the input genomes is obtained from the preceding computational models. By integrating this prior knowledge into all the predictive models using the posterior regularization technique, the proposed framework can improve the performance of virulence prediction. The experimental results validate the effectiveness of the proposed framework for virulence prediction. Moreover, the importance weights of the prior viral information will help biologists gain a better understanding of how mutations influence the degree of virulence. Lastly, it is shown that antigenicity is another crucial factor reflecting viral lethality. A novel algorithm is proposed to predict antigenic variants of influenza A viruses via a 2D convolutional neural network. Specifically, the introduction of a new distributed representation makes it possible to deal with sequence and antigenic data of influenza strains. Squeeze-and-excitation mechanisms are integrated into the convolutional neural networks (CNNs), enabling the networks to focus on informative residue features. Experimental results on three different influenza datasets demonstrate superior performance over existing state-of-the-art computational models. In summary, this thesis elaborates novel methodologies for analyzing the lethality of influenza A virus by predicting its virulence and antigenicity using machine learning approaches. This offers an improvement on existing influenza virologic surveillance and provides an early warning of an impending outbreak.
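The squeeze-and-excitation idea mentioned in the summary can be sketched in a few lines of plain Python. This is a generic illustration of the mechanism, not the network from the thesis: the function name, the tiny feature map, the reduction ratio, and the random weights are all assumptions, and bias terms are omitted for brevity.

```python
import math
import random
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # channels x height x width
Matrix = List[List[float]]


def squeeze_excite(x: FeatureMap, w1: Matrix, w2: Matrix) -> Tuple[FeatureMap, List[float]]:
    """Rescale each channel of x by a learned gate in (0, 1).

    w1 has shape (C // r) x C and w2 has shape C x (C // r), where C is the
    number of channels and r is the reduction ratio.
    """
    # Squeeze: global average pooling gives one summary value per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: fully connected layer followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, x)]
    return scaled, gates


# Hypothetical example: 4 channels of 2x2 features, reduction ratio r = 2.
random.seed(0)
x = [[[random.random() for _ in range(2)] for _ in range(2)] for _ in range(4)]
w1 = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
w2 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
y, gates = squeeze_excite(x, w1, w2)
```

In a trained network the weights `w1` and `w2` are learned, so channels that carry informative features receive gates closer to 1 and less useful channels are suppressed.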