Development of interpretable spiking neural network for multiclass classification

Spiking Neural Networks (SNNs) are the third generation of artificial neural networks; they process inputs asynchronously, through spikes. A spike is a discrete event in the temporal domain, which gives SNNs an additional temporal dimension for processing inputs. SNNs' way of information processing is more biologically realistic and computationally more powerful than Analog Neural Networks (An-NNs). ...


Bibliographic Details
Main Author: Jeyasothy, Abeegithan
Other Authors: Quek Hiok Chai
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/156352
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-156352
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Engineering::Computer science and engineering::Theory of computation::Computation by abstract devices
description Spiking Neural Networks (SNNs) are the third generation of artificial neural networks; they process inputs asynchronously, through spikes. A spike is a discrete event in the temporal domain, which gives SNNs an additional temporal dimension for processing inputs. SNNs' way of information processing is more biologically realistic and computationally more powerful than Analog Neural Networks (An-NNs). This paved the way for developing SNNs that mimic the brain's information processing capability for several cognitive tasks. This thesis aims to address the gaps in the interpretation capability of SNNs while also improving their generalization ability at a lower computational load. In SNNs, the weight model between two spiking neurons modulates the amplitude of the information propagating from one neuron to the other. The majority of developments in SNNs use a single weight model, adapted from long-term plasticity models in biology; in the SNN context, this means the weight does not change after training. However, many types of dynamic synaptic plasticity models exist in biological neural networks to modulate information propagation. In the SNN context, a dynamic synaptic plasticity model can be interpreted as a dynamic weight that changes for different input spikes even after training. Such dynamic weights improve the computational power of networks with asynchronous inputs (such as SNNs), because the asynchronicity of the inputs can fully exploit the dynamic nature of the weights. However, adapting dynamic weights to a supervised learning framework is a bottleneck: during the learning phase only a single weight is learned, and the dynamics of the weight evoked by the input spikes are pre-defined. The use of dynamic weights in SNNs became dormant mainly because of the heavy computational load of modeling the weight dynamics for every incoming spike. Instead of directly adapting dynamic weights from biology, this thesis proposes a learnable time-varying weight model suitable for SNNs. The time-varying weight model is a continuous function learned through a supervised learning framework. It combines the characteristics of the long-term plasticity model and the dynamic synaptic plasticity model: the learned function does not change after training, yet it yields different weights for different input spikes. Time-varying weights exploit the asynchronicity of the inputs in a more meaningful manner and change the connectivity between spiking neurons. This opens up an entirely new area of development of learning algorithms for time-varying weight models, because the learning algorithms developed for SNNs with a single weight model cannot be directly applied to train an SNN with time-varying weights. To this end, this thesis proposes the Synaptic Efficacy Function-based leaky-integrate-and-fire neuRON (SEFRON), a single-output-neuron binary classifier with time-varying weights. In SEFRON, the input neurons are directly connected to one output neuron via time-varying weights. The real-valued inputs are converted to spike times using a population encoding scheme (temporal encoding). A normalized spike-timing-dependent plasticity (STDP) rule is developed to train SEFRON. The output spike time interval is split into two regions to determine the predicted class label for a given input.
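A minimal sketch of the population-encoding and two-region decision steps just described is given below. It is illustrative only: the receptive-field centers, the width factor gamma, the simulation interval T, and the decision threshold t_split are assumed values, not the parameters or implementation used in the thesis.

```python
import numpy as np

def population_encode(x, n_neurons=6, T=3.0, gamma=1.5):
    """Encode one real-valued feature x in [0, 1] as the spike times of
    n_neurons Gaussian receptive-field neurons (stronger response -> earlier spike).
    The centers, width factor gamma, and interval T are illustrative choices."""
    centers = np.linspace(0.0, 1.0, n_neurons)
    sigma = 1.0 / (gamma * (n_neurons - 2))
    responses = np.exp(-((x - centers) ** 2) / (2.0 * sigma ** 2))  # values in (0, 1]
    return T * (1.0 - responses)  # stronger response -> earlier spike time

def sefron_predict(output_spike_time, t_split=1.5):
    """Two-region decision: an output spike before t_split is read as class 1,
    otherwise class 2 (t_split is a placeholder threshold)."""
    return 1 if output_spike_time < t_split else 2

print(population_encode(0.3))   # six spike times in [0, T]
print(sefron_predict(1.2))      # -> 1
```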
The normalized STDP rule determines a single-valued weight update, which is embedded in a Gaussian function to produce the time-varying weight update; the center of the Gaussian function coincides with the input spike time. The resulting time-varying weight is equivalent to a sum of multiple amplitude-modulated Gaussian functions whose centers are located at different times. Other weight models take either a positive or a negative value, whereas a time-varying weight can take both negative and positive values, which enables it to encode more information in a single connection. The performance of SEFRON is evaluated in terms of architecture, computational time, and accuracy. The results show that SEFRON, with a single neuron and time-varying weights, performs comparably to multi-layer, multi-neuron SNN classifiers with single weight models and dynamic weight models; this high performance with a single neuron is attributed to the computational power of the time-varying weights. Since SEFRON is limited to binary classification tasks, this thesis extends the use of the time-varying weight model in SNNs to multiclass classification problems. First, the single-neuron architecture is expanded into a Spiking Neural Network with time-varying weights (SNN-t). The predicted class label for a given input is determined by the output neuron that fires a spike earlier than the other neurons. Encoding real-valued inputs into spike times and producing time-varying weight updates from the single-valued weight update follow the same procedures as in SEFRON. This thesis proposes three algorithms to train the SNN-t architecture for multiclass classification problems. First, the normalized STDP-based algorithm developed for SEFRON is modified to suit the multiclass setting (Mc-SEFRON). Second, a meta-neuron-based learning algorithm (MeST) is developed to improve the generalization ability of SNN-t. Finally, a gradient-descent-based learning algorithm (GradST) is developed to improve generalization and to handle large datasets. For Mc-SEFRON and MeST, the error propagated from one layer to another has to be calculated heuristically, which makes them most suitable for shallow architectures with a single learnable layer; GradST scales to multi-layer architectures. The performance of Mc-SEFRON, MeST, and GradST is evaluated on UCI benchmark datasets: MeST performs better on small datasets, and GradST performs better on large datasets. The scalability of GradST is demonstrated on the MNIST, JAFFE, and CIFAR10 image datasets. On MNIST and JAFFE, although GradST falls short of state-of-the-art performance, it clearly outperforms an SNN with a single weight model and the same architecture as SNN-t. For the CIFAR10 dataset, a hybrid model combining ResNet50 and SNN-t is built; it slightly improves the accuracy of the baseline ResNet model and significantly improves its robustness against gradient-based adversarial attacks. Subsequently, this thesis addresses the interpretation of the predictions made by spiking neural networks for multiclass classification. It proposes a weight transformation method that transforms the weighted spike response from the temporal domain into the feature space to obtain a Generalized Additive Model (GAM).
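The time-varying weight described above, a sum of amplitude-modulated Gaussian functions centered at the spike times that triggered the individual updates, and the first-to-spike multiclass readout used by SNN-t can be sketched as follows. The amplitudes, centers, and width tau below are made-up values for illustration; they are not quantities learned in the thesis.

```python
import numpy as np

def time_varying_weight(t, centers, amplitudes, tau=0.5):
    """w(t) = sum_k u_k * exp(-(t - t_k)^2 / (2 * tau^2)): a sum of
    amplitude-modulated Gaussians, one per past update, centered at the
    presynaptic spike time that triggered it."""
    t = np.atleast_1d(t)[:, None]                       # shape (len(t), 1)
    bumps = amplitudes * np.exp(-((t - centers) ** 2) / (2.0 * tau ** 2))
    return bumps.sum(axis=1)                            # weight value at each query time

def first_to_spike(output_spike_times):
    """Multiclass readout: the earliest-firing output neuron determines the label."""
    return int(np.argmin(output_spike_times))

# Illustrative usage: the same connection is positive near some centers, negative near others.
w = time_varying_weight([0.5, 1.0, 2.0],
                        centers=np.array([0.4, 1.2, 2.1]),
                        amplitudes=np.array([0.8, -0.3, 0.5]))
label = first_to_spike(np.array([2.4, 1.1, 1.9]))       # -> 1 (0-indexed)
```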
GAMs are inherently interpretable and are predominantly used for binary classification problems. The GAM obtained from SNN-t is referred to as the Spiking Additive Model (SAM). In a multiclass setting, the inherent interpretability of GAMs diminishes because of the presence of multiple shape functions, so this thesis also proposes a postprocessing method for multiclass GAMs that improves the visualization of the multiple shape functions and provides a relative interpretation for multiclass classification problems. The performance of the SAMs obtained from Mc-SEFRON, MeST, and GradST is evaluated on large UCI benchmark datasets; the difference in performance between a SAM and its corresponding SNN-t classifier is minimal, indicating the effectiveness of the weight transformation method. Finally, the methods proposed in this thesis are applied to real-world credit scoring problems. The SNN-t classifiers have the lowest class bias and perform better than all the other non-interpretable classifiers, including shallow classifiers with deep learning methods. The performance and advantages of the SNN-t classifiers are preserved in their respective SAMs, and the SAMs outperform other interpretable classifiers. The performance of a SAM can be further improved by improving the performance of its SNN-t. The high performance, low class bias, and interpretability of SAMs make them a favorable choice for high-stakes decision-making applications.
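As a rough illustration of the additive structure that a SAM reduces to, the sketch below scores each class as a sum of per-feature shape functions and predicts the highest-scoring class. The shape functions here are hand-written placeholders rather than functions obtained from transformed SNN-t weights, and the argmax decision rule is an assumption made for illustration.

```python
import numpy as np

def gam_score(x, class_shape_functions):
    """Additive score for one class: sum over features of that class's
    per-feature shape function evaluated at the feature value."""
    return sum(f(v) for f, v in zip(class_shape_functions, x))

def gam_predict(x, shape_functions):
    """shape_functions[c][j] is the shape function of class c for feature j;
    the class with the largest additive score is predicted (illustrative rule)."""
    scores = np.array([gam_score(x, fns) for fns in shape_functions])
    return int(np.argmax(scores))

# Hand-written shape functions for a 2-class, 2-feature toy problem.
shape_functions = [
    [lambda v: 0.5 * v, lambda v: -0.2 * v],   # class 0
    [lambda v: -0.1 * v, lambda v: 0.4 * v],   # class 1
]
print(gam_predict(np.array([1.0, 2.0]), shape_functions))  # -> 1
```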
author2 Quek Hiok Chai
format Thesis-Doctor of Philosophy
author Jeyasothy, Abeegithan
author_sort Jeyasothy, Abeegithan
title Development of interpretable spiking neural network for multiclass classification
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/156352
_version_ 1734310103166746624
spelling sg-ntu-dr.10356-156352 2022-05-04T10:23:15Z Development of interpretable spiking neural network for multiclass classification Jeyasothy, Abeegithan Quek Hiok Chai School of Computer Science and Engineering A*STAR Suresh Sundaram Savitha Ramasamy ASHCQUEK@ntu.edu.sg Doctor of Philosophy 2022-04-13T06:22:42Z 2022-04-13T06:22:42Z 2021 Thesis-Doctor of Philosophy Jeyasothy, A. (2021). Development of interpretable spiking neural network for multiclass classification. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156352 10.32657/10356/156352 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University