Using mutual information to evaluate the generalization capability of deep learning neural networks

There is a need to better understand how generalization works in deep learning models. The goal of this paper is to provide a clearer view into the black box that is a neural network. This is done by using information theory to compute the flow of information within a network. The proposed framework uses an indicator that computes the mutual information of all hidden layers within the deep learning model. The indicator represents the predictive capability of the neural network, and its evolution over training provides a further level of analysis of the network's generalization capability. By using information theory, we can express the flow of information within a previously unseen black box. The framework is a conceptual platform for analysing a deep learning model: users perform analysis with the functions provided, which include computing mutual information, applying the indicator, and visualization. Experiments were conducted using different methods of computing mutual information to study their effects on deep learning models. We find that our indicator-based method overcomes the shortcomings of the non-linear information bottleneck objective function: it computes the average mutual information across all hidden layers, which yields a better estimate than the objective function, which considers only the mutual information of a single intermediate representation. One advantage of the proposed framework is that it is not restricted to a particular kind of neural network. Furthermore, it operates on probability distribution functions, so it does not rely on the presence of the actual dataset. The focus of this paper is the use, rather than the computation, of mutual information; therefore methods such as the non-linear information bottleneck, or a neural network trained to estimate the mutual information of given datasets, can be used to compute it. The framework also provides a general way to observe the learning process of a neural network, and it is available in a public repository.
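The indicator described in this record — the average mutual information between each hidden layer's activations and the target labels — can be sketched with a simple plug-in (histogram binning) estimator. The following is an illustrative reconstruction, not the author's released code: the function names, the bin count, and the simplification of summarizing each layer by a single scalar activation per sample are assumptions made for brevity.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # Sum over observed (x, y) pairs: p(x,y) * log2(p(x,y) / (p(x)p(y)))
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


def discretize(values, n_bins=8):
    """Bin continuous activations into equal-width bins (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def indicator(layer_activations, labels, n_bins=8):
    """Average I(T_l; Y) over all hidden layers T_l, per the paper's idea."""
    mis = [mutual_information(discretize(acts, n_bins), labels)
           for acts in layer_activations]
    return sum(mis) / len(mis)
```

Under this sketch, a layer whose binned activations perfectly predict the labels contributes H(Y) bits, a constant layer contributes zero, and the indicator averages these contributions; tracking it over training epochs would give the evolution curve the abstract refers to.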


Bibliographic Details
Main Author: Kan, Shawn Jung Tze
Other Authors: Althea Liang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Computer science and engineering::Information systems
Online Access:https://hdl.handle.net/10356/137910
Institution: Nanyang Technological University
id sg-ntu-dr.10356-137910
record_format dspace
spelling sg-ntu-dr.10356-137910 2020-04-18T03:35:27Z Using mutual information to evaluate the generalization capability of deep learning neural networks Kan, Shawn Jung Tze Althea Liang School of Computer Science and Engineering qhliang@ntu.edu.sg Engineering::Computer science and engineering::Information systems There is a need to better understand how generalization works in deep learning models. The goal of this paper is to provide a clearer view into the black box that is a neural network. This is done by using information theory to compute the flow of information within a network. The proposed framework uses an indicator that computes the mutual information of all hidden layers within the deep learning model. The indicator represents the predictive capability of the neural network, and its evolution over training provides a further level of analysis of the network's generalization capability. By using information theory, we can express the flow of information within a previously unseen black box. The framework is a conceptual platform for analysing a deep learning model: users perform analysis with the functions provided, which include computing mutual information, applying the indicator, and visualization. Experiments were conducted using different methods of computing mutual information to study their effects on deep learning models. We find that our indicator-based method overcomes the shortcomings of the non-linear information bottleneck objective function: it computes the average mutual information across all hidden layers, which yields a better estimate than the objective function, which considers only the mutual information of a single intermediate representation. One advantage of the proposed framework is that it is not restricted to a particular kind of neural network. Furthermore, it operates on probability distribution functions, so it does not rely on the presence of the actual dataset.
The focus of this paper is the use, rather than the computation, of mutual information; therefore methods such as the non-linear information bottleneck, or a neural network trained to estimate the mutual information of given datasets, can be used to compute it. The framework also provides a general way to observe the learning process of a neural network, and it is available in a public repository. Bachelor of Engineering (Computer Science) 2020-04-18T03:35:27Z 2020-04-18T03:35:27Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/137910 en SCSE19-0092 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Information systems
description There is a need to better understand how generalization works in deep learning models. The goal of this paper is to provide a clearer view into the black box that is a neural network. This is done by using information theory to compute the flow of information within a network. The proposed framework uses an indicator that computes the mutual information of all hidden layers within the deep learning model. The indicator represents the predictive capability of the neural network, and its evolution over training provides a further level of analysis of the network's generalization capability. By using information theory, we can express the flow of information within a previously unseen black box. The framework is a conceptual platform for analysing a deep learning model: users perform analysis with the functions provided, which include computing mutual information, applying the indicator, and visualization. Experiments were conducted using different methods of computing mutual information to study their effects on deep learning models. We find that our indicator-based method overcomes the shortcomings of the non-linear information bottleneck objective function: it computes the average mutual information across all hidden layers, which yields a better estimate than the objective function, which considers only the mutual information of a single intermediate representation. One advantage of the proposed framework is that it is not restricted to a particular kind of neural network. Furthermore, it operates on probability distribution functions, so it does not rely on the presence of the actual dataset. The focus of this paper is the use, rather than the computation, of mutual information; therefore methods such as the non-linear information bottleneck, or a neural network trained to estimate the mutual information of given datasets, can be used to compute it.
The framework also provides a general way to observe the learning process of a neural network, and it is available in a public repository.
author2 Althea Liang
author_facet Althea Liang
Kan, Shawn Jung Tze
format Final Year Project
author Kan, Shawn Jung Tze
author_sort Kan, Shawn Jung Tze
title Using mutual information to evaluate the generalization capability of deep learning neural networks
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/137910
_version_ 1681057256794226688