A layer-wise theoretical framework for deep learning of convolutional neural networks

As research attention in deep learning has been focusing on pushing empirical results to a higher peak, remarkable progress has been made in the performance race of machine learning applications in the past years. Yet deep learning based on artificial neural networks still remains difficult to under...

Full description

Saved in:
Bibliographic Details
Main Authors: Nguyen, Huu-Thiet, Li, Sitan, Cheah, Chien Chern
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language:English
Published: 2022
Subjects:
Online Access:https://hdl.handle.net/10356/155225
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:As research attention in deep learning has been focusing on pushing empirical results to a higher peak, remarkable progress has been made in the performance race of machine learning applications in the past years. Yet deep learning based on artificial neural networks still remains difficult to understand as it is considered as a black-box approach. A lack of understanding of deep learning networks from the theoretical perspective would not only hinder the employment of them in applications where high-stakes decisions need to be made, but also limit their future development where artificial intelligence is expected to be robust, predictable and trustable. This paper aims to provide a theoretical methodology to investigate and train deep convolutional neural networks so as to ensure convergence. A mathematical model based on matrix representations for convolutional neural networks is first formulated and an analytic layer-wise learning framework for convolutional neural networks is then proposed and tested on several common benchmarking image datasets. The case studies show a reasonable trade-off between accuracy and analytic learning, and also highlight the potential of employing the proposed layer-wise learning method in finding the appropriate number of layers in actual implementations.