Hardware modeling development of a convolutional neural network with K-means-clustered weights in rapid prototyping systems: Advances and limitations

Bibliographic Details
Main Author: Yap, Roderick Y.
Format: text
Language: English
Published: Animo Repository 2019
Subjects:
Online Access: https://animorepository.dlsu.edu.ph/etd_doctoral/1459
https://animorepository.dlsu.edu.ph/context/etd_doctoral/article/2514/viewcontent/Yap__Roderick_Y._disertation_2.pdf
Institution: De La Salle University
Description
Summary: Neural networks and clustering are two of the many machine learning algorithms used for artificial intelligence. The conventional neural network is made up of numerous fully connected layers of neurons. Convolutional Neural Networks (CNNs), on the other hand, have become a better alternative to the conventional neural network due to their ability to provide a better guarantee of training success. In designing a hardware model for a CNN, emphasis falls not only on the hardware required by the size and number of processing layers but also on the hardware required by the weight values. In this research, a hardware model design for a CNN architecture is presented. The hardware model is capable of training by itself without the aid of any external processor or coprocessor. A hardware model design for the K-means clustering algorithm is also presented. The K-means clustering model is intended to compress the weights of the CNN in order to reduce the hardware required for implementation. The CNN model and the K-means clustering model are then integrated to develop a CNN architecture that can perform weight compression by itself after training. The two hardware models are synthesized and implemented using a Xilinx Virtex 5 library. A small-scale CNN for pattern recognition shows that the network can still recognize the input patterns at a compression rate of up to 80%. Another small-scale CNN for selected digit image recognition shows 100% recognition of trained inputs at up to 60% compression. The integrated design, when synthesized using the Virtex 5 library, consumes 29,163 slice registers, 28,896 flip-flops, and 55,645 lookup tables.
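
The weight-compression idea in the summary can be illustrated in software. The following is a minimal sketch, not the dissertation's hardware design: it clusters a flat list of trained weights with one-dimensional K-means (Lloyd's algorithm) so that each weight is replaced by an index into a small codebook of centroids. The function name `kmeans_1d`, the evenly spaced centroid initialization, and the sample weights are all assumptions made for illustration; the record does not state how the thesis initializes centroids or how it defines its compression rate.

```python
def kmeans_1d(values, k, max_iters=100):
    """Cluster scalar values into k groups with Lloyd's algorithm.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid nearest to values[i].
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")

    lo, hi = min(values), max(values)
    # Assumption: initial centroids are spaced evenly over the weight range.
    if k == 1:
        centroids = [(lo + hi) / 2]
    else:
        centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]

    def assign(cents):
        # Assignment step: map each value to its nearest centroid.
        return [min(range(k), key=lambda j: (v - cents[j]) ** 2) for v in values]

    for _ in range(max_iters):
        assignments = assign(centroids)
        updated = []
        for j in range(k):
            members = [v for v, a in zip(values, assignments) if a == j]
            # Update step: an empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated

    return centroids, assign(centroids)


# Hypothetical trained weights, chosen only to make the clusters obvious.
weights = [-0.9, -1.0, -1.1, 0.0, 0.1, -0.1, 1.0, 0.9, 1.1]

codebook, indices = kmeans_1d(weights, k=3)

# Reconstruct the weights the network would use after compression.
restored = [codebook[i] for i in indices]

print("codebook:", codebook)
print("indices: ", indices)
print("distinct stored values:", len(set(restored)), "of", len(weights))
```

In this sketch, nine weights are represented by three codebook entries plus one small index per weight, which is the general mechanism by which clustering reduces weight storage. How that saving maps onto the 60% and 80% compression figures reported above depends on definitions given in the full text, not in this record.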