The BeMi Stardust: A structured ensemble of Binarized Neural Networks
Binarized Neural Networks (BNNs) are receiving increasing attention due to their lightweight architecture and ability to run on low-power devices, given the fact that they can be implemented using Boolean operations. The state-of-the-art for training classification BNNs restricted to few-shot learni...
Saved in:
Main Authors: | , , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2023
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8310 https://ink.library.smu.edu.sg/context/sis_research/article/9313/viewcontent/BeMiStardust_av.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
Summary: | Binarized Neural Networks (BNNs) are receiving increasing attention due to their lightweight architecture and ability to run on low-power devices, given the fact that they can be implemented using Boolean operations. The state-of-the-art for training classification BNNs restricted to few-shot learning is based on a Mixed Integer Programming (MIP) approach. This paper proposes the BeMi ensemble, a structured architecture of classification-designed BNNs based on training a single BNN for each possible pair of classes and applying a majority voting scheme to predict the final output. The training of a single BNN discriminating between two classes is achieved by a MIP model that optimizes a lexicographic multi-objective function according to robustness and simplicity principles. This approach results in training networks whose output is not affected by small perturbations on the input and whose number of active weights is as small as possible, while good accuracy is preserved. We computationally validate our model using the MNIST and Fashion-MNIST datasets using up to 40 training images per class. Our structured ensemble outperforms both BNNs trained by stochastic gradient descent and state-of-the-art MIP-based approaches. While the previous approaches achieve an average accuracy of on the MNIST dataset, the BeMi ensemble achieves an average accuracy of when trained with 10 images per class and when trained with 40 images per class. |
---|