Assessing rotational invariance of graph convolution neural networks for computer vision
Main Author:
Other Authors:
Format: Final Year Project
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/74246
Institution: Nanyang Technological University
Summary: This project aims to assess the property of rotational invariance within graph convolutional neural networks (graph CNNs) for learning on images. Standard CNNs possess translational invariance due to the sliding nature of the convolution operation, and rotational invariance only for small angles, due to the pooling operation. There is therefore still a need for a CNN model that can learn truly rotationally invariant filters, so that it performs well on image datasets where images preserve their identity irrespective of orientation. Graphs are isotropic in nature: their edges carry no sense of direction. This project leverages that property to demonstrate that if images are converted into grid graphs and graph CNNs are used to learn spectral filters over them, those filters are rotationally invariant. Two architectures, based on LeNet-5 and VGG16 respectively, were used for comparative experiments between graph CNNs and standard CNNs, and the graph CNNs showed much more invariance to rotation than the standard CNNs. The experiments were conducted on the MNIST and CIFAR-10 datasets, and graph CNNs learned rotationally invariant spectral filters on both. Furthermore, deeper architectures encoded more rotational invariance in their learned filters than shallower ones.
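As a concrete illustration of the idea described in the summary, the sketch below treats an image as a signal on a 4-connected grid graph and applies a spectral filter defined through the normalized graph Laplacian. This is a minimal sketch, not the project's code: the Chebyshev-polynomial filter parameterization follows Defferrard et al. (2016) and is an assumption, since the summary does not name the specific spectral graph CNN used. Because the filter is a function of the Laplacian alone, which encodes no edge directions, it is isotropic.

```python
# Minimal sketch (assumed, not the project's actual implementation):
# spectral filtering of an image viewed as a 4-connected grid graph.
import numpy as np
import scipy.sparse as sp

def grid_graph(h, w):
    """Adjacency matrix of a 4-connected h x w grid graph."""
    n = h * w
    A = sp.lil_matrix((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            if c + 1 < w:                    # right neighbour
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < h:                    # bottom neighbour
                A[i, i + w] = A[i + w, i] = 1.0
    return A.tocsr()

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2}; undirected, so L has no edge directions."""
    d = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    D = sp.diags(d_inv_sqrt)
    return sp.eye(A.shape[0]) - D @ A @ D

def chebyshev_filter(L, x, theta):
    """Apply sum_k theta[k] * T_k(L_hat) x, with L_hat = 2L/lmax - I.

    Chebyshev parameterization as in Defferrard et al. (2016); the
    coefficients theta would be learned in a real graph CNN.
    """
    lmax = 2.0  # valid upper bound on the normalized Laplacian's spectrum
    L_hat = (2.0 / lmax) * L - sp.eye(L.shape[0])
    t_prev, t_curr = x, L_hat @ x            # T_0 x and T_1 x
    out = theta[0] * t_prev + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_next = 2.0 * (L_hat @ t_curr) - t_prev  # Chebyshev recurrence
        out += theta[k] * t_next
        t_prev, t_curr = t_curr, t_next
    return out

# Usage: filter a 28 x 28 image (e.g., an MNIST digit) with a K = 4 filter.
h = w = 28
x = np.random.rand(h * w)                    # flattened image signal
L = normalized_laplacian(grid_graph(h, w))
theta = np.array([0.5, -0.3, 0.1, 0.05])     # illustrative coefficients
y = chebyshev_filter(L, x, theta)
print(y.shape)                               # (784,)
```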