Assessing rotational invariance of graph convolution neural networks for computer vision

Bibliographic Details
Main Author: Singh, Priyanshu Kumar
Other Authors: Xavier Bresson
Format: Final Year Project
Language: English
Published: 2018
Subjects:
Online Access:http://hdl.handle.net/10356/74246
Institution: Nanyang Technological University
Description
Summary: This project aims to assess the property of rotational invariance within graph convolution neural networks (graph CNNs) for learning on images. Standard CNNs possess translational invariance due to the sliding nature of the convolution operation, and rotational invariance for small angles due to the pooling operation. However, there remains a need for a CNN model that can learn truly rotationally invariant filters, so that it performs well on image datasets where images preserve their identity irrespective of orientation. Graphs are isotropic in nature: their edges carry no sense of direction. This project leverages that property to show that if images are converted into grid graphs and graph CNNs are then used to learn spectral filters over them, those filters are rotationally invariant. Two architectures, based on LeNet-5 and VGG16 respectively, were used to conduct comparative experiments between graph CNNs and standard CNNs, and it was found that graph CNNs showed substantially more invariance to rotation than standard CNNs. The experiments were conducted on the MNIST and CIFAR-10 datasets, and it was observed that graph CNNs were able to learn rotationally invariant spectral filters for both datasets. Furthermore, deeper architectures were able to encode more rotational invariance within their learned filters than shallower ones.
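
To make the pipeline in the summary concrete, the sketch below (not the project's actual code) represents an image as a 4-neighbour grid graph and applies a spectral filter, here a Chebyshev polynomial of the normalized graph Laplacian in the style of ChebNet (Defferrard, Bresson and Vandergheynst). The grid size, filter order K, and the random filter coefficients are illustrative assumptions. Because the filter is a polynomial of the Laplacian, it depends only on the graph's connectivity and not on any edge orientation, which is the isotropy the project exploits.

```python
# Minimal illustrative sketch: spectral (Chebyshev) graph filtering of an
# image treated as a signal on a grid graph. Assumed names throughout.
import numpy as np
import scipy.sparse as sp

def grid_graph_laplacian(h, w):
    """Normalized Laplacian of an h x w 4-neighbour grid graph."""
    n = h * w
    rows, cols = [], []
    for i in range(h):
        for j in range(w):
            u = i * w + j
            if j + 1 < w:            # edge to right neighbour
                rows += [u, u + 1]; cols += [u + 1, u]
            if i + 1 < h:            # edge to bottom neighbour
                rows += [u, u + w]; cols += [u + w, u]
    A = sp.coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(n, n))
    d = np.asarray(A.sum(axis=1)).flatten()
    D_inv_sqrt = sp.diags(1.0 / np.sqrt(d))
    # L = I - D^{-1/2} A D^{-1/2}, eigenvalues bounded by 2
    return sp.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt

def chebyshev_filter(L, x, theta):
    """Apply sum_k theta_k T_k(L~) x, with L~ = 2 L / lmax - I."""
    lmax = 2.0                       # upper bound for the normalized Laplacian
    L_tilde = (2.0 / lmax) * L - sp.eye(L.shape[0])
    t0, t1 = x, L_tilde @ x          # Chebyshev recurrence: T_0 x, T_1 x
    out = theta[0] * t0 + theta[1] * t1
    for k in range(2, len(theta)):
        t2 = 2 * (L_tilde @ t1) - t0  # T_k = 2 L~ T_{k-1} - T_{k-2}
        out += theta[k] * t2
        t0, t1 = t1, t2
    return out

# Usage: filter a 28 x 28 "image" (e.g. an MNIST digit) on its grid graph.
L = grid_graph_laplacian(28, 28)
x = np.random.rand(28 * 28)          # flattened image as a graph signal
theta = np.random.randn(5)           # K = 5 illustrative filter coefficients
y = chebyshev_filter(L, x, theta)
print(y.shape)                       # (784,)
```

In a trained graph CNN the coefficients theta would be learned by backpropagation rather than drawn at random; the point of the sketch is only that the filtering operation itself has no preferred direction on the grid.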