Gaze estimation using residual neural network

The eye is a vital source of visual information about emotion, focus and cognitive processes. Eye tracking has proved to be an important tool for researchers in multiple fields. However, the appearance of the eye is sensitive to a large number of variables such as lighting conditions, head pose, viewin...

Bibliographic Details
Main Author: Wong, En Teng
Other Authors: Lee Bu Sung, Francis
Format: Final Year Project
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/76160
Institution: Nanyang Technological University
id sg-ntu-dr.10356-76160
record_format dspace
spelling sg-ntu-dr.10356-76160 2023-03-03T20:37:07Z Gaze estimation using residual neural network Wong, En Teng Lee Bu Sung, Francis School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Bachelor of Engineering (Computer Science) 2018-11-21T14:07:49Z 2018-11-21T14:07:49Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/76160 en Nanyang Technological University 38 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
spellingShingle DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Wong, En Teng
Gaze estimation using residual neural network
description The eye is a vital source of visual information about emotion, focus and cognitive processes. Eye tracking has proved to be an important tool for researchers in multiple fields. However, the appearance of the eye is sensitive to a large number of variables such as lighting conditions, head pose, viewing angle, and the openness and size of the eye. With the emergence of deep learning, many researchers have adopted it as an approach to gaze estimation. This paper explored the use of a residual neural network (ResNet-18) to predict eye gaze using a massive public dataset called GazeCapture. ResNet-18 is an 18-layer variant of the ResNet architecture developed at Microsoft Research Asia; an ensemble of deeper ResNet models won the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2015 with a top-5 error rate of 3.57%. GazeCapture is a large-scale eye tracking dataset collected through crowd-sourcing using Amazon Mechanical Turk. GazeCapture offers scalability and a high degree of variation, which sets it apart from other large public datasets. This paper also analysed the improvements made by preprocessing steps such as a) removing incorrect data, b) methods of normalisation, c) extracting features such as Euler angles for head pose and d) using face grids. The experiments show that ResNet-18 achieved lower errors than iTracker, which uses AlexNet as part of its model architecture. Applying histogram normalisation and removing incorrect data also helped to reduce the errors. However, introducing Euler angles was not useful in reducing the errors because of their narrow distribution.
author2 Lee Bu Sung, Francis
author_facet Lee Bu Sung, Francis
Wong, En Teng
format Final Year Project
author Wong, En Teng
author_sort Wong, En Teng
title Gaze estimation using residual neural network
title_short Gaze estimation using residual neural network
title_full Gaze estimation using residual neural network
title_fullStr Gaze estimation using residual neural network
title_full_unstemmed Gaze estimation using residual neural network
title_sort gaze estimation using residual neural network
publishDate 2018
url http://hdl.handle.net/10356/76160
_version_ 1759856784511598592
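
Illustrative note on the preprocessing named in the description. The record does not include the project's code, so the following is only a minimal sketch under stated assumptions: "histogram normalisation" is read here as plain histogram equalisation of an 8-bit grayscale image, and a "face grid" is read as a binary mask marking where the face bounding box sits in the camera frame. The function names `equalize_histogram` and `face_grid`, the (x, y, w, h) box layout and the 25 x 25 default grid size are hypothetical choices for illustration, not details confirmed by the source.

```python
import numpy as np


def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Histogram-equalise a 2-D uint8 image (assumed reading of
    'histogram normalisation'; the project's exact method is not stated)."""
    hist = np.bincount(gray.ravel(), minlength=256)  # pixel count per level
    cdf = hist.cumsum()                              # cumulative pixel count
    cdf_min = cdf[cdf > 0][0]                        # count at the darkest level present
    total = cdf[-1]
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return gray.copy()
    # Map each level so the output CDF is roughly linear over 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


def face_grid(frame_w: int, frame_h: int, box: tuple, grid_size: int = 25) -> np.ndarray:
    """Binary grid_size x grid_size mask of the face box within the frame.

    box is (x, y, w, h) in pixels; the layout and the default grid size
    are assumptions made for this sketch.
    """
    x, y, w, h = box
    grid = np.zeros((grid_size, grid_size), dtype=np.uint8)
    # Scale pixel coordinates to grid cells, covering every touched cell.
    x0 = int(np.floor(x * grid_size / frame_w))
    y0 = int(np.floor(y * grid_size / frame_h))
    x1 = int(np.ceil((x + w) * grid_size / frame_w))
    y1 = int(np.ceil((y + h) * grid_size / frame_h))
    # Clamp to the grid in case the box extends past the frame.
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, grid_size), min(y1, grid_size)
    grid[y0:y1, x0:x1] = 1
    return grid
```

In a pipeline of this kind, the equalised eye or face crop would be the image input to the network, and the flattened grid would be an auxiliary input that tells the model where the face was in the frame.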