Gaze estimation using residual neural network
Main Author: Wong, En Teng
Other Authors: Lee Bu Sung, Francis (School of Computer Science and Engineering)
Format: Final Year Project (FYP)
Language: English
Published: 2018
Degree: Bachelor of Engineering (Computer Science)
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: http://hdl.handle.net/10356/76160
Institution: Nanyang Technological University
Physical Description: 38 p. (application/pdf)
Description:
The eye is a vital source of visual information about emotion, focus and cognitive processes. Eye tracking has proved to be an important tool for researchers in many fields. However, the appearance of the eye is sensitive to a large number of variables, such as lighting conditions, head pose, viewing angle, and the openness and size of the eye. With the emergence of deep learning, many studies have adopted deep learning as an approach to gaze estimation.
This paper explored the use of a Residual Neural Network (ResNet-18) to predict eye gaze using a large public dataset called GazeCapture. ResNet-18 is an 18-layer variant of the Residual Network (ResNet) architecture developed by Microsoft Research; ResNet won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2015 with a top-5 error rate of about 3.5%. GazeCapture is a large-scale eye-tracking dataset collected through crowd-sourcing on Amazon Mechanical Turk; its scale and high degree of variation set it apart from other large public datasets.
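As a rough illustration of the approach described above, a ResNet-18 backbone can be repurposed for 2D gaze regression by swapping its 1000-way ImageNet classification head for a 2-unit output layer that predicts the (x, y) gaze point. The sketch below assumes a PyTorch/torchvision setup; the class name GazeResNet18, the 224x224 input size, the MSE loss and the SGD optimiser are illustrative assumptions, not details taken from the report.

```python
# Minimal sketch (not the thesis code): ResNet-18 adapted for 2D gaze regression.
import torch
import torch.nn as nn
from torchvision import models

class GazeResNet18(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)              # or ImageNet weights
        # Replace the 1000-class head with a 2-unit regressor for (x, y).
        backbone.fc = nn.Linear(backbone.fc.in_features, 2)
        self.backbone = backbone

    def forward(self, face):          # face: (N, 3, 224, 224) image batch
        return self.backbone(face)    # (N, 2) predicted gaze point

model = GazeResNet18()
criterion = nn.MSELoss()              # stand-in for a Euclidean gaze error
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One dummy training step to show the intended usage.
faces = torch.randn(8, 3, 224, 224)
targets = torch.randn(8, 2)
optimizer.zero_grad()
loss = criterion(model(faces), targets)
loss.backward()
optimizer.step()
```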
This paper also analysed the improvements made by preprocessing steps such as a) removing incorrect data, b) different methods of normalisation, c) extracting features such as Euler angles for head pose, and d) using face grids.
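Two of these preprocessing steps can be sketched briefly, assuming OpenCV and NumPy: histogram equalisation of a grayscale crop, and a 25x25 binary face grid that marks where the face bounding box falls in the camera frame (the 25x25 size follows the iTracker convention; the function names and the (x, y, w, h) box format are assumptions for illustration).

```python
# Minimal sketch of two preprocessing steps: histogram equalisation and a face grid.
import cv2
import numpy as np

def equalise(gray_crop: np.ndarray) -> np.ndarray:
    """Histogram-equalise a single-channel 8-bit image crop."""
    return cv2.equalizeHist(gray_crop)

def face_grid(frame_w: int, frame_h: int, face_box, grid_size: int = 25) -> np.ndarray:
    """Binary grid: cells covered by the face bounding box are set to 1."""
    x, y, w, h = face_box                      # box in pixels: top-left + size
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    x0 = int(np.floor(grid_size * x / frame_w))
    y0 = int(np.floor(grid_size * y / frame_h))
    x1 = int(np.ceil(grid_size * (x + w) / frame_w))
    y1 = int(np.ceil(grid_size * (y + h) / frame_h))
    grid[max(y0, 0):min(y1, grid_size), max(x0, 0):min(x1, grid_size)] = 1.0
    return grid

# Example: a 640x480 frame with a face box at (200, 120) of size 180x180.
grid = face_grid(640, 480, (200, 120, 180, 180))
eye = equalise(np.random.randint(0, 256, (64, 64), dtype=np.uint8))
```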
From the experiments, it is concluded that ResNet-18 achieved lower errors than iTracker, which used AlexNet as part of its model architecture. Applying histogram normalisation and removing incorrect data also helped reduce the errors. However, introducing Euler angles did not reduce the errors, owing to their narrow distribution in the dataset.