Gaze estimation using residual neural network
Main Author: Wong, En Teng
Other Authors: Lee Bu Sung, Francis (School of Computer Science and Engineering)
Format: Final Year Project (FYP)
Language: English
Published: 2018
Degree: Bachelor of Engineering (Computer Science)
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: http://hdl.handle.net/10356/76160
Institution: Nanyang Technological University
Physical Description: 38 p. (application/pdf)
Description:
The eye is a vital source of visual information about emotion, focus and cognitive processes. Eye tracking has proved to be an important tool for researchers in many fields. However, the appearance of the eye is sensitive to a large number of variables, such as lighting conditions, head pose, viewing angle, and the openness and size of the eye. With the emergence of deep learning, many studies have adopted deep learning as an approach to gaze estimation.
This paper explored the use of a Residual Neural Network (ResNet-18) to predict eye gaze using a large public dataset called GazeCapture. ResNet-18 is an 18-layer variant of the Residual Network (ResNet) architecture developed by Microsoft Research; ResNet won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2015 with a top-5 error rate of about 3.5%. GazeCapture is a large-scale eye-tracking dataset collected through crowd-sourcing on Amazon Mechanical Turk; its scale and high degree of variation set it apart from other large public datasets.
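As a rough illustration of the approach described above, a ResNet-18 backbone can be repurposed for 2D gaze regression by swapping its 1000-way ImageNet classification head for a 2-unit output layer that predicts the (x, y) gaze point. The sketch below assumes a PyTorch/torchvision setup; the class name GazeResNet18, the 224x224 input size, the MSE loss and the SGD optimiser are illustrative assumptions, not details taken from the report.

```python
# Minimal sketch (not the thesis code): ResNet-18 adapted for 2D gaze regression.
import torch
import torch.nn as nn
from torchvision import models

class GazeResNet18(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)              # or ImageNet weights
        # Replace the 1000-class head with a 2-unit regressor for (x, y).
        backbone.fc = nn.Linear(backbone.fc.in_features, 2)
        self.backbone = backbone

    def forward(self, face):          # face: (N, 3, 224, 224) image batch
        return self.backbone(face)    # (N, 2) predicted gaze point

model = GazeResNet18()
criterion = nn.MSELoss()              # stand-in for a Euclidean gaze error
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One dummy training step to show the intended usage.
faces = torch.randn(8, 3, 224, 224)
targets = torch.randn(8, 2)
optimizer.zero_grad()
loss = criterion(model(faces), targets)
loss.backward()
optimizer.step()
```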
This paper also analysed the improvements made by preprocessing steps such as a) removing incorrect data, b) different methods of normalisation, c) extracting features such as Euler angles for head pose, and d) using face grids.
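Two of these preprocessing steps can be sketched briefly, assuming OpenCV and NumPy: histogram equalisation of a grayscale crop, and a 25x25 binary face grid that marks where the face bounding box falls in the camera frame (the 25x25 size follows the iTracker convention; the function names and the (x, y, w, h) box format are assumptions for illustration).

```python
# Minimal sketch of two preprocessing steps: histogram equalisation and a face grid.
import cv2
import numpy as np

def equalise(gray_crop: np.ndarray) -> np.ndarray:
    """Histogram-equalise a single-channel 8-bit image crop."""
    return cv2.equalizeHist(gray_crop)

def face_grid(frame_w: int, frame_h: int, face_box, grid_size: int = 25) -> np.ndarray:
    """Binary grid: cells covered by the face bounding box are set to 1."""
    x, y, w, h = face_box                      # box in pixels: top-left + size
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    x0 = int(np.floor(grid_size * x / frame_w))
    y0 = int(np.floor(grid_size * y / frame_h))
    x1 = int(np.ceil(grid_size * (x + w) / frame_w))
    y1 = int(np.ceil(grid_size * (y + h) / frame_h))
    grid[max(y0, 0):min(y1, grid_size), max(x0, 0):min(x1, grid_size)] = 1.0
    return grid

# Example: a 640x480 frame with a face box at (200, 120) of size 180x180.
grid = face_grid(640, 480, (200, 120, 180, 180))
eye = equalise(np.random.randint(0, 256, (64, 64), dtype=np.uint8))
```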
From the experiments, it is concluded that ResNet-18 achieved lower errors than iTracker, which used AlexNet as part of its model architecture. Applying histogram normalisation and removing incorrect data also helped reduce the errors. However, introducing Euler angles did not reduce the errors, owing to their narrow distribution in the dataset.