Machine learning for lip reading

Lip-reading is one of the most challenging tasks in visual recognition systems. It decodes text from the movement of the speaker's lips. In previous approaches, the lip-reading problem is divided into two stages: feature extraction and prediction. The Hidden Markov Model is implemented to solve...

Bibliographic Details
Main Author: Zhao, Han
Other Authors: Andy Khong Wai Hoong
Format: Final Year Project
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: http://hdl.handle.net/10356/74671
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74671
record_format dspace
spelling sg-ntu-dr.10356-74671 2023-07-07T17:34:42Z Machine learning for lip reading Zhao, Han Andy Khong Wai Hoong School of Electrical and Electronic Engineering Centre for Signal Processing DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems Lip-reading is one of the most challenging tasks in visual recognition systems. It decodes text from the movement of the speaker's lips. In previous approaches, the lip-reading problem is divided into two stages: feature extraction and prediction. The Hidden Markov Model is implemented to solve the sequence problem. However, the traditional approaches require a lot of effort on feature extraction. Also, the models are trained to perform single-word classification rather than sentence-level prediction. This project aims to build an end-to-end, sentence-level lip-reading system using neural networks and deep learning methods. A convolutional neural network (CNN), a recurrent neural network (RNN) and the connectionist temporal classification (CTC) method will be implemented in the neural network. The GRID dataset is used in this project. Several speech videos from the GRID dataset will be used as training data. Bachelor of Engineering 2018-05-23T02:00:17Z 2018-05-23T02:00:17Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74671 en Nanyang Technological University 55 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Zhao, Han
Machine learning for lip reading
description Lip-reading is one of the most challenging tasks in visual recognition systems. It decodes text from the movement of the speaker's lips. In previous approaches, the lip-reading problem is divided into two stages: feature extraction and prediction. The Hidden Markov Model is implemented to solve the sequence problem. However, the traditional approaches require a lot of effort on feature extraction. Also, the models are trained to perform single-word classification rather than sentence-level prediction. This project aims to build an end-to-end, sentence-level lip-reading system using neural networks and deep learning methods. A convolutional neural network (CNN), a recurrent neural network (RNN) and the connectionist temporal classification (CTC) method will be implemented in the neural network. The GRID dataset is used in this project. Several speech videos from the GRID dataset will be used as training data.
author2 Andy Khong Wai Hoong
author_facet Andy Khong Wai Hoong
Zhao, Han
format Final Year Project
author Zhao, Han
author_sort Zhao, Han
title Machine learning for lip reading
publishDate 2018
url http://hdl.handle.net/10356/74671
_version_ 1772825160281227264