Speech recognition based on spectrograms by using deep learning

Speech recognition is widely used and has become part of daily life. Several large, popular applications have taken its use to another level. Most existing systems use machine learning techniques such as artificial neural networks or fuzzy logic, whereas others may just be b...

Bibliographic Details
Main Author:	Leon, Roy Eduardo Aguilar
Format:	Thesis
Language:	English
Published:	2018
Subjects:	TK Electrical engineering. Electronics Nuclear engineering
Online Access:	http://eprints.utm.my/id/eprint/79538/1/RoyEduardoAguilaMFKE2018.pdf http://eprints.utm.my/id/eprint/79538/
Institution:	Universiti Teknologi Malaysia

id	my.utm.79538
record_format	eprints
spelling	my.utm.79538 2018-10-31T12:54:50Z http://eprints.utm.my/id/eprint/79538/ Speech recognition based on spectrograms by using deep learning Leon, Roy Eduardo Aguilar TK Electrical engineering. Electronics Nuclear engineering
2018 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/79538/1/RoyEduardoAguilaMFKE2018.pdf Leon, Roy Eduardo Aguilar (2018) Speech recognition based on spectrograms by using deep learning. Masters thesis, Universiti Teknologi Malaysia, Faculty of Electrical Engineering.
institution	Universiti Teknologi Malaysia
building	UTM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Malaysia
content_source	UTM Institutional Repository
url_provider	http://eprints.utm.my/
language	English
topic	TK Electrical engineering. Electronics Nuclear engineering
spellingShingle	TK Electrical engineering. Electronics Nuclear engineering Leon, Roy Eduardo Aguilar Speech recognition based on spectrograms by using deep learning
description	Speech recognition is widely used and has become part of daily life. Several large, popular applications have taken its use to another level. Most existing systems use machine learning techniques such as artificial neural networks or fuzzy logic, whereas others may simply be based on a comparative analysis of the sound signals against large lookup tables that contain possible realizations of voice commands. These models base their speech recognition algorithms on the analysis or comparison of the analog acoustic signal itself. Sound has particular characteristics that cannot be seen in the time-domain representation of its propagation wave. This project proposes speech recognition through an innovative model that analyzes the graphical representation of the acoustic signal, its spectrogram. The model therefore does not classify speech through its acoustic signal but through its graphical representation. This leads the research to approach the problem through image classification techniques. Image classification was once considered a task only humans could do; with the development of machine learning techniques, this perception has changed drastically. This project covers several techniques and shows the potential of deep learning for object classification, and within this field presents convolutional neural networks as the most suitable algorithm for the classification of spectrograms. To clearly illustrate the efficacy of the proposed model, the algorithm was trained with two self-obtained datasets. Several experiments were conducted to make a detailed comparison of the system throughput and its levels of accuracy.
format	Thesis
author	Leon, Roy Eduardo Aguilar
author_facet	Leon, Roy Eduardo Aguilar
author_sort	Leon, Roy Eduardo Aguilar
title	Speech recognition based on spectrograms by using deep learning
title_short	Speech recognition based on spectrograms by using deep learning
title_full	Speech recognition based on spectrograms by using deep learning
title_fullStr	Speech recognition based on spectrograms by using deep learning
title_full_unstemmed	Speech recognition based on spectrograms by using deep learning
title_sort	speech recognition based on spectrograms by using deep learning
publishDate	2018
url	http://eprints.utm.my/id/eprint/79538/1/RoyEduardoAguilaMFKE2018.pdf http://eprints.utm.my/id/eprint/79538/
_version_	1643658223890202624
