Deep learning techniques to derive descriptions from audio signals
With the rapid growth of the Internet, the amount of video and audio data is increasing sharply. With the development of big data and artificial intelligence, audio analysis and recognition technology has become more important. As the audio classification requirement increases, to classify audio and ge...
Main Author: | Wu, Mengkai |
---|---|
Other Authors: | Jagath C Rajapakse |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
Online Access: | https://hdl.handle.net/10356/138858 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-138858 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1388582020-05-13T06:47:47Z Deep learning techniques to derive descriptions from audio signals Wu, Mengkai Jagath C Rajapakse School of Computer Science and Engineering ASJagath@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence With the rapid growth of the Internet, the amount of video and audio data is increasing sharply. With the development of big data and artificial intelligence, audio analysis and recognition technology has become more important. As demand for audio classification grows, many methods have been introduced to classify audio and generate descriptions. This project uses machine learning to achieve the classification goal by building a model with Convolutional Neural Networks, or other neural networks such as Recurrent Neural Networks, to categorize the audio and generate a description for it. This paper presents the research I have done on generating audio descriptions using different neural network models and approaches. It covers audio data downloading, feature extraction, image generation, and classifier training, through to the final audio description design and implementation. After comparing a few types of deep neural networks, we found that deep convolutional neural networks have better overall accuracy. Bachelor of Engineering (Computer Science) 2020-05-13T06:47:46Z 2020-05-13T06:47:46Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/138858 en PSCSE18-0064 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
spellingShingle |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Wu, Mengkai Deep learning techniques to derive descriptions from audio signals |
description |
With the rapid growth of the Internet, the amount of video and audio data is increasing sharply. With the development of big data and artificial intelligence, audio analysis and recognition technology has become more important.
As demand for audio classification grows, many methods have been introduced to classify audio and generate descriptions. This project uses machine learning to achieve the classification goal by building a model with Convolutional Neural Networks, or other neural networks such as Recurrent Neural Networks, to categorize the audio and generate a description for it.
This paper presents the research I have done on generating audio descriptions using different neural network models and approaches. It covers audio data downloading, feature extraction, image generation, and classifier training, through to the final audio description design and implementation. After comparing a few types of deep neural networks, we found that deep convolutional neural networks have better overall accuracy. |
author2 |
Jagath C Rajapakse |
author_facet |
Jagath C Rajapakse Wu, Mengkai |
format |
Final Year Project |
author |
Wu, Mengkai |
author_sort |
Wu, Mengkai |
title |
Deep learning techniques to derive descriptions from audio signals |
title_sort |
deep learning techniques to derive descriptions from audio signals |
publisher |
Nanyang Technological University |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/138858 |
_version_ |
1681057745919279104 |