Urban sound analysis and synthesis using artificial intelligence

With the advent of artificial intelligence and machine learning, multiple industries have gone through different kinds of revolution. For example, convolutional neural networks have drastically changed the conventional ways for computers to capture features of images and video, also known as computer vi...

Bibliographic Details
Main Author: Guo, Zixun
Other Authors: Gan Woon Seng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/141355
Institution: Nanyang Technological University
id sg-ntu-dr.10356-141355
record_format dspace
spelling sg-ntu-dr.10356-141355; 2023-07-07T18:38:57Z; Urban sound analysis and synthesis using artificial intelligence; Guo, Zixun; Gan Woon Seng; School of Electrical and Electronic Engineering; Smart Nation TRANS Lab; Information Communication Institute of Singapore; Furi Andi Karnapi; EWSGAN@ntu.edu.sg, furi@ntu.edu.sg; Engineering::Electrical and electronic engineering; Bachelor of Engineering (Electrical and Electronic Engineering); 2020-06-08T02:17:26Z; 2020-06-08T02:17:26Z; 2020; Final Year Project (FYP); https://hdl.handle.net/10356/141355; en; A3090-191; application/pdf; Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
spellingShingle Engineering::Electrical and electronic engineering
Guo, Zixun
Urban sound analysis and synthesis using artificial intelligence
description With the advent of artificial intelligence and machine learning, many industries have been transformed. For example, convolutional neural networks have drastically changed how computers capture features from images and video, a field known as computer vision. In the audio domain, artificial intelligence is widely used in areas such as sound classification and speech-to-text conversion. In this work, I focus mainly on the use of artificial intelligence in urban sound analysis and processing, where it has been shown to perform much better than conventional methods. Unlike images or videos, analog sound has to be sampled and quantized before it can be stored in digital format. Only digital sound is considered in this work, since neural networks operate on digital values. Digital sound has its own set of features, such as sampling frequency and bit depth, and various research works have also used frequency-domain features such as bandwidth. One important feature, the sampling frequency, is normally 8 kHz or higher, so one second of audio contains at least several thousand discrete values, which complicates audio processing. To process such large numbers of samples sequentially, this work focuses on recurrent neural networks, a type of network with its own memory mechanism that can handle long-term dependencies. I address two topics: audio captioning and audio synthesis. First, AI-based captioning is already widely used in computer vision, and audio captioning would help people with hearing difficulties perceive sound information. Second, audio data collection can be time-consuming and costly; by learning audio patterns and inter-dependencies, sound synthesis could generate sound more efficiently.
author2 Gan Woon Seng
author_facet Gan Woon Seng
Guo, Zixun
format Final Year Project
author Guo, Zixun
author_sort Guo, Zixun
title Urban sound analysis and synthesis using artificial intelligence
title_short Urban sound analysis and synthesis using artificial intelligence
title_full Urban sound analysis and synthesis using artificial intelligence
title_fullStr Urban sound analysis and synthesis using artificial intelligence
title_full_unstemmed Urban sound analysis and synthesis using artificial intelligence
title_sort urban sound analysis and synthesis using artificial intelligence
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/141355
_version_ 1772825764953063424
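
The abstract's remark that analog sound must be sampled and quantized, and that a sampling frequency of 8 kHz already yields thousands of values per second, can be illustrated with a minimal sketch. This is not taken from the project itself: the function name `sample_and_quantize`, the 440 Hz test tone, and the 16-bit depth are assumptions chosen only for illustration.

```python
import math

def sample_and_quantize(freq_hz, duration_s, sample_rate_hz, bit_depth):
    """Sample a sine tone and quantize it to signed integer levels.

    Hypothetical helper for illustration; the sine wave stands in for
    an analog signal.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    max_level = 2 ** (bit_depth - 1) - 1  # 32767 for 16-bit audio
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # sampling: discrete time instants
        x = math.sin(2.0 * math.pi * freq_hz * t)   # "analog" amplitude in [-1, 1]
        samples.append(int(round(x * max_level)))   # quantization: nearest integer level
    return samples

# One second of a 440 Hz tone at 8 kHz, 16 bits per sample (assumed values).
one_second = sample_and_quantize(440.0, 1.0, 8000, 16)
print(len(one_second))  # number of discrete values in one second of audio
```

At 8 kHz this gives 8000 values for a single second, which suggests why sequence models such as recurrent neural networks have to cope with very long inputs when they work on raw audio.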