Acoustic and video processing demonstration system

An acoustic and video processing demonstration system was developed. Direction of Arrival (DOA) estimates, or sound source locations, were overlaid on the application's video stream in real time. The application implements the Steered Response Power (SRP) with Phase Transform (PHAT) or SRP-PH...

Bibliographic Details
Main Author: Lim, Scott Nathaniel
Other Authors: Andy Khong Wai Hoong
Format: Final Year Project
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/74906
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74906
record_format dspace
spelling sg-ntu-dr.10356-74906; 2023-07-07T15:58:49Z; Acoustic and video processing demonstration system; Lim, Scott Nathaniel; Andy Khong Wai Hoong; School of Electrical and Electronic Engineering; DRNTU::Engineering; Bachelor of Engineering; 2018-05-24T09:18:54Z; 2018-05-24T09:18:54Z; 2018; Final Year Project (FYP); http://hdl.handle.net/10356/74906; en; Nanyang Technological University; 74 p.; application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering
description An acoustic and video processing demonstration system was developed. Direction of Arrival (DOA) estimates, or sound source locations, were overlaid on the application's video stream in real time. The application implements the Steered Response Power with Phase Transform (SRP-PHAT) algorithm to generate power values from 40 channels of audio data. In general, the global maximum of these power values corresponds to the DOA. This technique was chosen for its compatibility with Generalised Cross Correlation (GCC) and its resilience to background noise. However, searching the SRP-PHAT power values takes a long time and tends to settle on undesired local maxima. Thus, the Stochastic Region Contraction (SRC) method was employed for its performance improvements and its effectiveness in finding the global maximum of the power values. The algorithm was adapted from a MATLAB source, but due to software restrictions of the 40-channel microphone array, the MATLAB code had to be converted to C++. This conversion was done manually. The converted code was found to be less reliable in providing accurate DOA results. Comparing Cosine Similarity Indexes (CSI) against a true DOA output showed that the converted code produced a weak CSI of 57%, whereas the MATLAB code had a high CSI of 87%. The low CSI of the converted code in determining true DOAs pointed to conversion errors. Hence, it was recommended that an automatic converter be used to avoid such errors.
author2 Andy Khong Wai Hoong
format Final Year Project
author Lim, Scott Nathaniel
title Acoustic and video processing demonstration system
publishDate 2018
url http://hdl.handle.net/10356/74906
_version_ 1772828621605437440
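
The description notes that SRP-PHAT is compatible with Generalised Cross Correlation (GCC). As a rough illustration of that building block, the sketch below estimates the delay between two microphone signals with PHAT-weighted GCC; SRP-PHAT can be viewed as combining such pairwise correlations over all microphone pairs for each candidate direction. This is not the project's MATLAB or C++ code: the function names (`dft`, `gcc_phat_delay`), the test signals, and the use of a naive DFT in place of an FFT are all assumptions made for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (hypothetical helper, no FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def gcc_phat_delay(a, b):
    """Return the lag in samples by which signal b is delayed relative to a."""
    n = len(a) + len(b)  # zero-pad so the circular correlation does not wrap
    spec_a = dft(list(a) + [0.0] * (n - len(a)))
    spec_b = dft(list(b) + [0.0] * (n - len(b)))
    weighted = []
    for ak, bk in zip(spec_a, spec_b):
        cross = bk * ak.conjugate()  # cross-power spectrum
        mag = abs(cross)
        # PHAT weighting: keep only the phase, discard the magnitude
        weighted.append(cross / mag if mag > 1e-12 else 0j)
    correlation = [v.real for v in dft(weighted, inverse=True)]
    peak = max(range(n), key=lambda i: correlation[i])
    # indices in the upper half represent negative lags
    return peak if peak < n // 2 else peak - n


# Illustrative signals (assumed): the second is the first delayed by 3 samples.
reference = [0.0, 1.0, 0.6, -0.3, 0.2, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 0.6, -0.3, 0.2]
print(gcc_phat_delay(reference, delayed))
print(gcc_phat_delay(delayed, reference))
```

The PHAT weighting discards spectral magnitude so that the correlation peak depends only on phase, which is one common explanation for its robustness to noise. A real-time system with 40 channels would use an FFT rather than the naive transform shown here.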