Acoustic and video processing demonstration system

An acoustic and video processing demonstration system was developed. Direction of Arrival (DOA) estimates, or sound source locations, were overlaid on the application's video stream in real time. The application implements the Steered Response Power (SRP) with Phase Transform (PHAT) or SRP-PH...

Bibliographic Details
Main Author:	Lim, Scott Nathaniel
Other Authors:	Andy Khong Wai Hoong
Format:	Final Year Project
Language:	English
Published:	2018
Subjects:	DRNTU::Engineering
Online Access:	http://hdl.handle.net/10356/74906
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-74906
record_format	dspace
spelling	sg-ntu-dr.10356-74906 2023-07-07T15:58:49Z Acoustic and video processing demonstration system Lim, Scott Nathaniel Andy Khong Wai Hoong School of Electrical and Electronic Engineering DRNTU::Engineering Bachelor of Engineering 2018-05-24T09:18:54Z 2018-05-24T09:18:54Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74906 en Nanyang Technological University 74 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering
spellingShingle	DRNTU::Engineering Lim, Scott Nathaniel Acoustic and video processing demonstration system
description	An acoustic and video processing demonstration system was developed. Direction of Arrival (DOA) estimates, or sound source locations, were overlaid on the application's video stream in real time. The application implements the Steered Response Power with Phase Transform (SRP-PHAT) algorithm to generate power values from 40 channels of audio data. In general, the global maximum of these power values corresponds to the DOA. This technique was chosen for its compatibility with Generalised Cross Correlation (GCC) and its resilience to background noise. However, searching the SRP-PHAT power values is slow and tends to settle on undesired local maxima. The Stochastic Region Contraction (SRC) method was therefore employed for its faster processing and its effectiveness in finding the global maximum of the power values. The algorithm was adapted from a MATLAB source, but software restrictions of the 40-channel microphone array required the MATLAB code to be converted to C++, which was done manually. The converted code proved less reliable in producing accurate DOA results. A comparison of Cosine Similarity Indexes (CSI) against a true DOA output showed that the converted code achieved a weak CSI of 57%, whereas the MATLAB code achieved a high CSI of 87%. The low CSI of the converted code indicated the presence of conversion errors. It was therefore recommended that an automatic converter be used to avoid such errors.
author2	Andy Khong Wai Hoong
author_facet	Andy Khong Wai Hoong Lim, Scott Nathaniel
format	Final Year Project
author	Lim, Scott Nathaniel
author_sort	Lim, Scott Nathaniel
title	Acoustic and video processing demonstration system
title_sort	acoustic and video processing demonstration system
publishDate	2018
url	http://hdl.handle.net/10356/74906
_version_	1772828621605437440
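
The search step named in the description can be illustrated with a minimal Python sketch. It is not the project's MATLAB or C++ code. The power map `toy_power`, its peak positions, the sample counts, and the helper names are all assumptions made for illustration, and the contraction loop is a simplified form of Stochastic Region Contraction that resamples the box on every pass. The cosine similarity at the end is one plausible way to compare an estimated DOA with a true one; the record does not state how the project computed its CSI figures.

```python
import math
import random


def toy_power(az_deg, el_deg):
    """Hypothetical stand-in for an SRP-PHAT power map (not the project code)."""
    # Global peak at (40, 10) degrees and a weaker local peak at (-60, 30).
    main = math.exp(-((az_deg - 40.0) ** 2 + (el_deg - 10.0) ** 2) / 200.0)
    side = 0.4 * math.exp(-((az_deg + 60.0) ** 2 + (el_deg - 30.0) ** 2) / 200.0)
    return main + side


def src_search(power, az_range, el_range, n_samples=400, n_keep=40,
               n_iters=20, seed=0):
    """Simplified SRC: sample the box, keep the best points, shrink the box."""
    rng = random.Random(seed)
    az_lo, az_hi = az_range
    el_lo, el_hi = el_range
    best = (-math.inf, 0.0, 0.0)  # (power, azimuth, elevation)
    for _ in range(n_iters):
        scored = []
        for _ in range(n_samples):
            az = rng.uniform(az_lo, az_hi)
            el = rng.uniform(el_lo, el_hi)
            scored.append((power(az, el), az, el))
        scored.sort(reverse=True)
        top = scored[:n_keep]
        best = max(best, top[0])
        # Contract the search region to the bounding box of the best points.
        az_lo, az_hi = min(p[1] for p in top), max(p[1] for p in top)
        el_lo, el_hi = min(p[2] for p in top), max(p[2] for p in top)
    return best[1], best[2]


def unit_vector(az_deg, el_deg):
    """Convert azimuth and elevation in degrees to a 3-D unit direction vector."""
    az, el = math.radians(az_deg), math.radians(el_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


az_est, el_est = src_search(toy_power, (-90.0, 90.0), (0.0, 90.0))
similarity = cosine_similarity(unit_vector(az_est, el_est),
                               unit_vector(40.0, 10.0))
print(f"estimated DOA: az={az_est:.1f} deg, el={el_est:.1f} deg, "
      f"cosine similarity to true DOA={similarity:.3f}")
```

Because each pass evaluates only a fixed number of random points instead of a full grid, the cost stays bounded, and keeping several of the best points per pass (rather than one) is what lets the box retain the global peak while the weaker local peak drops out.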

Similar Items