Acoustic and video processing demonstration system

Bibliographic Details
Main Author: Lim, Scott Nathaniel
Other Authors: Andy Khong Wai Hoong
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74906
Institution: Nanyang Technological University
Description
Summary: An acoustic and video processing demonstration system was developed. Direction of Arrival (DOA) estimates, or sound source locations, were overlaid on the application's video stream in real time. The application implements the Steered Response Power with Phase Transform (SRP-PHAT) algorithm to generate power values from 40 channels of audio data. In general, the global maximum of these power values corresponds to the DOA. This technique was chosen for its compatibility with Generalised Cross Correlation (GCC) and its resilience to background noise. However, searching the SRP-PHAT power values is slow and tends to settle on undesired local maxima. The Stochastic Region Contraction (SRC) method was therefore employed for its performance improvements and its effectiveness in finding the global maximum of the power values. The algorithm was adapted from a MATLAB source but, because of software restrictions of the 40-channel microphone array, the MATLAB code had to be converted to C++. The conversion was done manually. The converted code proved less reliable in producing accurate DOA results: when Cosine Similarity Indexes (CSI) were compared against a true DOA output, the converted code scored a weak 57% similarity, whereas the MATLAB code scored a high 87%. The low CSI of the converted code indicated the presence of conversion errors. It was therefore recommended that an automatic converter be used to avoid such errors.
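
The summary names GCC and the phase transform as the building blocks of SRP-PHAT. As an illustration only, the following is a minimal sketch of PHAT-weighted cross-correlation for a single microphone pair. It is not the project's MATLAB or C++ code. The function names (`dft`, `gcc_phat_delay`), the 32-sample frame, and the impulse test signals are all assumptions chosen to keep the example small and dependency-free.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative, not optimised)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig_a, sig_b):
    """Estimate the lag, in samples, by which sig_b trails sig_a.

    Both signals must have the same length; the correlation is circular.
    """
    n = len(sig_a)
    spec_a = dft(sig_a)
    spec_b = dft(sig_b)

    # Cross-power spectrum with PHAT weighting: keep only the phase.
    cross = []
    for a, b in zip(spec_a, spec_b):
        c = b * a.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)

    # Back to the lag domain; the peak marks the time difference of arrival.
    corr = [v.real for v in dft(cross, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])

    # Map the circular index to a signed lag.
    return peak if peak <= n // 2 else peak - n


# Hypothetical two-microphone frame: an impulse reaching mic_b 3 samples late.
FRAME = 32
mic_a = [0.0] * FRAME
mic_b = [0.0] * FRAME
mic_a[5] = 1.0
mic_b[8] = 1.0

delay = gcc_phat_delay(mic_a, mic_b)
```

In SRP-PHAT, values of this kind are summed over all microphone pairs for each candidate direction, and the direction with the largest sum is taken as the DOA estimate. A 40-channel array has 780 such pairs, which is one reason an exhaustive search over directions becomes slow.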