3D audio reproduction : natural augmented reality headset and next generation entertainment system using wave field synthesis

Bibliographic Details
Main Author: Ranjan, Rishabh
Other Authors: Gan Woon Seng
Format: Theses and Dissertations
Language: English
Published: 2016
Online Access: https://hdl.handle.net/10356/67317
Institution: Nanyang Technological University
Description
Summary: Sound plays an important role in our day-to-day activities. We rely on it for interacting, listening to music, watching movies at home or in cinemas, playing video games, video conferencing, and more. The main purpose of 3D audio reproduction is to emulate a natural listening experience for the user via playback devices. Sound can be reproduced at the listener's ears over either headphones or loudspeakers (or loudspeaker arrays). However, rendering sound for playback over headphones is very different from rendering it for loudspeakers. It is important for both playback methods to reproduce sounds faithfully so that the listener has a natural listening experience. Headphones are mainly used for private listening, while loudspeakers (or loudspeaker arrays) are meant for shared listening among a group of listeners. This thesis covers both headphone-based and loudspeaker-based reproduction, with emphasis on augmented and virtual reality applications, respectively. The first part of the thesis investigates natural listening over headphones in augmented reality using adaptive filtering techniques. We developed a natural augmented reality (NAR) headset with two pairs of binaural microphones attached to open headphones (one internal and one external microphone on each side). This work focuses on enabling natural listening via adaptive equalization of the headset, so that virtual sounds are reproduced as perceptually close to real sounds as possible in any listening environment while the listener remains aware of external sound sources. The key objective is to minimize large localization errors (front-back confusions), in-head localization, and timbre differences between virtual and real sounds.
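The adaptive equalization idea can be illustrated with a conventional filtered-x normalized least mean square (FxNLMS) loop. The following is only a minimal sketch of the textbook algorithm, not the modified version developed in the thesis. All names are assumptions made for illustration: `x` is the source signal, `d` the desired signal at the in-ear microphone, `s` the headphone-to-microphone path, `s_hat` its estimate, and `mu` the step size.

```python
import numpy as np

def fxnlms(x, d, s, s_hat, n_taps, mu=0.2, eps=1e-8):
    """Textbook FxNLMS: adapt w so that s * (w * x) approaches d."""
    w = np.zeros(n_taps)            # adaptive equalizer coefficients
    x_buf = np.zeros(n_taps)        # recent reference samples, newest first
    xf_buf = np.zeros(n_taps)       # recent filtered-reference samples
    xs_buf = np.zeros(len(s_hat))   # reference history fed to s_hat
    y_buf = np.zeros(len(s))        # equalizer output history fed to s
    err = np.zeros(len(x))
    for n in range(len(x)):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = x[n]
        xs_buf = np.roll(xs_buf, 1)
        xs_buf[0] = x[n]
        # Equalizer output, then what the microphone picks up after path s.
        y_buf = np.roll(y_buf, 1)
        y_buf[0] = w @ x_buf
        e = d[n] - s @ y_buf
        # Reference filtered through the path estimate (the "filtered-x").
        xf_buf = np.roll(xf_buf, 1)
        xf_buf[0] = s_hat @ xs_buf
        # Normalized LMS update driven by the filtered reference.
        w += mu * e * xf_buf / (xf_buf @ xf_buf + eps)
        err[n] = e
    return w, err
```

In a simulation, `d` could be generated by passing white noise through a known target filter and the path `s`; the returned `err` then shows how quickly the residual decays for a given `mu`.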
A modified adaptive filtering approach based on the filtered-x normalized least mean square (FxNLMS) algorithm is proposed to adapt the headphone-synthesized signals to sound exactly like physical sounds, while equalizing for the individual headphone response. The adaptive equalization is further extended to the case where external sounds are also present. Results show that the proposed adaptive algorithm approaches the desired response with minimum mean square error and converges faster than the conventional FxNLMS algorithm. The proposed method is found to be equally effective in the presence of external sounds. A subjective test using individualized binaural room-related impulse responses shows that listeners could not distinguish between real and virtual sounds most of the time. Next, we focus on spatial sound reproduction in a home entertainment scenario using multi-channel loudspeaker setups. Spatial sound systems aim to create a realistic sound experience for listeners with a uniform sound field across the entire listening area. Conventional surround sound systems, which are the most widely used home theater systems, are based on multi-channel stereophony, such as 5.1, 10.2, and higher channel-count surround systems. These systems require multiple loudspeakers to be placed in a fixed configuration, but placement is often constrained by room size, and the best impression is achieved only at the sweet spot. Sound field reproduction systems such as wave field synthesis (WFS) are based on the principle of natural propagation of sound waves and hence can create a replica of the true sound field uniformly over an extended listening area. WFS virtual sources are localized much more accurately than stereophonic phantom sources. However, WFS-based systems require hundreds of densely spaced loudspeakers enclosing the listening area and are thus difficult to realize in homes.
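The need for dense loudspeaker spacing can be made concrete with the commonly used worst-case estimate of the spatial aliasing frequency, f ≈ c / (2Δx), where Δx is the loudspeaker spacing. The sketch below is an illustration under that assumption; the function names, the speed of sound of 343 m/s, and the example room size are not taken from the thesis.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def spatial_aliasing_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Worst-case estimate of the frequency above which a discrete array aliases."""
    return c / (2.0 * spacing_m)

def loudspeakers_needed(perimeter_m, f_max_hz, c=SPEED_OF_SOUND):
    """Approximate loudspeaker count to enclose a perimeter alias-free up to f_max_hz."""
    spacing = c / (2.0 * f_max_hz)   # largest spacing that satisfies the estimate
    return math.ceil(perimeter_m / spacing)

# Hypothetical example: a 4 m x 4 m listening area (16 m perimeter), alias-free up to 4 kHz.
count = loudspeakers_needed(16.0, 4000.0)
```

Under this estimate, even a modest upper frequency limit pushes the count into the hundreds for a room-sized enclosure, which matches the practical difficulty described above.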
In addition, practical approximations of WFS, such as finite, discrete, and line arrays of loudspeakers, limit the performance of WFS through a reduced listening area, sound coloration, and reproduction in the horizontal plane only. Therefore, a combination of WFS and binaural synthesis over the NAR headset is proposed to overcome the practical and physical limitations of WFS. The proposed hybrid system uses a frontal loudspeaker array driven by WFS, which provides strong frontal localization cues, while the rear and side auditory scene is played back via the NAR headset using virtual WFS to complete a full 360° auditory scene. Furthermore, virtual WFS over headphones helps minimize sound coloration above the spatial aliasing frequency of the physical array by using a virtual, densely spaced loudspeaker array. Both objective and subjective experiments are carried out to evaluate the performance of the proposed setup. In particular, a detailed subjective study investigates the performance of the proposed hybrid system with regard to sound localization and sound coloration. Finally, a fast and efficient real-time GPU-based implementation of WFS is presented to increase system throughput by exploiting the massive parallelism inherent in a WFS system comprising hundreds of densely spaced loudspeakers. The main goal of this work is to develop a real-time, high-throughput, scalable platform for the hybrid WFS setup, which needs multiple loudspeaker driving signals as well as binaural signals synthesized with virtual WFS at the same time. To summarize, this thesis aims to reproduce natural listening over headphones for personal listening in augmented reality environments and to create an immersive listening experience for users with WFS in home scenarios.
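The parallelism mentioned in the summary comes from the fact that each loudspeaker's driving signal can be computed independently of the others. The sketch below illustrates this with a heavily simplified delay-and-gain model for a virtual point source behind a line array placed along the x-axis. It is not the thesis implementation: the gain weighting is one simplified choice, the WFS pre-equalization filter and array tapering are omitted, and all function and parameter names are assumptions.

```python
import numpy as np

def wfs_delays_and_gains(speaker_xy, source_xy, c=343.0):
    """Per-loudspeaker delay (s) and gain for a virtual point source.

    Assumes a line array along x with its normal pointing in +y and the
    virtual source located behind the array (smaller y than the speakers).
    """
    diff = np.asarray(speaker_xy, dtype=float) - np.asarray(source_xy, dtype=float)
    r = np.linalg.norm(diff, axis=1)      # source-to-loudspeaker distances
    delays = r / c                        # propagation delay to each loudspeaker
    cos_theta = diff[:, 1] / r            # angle between array normal and source direction
    gains = cos_theta / np.sqrt(r)        # simplified amplitude weighting (assumption)
    return delays, gains

def render_driving_signals(signal, delays, gains, fs):
    """Delay and scale one source signal for every loudspeaker (integer-sample delays)."""
    shifts = np.round(delays * fs).astype(int)
    out = np.zeros((len(delays), len(signal) + shifts.max()))
    for i, (k, g) in enumerate(zip(shifts, gains)):
        # Each row depends only on its own delay and gain, so rows can be
        # computed in parallel, for example one GPU thread block per loudspeaker.
        out[i, k:k + len(signal)] = g * signal
    return out
```

Because no row of the output depends on any other row, the loop over loudspeakers maps naturally onto parallel hardware, and the same structure applies to a virtual array rendered binaurally.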