Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation
Soundscape augmentation, which involves the addition of sounds known as “maskers” to a given soundscape, is a human-centric urban noise mitigation measure aimed at improving the overall soundscape quality. However, the choice of maskers is often predicated on laborious processes and is inflexible to...
Main Authors: | Ooi, Kenenth; Watcharasupat, Karn N.; Lam, Bhan; Ong, Zhen-Ting; Gan, Woon-Seng |
---|---|
Other Authors: | School of Electrical and Electronic Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/158000 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-158000 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1580002022-05-16T08:01:50Z Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation Ooi, Kenenth Watcharasupat, Karn N. Lam, Bhan Ong, Zhen-Ting Gan, Woon-Seng School of Electrical and Electronic Engineering 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022) Engineering::Electrical and electronic engineering Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Science::Physics::Acoustics Social sciences::Psychology::Applied psychology Soundscape Neural Attention Soundscape Augmentation Deep Learning Probabilistic Model Soundscape augmentation, which involves the addition of sounds known as “maskers” to a given soundscape, is a human-centric urban noise mitigation measure aimed at improving the overall soundscape quality. However, the choice of maskers is often predicated on laborious processes and is inflexible to the time-varying nature of real-world soundscapes. Owing to the perceptual uniqueness of each soundscape and the inherent subjectiveness of human perception, we propose a probabilistic perceptual attribute predictor (PPAP) that predicts parameters of random distributions as outputs instead of a single deterministic value. Using the PPAP, we developed a novel automatic masker selection system (AMSS), which selects optimal masker candidates based on the predicted distribution of the ISO 12913-3 Pleasantness score for a given soundscape. Via a large-scale listening test with 300 participants, we collected 12600 subjective responses, each to a unique augmented soundscape, to train the PPAP models in a 5-fold cross-validation scheme. 
Using a convolutional recurrent neural network backbone and experimenting with several variants of the attention mechanism for the PPAP, we evaluated the proposed system on a blind test set with 48 unseen augmented soundscapes to assess the effectiveness of the probabilistic output scheme over traditional deterministic systems. Ministry of National Development (MND) National Research Foundation (NRF) Submitted/Accepted version This research is supported by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister’s Office under the Cities of Tomorrow Research Programme (Award No. COT-V4-2020-1), and the Google Cloud Research Credits Program (GCP205231017). 2022-05-16T08:01:50Z 2022-05-16T08:01:50Z 2022 Conference Paper Ooi, K., Watcharasupat, K. N., Lam, B., Ong, Z. & Gan, W. (2022). Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation. 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2022), 8887-8891. https://dx.doi.org/10.1109/ICASSP43922.2022.9746897 978-1-6654-0540-9 2379-190X https://hdl.handle.net/10356/158000 10.1109/ICASSP43922.2022.9746897 8887 8891 en CoT-V4-2020-1 GCP205231017 10.21979/N9/YSJQKD © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/ICASSP43922.2022.9746897. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Electrical and electronic engineering Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Science::Physics::Acoustics Social sciences::Psychology::Applied psychology Soundscape Neural Attention Soundscape Augmentation Deep Learning Probabilistic Model |
spellingShingle |
Engineering::Electrical and electronic engineering Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Science::Physics::Acoustics Social sciences::Psychology::Applied psychology Soundscape Neural Attention Soundscape Augmentation Deep Learning Probabilistic Model Ooi, Kenenth Watcharasupat, Karn N. Lam, Bhan Ong, Zhen-Ting Gan, Woon-Seng Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
description |
Soundscape augmentation, which involves the addition of sounds known as “maskers” to a given soundscape, is a human-centric urban noise mitigation measure aimed at improving the overall soundscape quality. However, the choice of maskers is often predicated on laborious processes and is inflexible to the time-varying nature of real-world soundscapes. Owing to the perceptual uniqueness of each soundscape and the inherent subjectiveness of human perception, we propose a probabilistic perceptual attribute predictor (PPAP) that predicts parameters of random distributions as outputs instead of a single deterministic value. Using the PPAP, we developed a novel automatic masker selection system (AMSS), which selects optimal masker candidates based on the predicted distribution of the ISO 12913-3 Pleasantness score for a given soundscape. Via a large-scale listening test with 300 participants, we collected 12600 subjective responses, each to a unique augmented soundscape, to train the PPAP models in a 5-fold cross-validation scheme. Using a convolutional recurrent neural network backbone and experimenting with several variants of the attention mechanism for the PPAP, we evaluated the proposed system on a blind test set with 48 unseen augmented soundscapes to assess the effectiveness of the probabilistic output scheme over traditional deterministic systems. |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering Ooi, Kenenth Watcharasupat, Karn N. Lam, Bhan Ong, Zhen-Ting Gan, Woon-Seng |
format |
Conference or Workshop Item |
author |
Ooi, Kenenth Watcharasupat, Karn N. Lam, Bhan Ong, Zhen-Ting Gan, Woon-Seng |
author_sort |
Ooi, Kenenth |
title |
Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
title_short |
Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
title_full |
Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
title_fullStr |
Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
title_full_unstemmed |
Probably pleasant? A neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
title_sort |
probably pleasant? a neural-probabilistic approach to automatic masker selection for urban soundscape augmentation |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/158000 |
_version_ |
1734310150876954624 |
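
The description field in this record says the AMSS selects masker candidates from a predicted distribution of the Pleasantness score rather than from a single point estimate. The record does not state the distribution family or the selection rule, so the following is only a minimal sketch under stated assumptions: each candidate's predicted Pleasantness is taken to be Gaussian with a mean and standard deviation, and candidates are ranked by the probability of exceeding a threshold. The names `PredictedPleasantness`, `prob_exceeds`, and `select_maskers`, the masker labels, and all numbers are hypothetical and are not taken from the paper.

```python
import math
from typing import List, NamedTuple


class PredictedPleasantness(NamedTuple):
    """Hypothetical container for one candidate's predicted distribution."""
    masker: str   # illustrative masker label, not from the paper
    mean: float   # predicted mean of the Pleasantness score
    std: float    # predicted standard deviation (must be > 0)


def prob_exceeds(pred: PredictedPleasantness, threshold: float) -> float:
    """P(X > threshold) for X ~ Normal(mean, std^2). Gaussian is an assumption."""
    z = (threshold - pred.mean) / (pred.std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def select_maskers(
    preds: List[PredictedPleasantness],
    threshold: float = 0.0,   # assumed cut-off for "pleasant"
    top_k: int = 1,
) -> List[PredictedPleasantness]:
    """Rank candidates by probability of exceeding the threshold (assumed rule)."""
    ranked = sorted(preds, key=lambda p: prob_exceeds(p, threshold), reverse=True)
    return ranked[:top_k]


# Made-up predictions for three candidate maskers.
candidates = [
    PredictedPleasantness("bird", mean=0.30, std=0.10),
    PredictedPleasantness("water", mean=0.35, std=0.60),
    PredictedPleasantness("traffic", mean=-0.20, std=0.20),
]

best = select_maskers(candidates, threshold=0.0, top_k=2)
for p in best:
    print(p.masker, round(prob_exceeds(p, 0.0), 3))
```

In this made-up example the "water" candidate has the highest predicted mean, yet "bird" ranks first because its narrower predicted spread makes a pleasant outcome more likely. A ranking on the mean alone would not capture that difference, which is one way a probabilistic output could change the selection.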