A benchmark comparison of perceptual models for soundscapes on a large-scale augmented soundscape dataset
Psychoacoustic indicators and spectrogram-based features are standard inputs to perceptual models for soundscape analysis. However, existing models in the soundscape literature are trained on different collections of input parameters and frequently mutually exclusive datasets, which complicates fair comparisons of model performance and generalizability to unseen data and soundscapes.
Main Authors: Ooi, Kenneth; Watcharasupat, Karn N.; Lam, Bhan; Ong, Zhen-Ting; Gan, Woon-Seng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Subjects: Engineering::Electrical and electronic engineering; Soundscape Augmentation; Dataset; Regression; Classification; Deep Neural Networks
Online Access: https://hdl.handle.net/10356/164983 ; https://www.ica2022korea.org/data/Proceedings_A14.pdf ; https://ica2022korea.org/
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-164983
record_format: dspace
Record last modified: 2023-04-28T15:44:33Z
Conference: 24th International Congress on Acoustics (ICA 2022)
Research group: Digital Signal Processing Laboratory
Funders: Ministry of National Development (MND); National Research Foundation (NRF)
Version: Published version
Funding statement: This research is supported by the Singapore Ministry of National Development and the National Research Foundation, Prime Minister's Office under the Cities of Tomorrow Research Programme (Award No. COT-V4-2020-1). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of the National Research Foundation, Singapore, or the Ministry of National Development, Singapore.
Date accessioned/available: 2023-04-24T05:10:05Z
Date issued: 2022
Type: Conference Paper
Citation: Ooi, K., Watcharasupat, K. N., Lam, B., Ong, Z. & Gan, W. (2022). A benchmark comparison of perceptual models for soundscapes on a large-scale augmented soundscape dataset. 24th International Congress on Acoustics (ICA 2022). https://hdl.handle.net/10356/164983
Grant number: COT-V4-2020-1
DOI: 10.21979/N9/9OTEVX
Rights: © 2022 The Author(s). Published by the Acoustical Society of Korea. All rights reserved. This paper was published in Proceedings of the 24th International Congress on Acoustics (ICA 2022) and is made available with permission of The Author(s).
File format: application/pdf
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Engineering::Electrical and electronic engineering; Soundscape Augmentation; Dataset; Regression; Classification; Deep Neural Networks
description: Psychoacoustic indicators and spectrogram-based features are standard inputs to perceptual models for soundscape analysis. However, existing models in the soundscape literature are trained on different collections of input parameters and frequently mutually exclusive datasets, which complicates fair comparisons of model performance and generalizability to unseen data and soundscapes. Hence, we use the ARAUS dataset, a large-scale, publicly-available dataset of 25,440 responses to unique augmented soundscapes, as a common benchmark for comparison of the relative performance of a selection of input and model types used in previous soundscape studies, as well as deep neural network architectures commonly used for other acoustic tasks. The different model types were used in a regression task to predict “Eventfulness” ratings (as defined in ISO/TS 12913-3) given by participants as responses to the augmented soundscapes. Subsequently, we compared their performance as classification models for the classification of soundscapes into the quadrants generated by the Pleasantness-Eventfulness axes of the ISO/TS 12913-3 circumplex model. The five-fold cross-validation set of 25,200 responses and an independent test set of 240 responses making up the ARAUS dataset were used to facilitate unbiased comparisons.
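The quadrant classification described in the abstract can be made concrete with a short sketch. The snippet below is not taken from the paper: it assumes that a regression model (or a pair of models) outputs Pleasantness and Eventfulness predictions centred on zero, as in the ISO/TS 12913-3 circumplex projection, and uses the commonly cited quadrant labels (vibrant, chaotic, calm, monotonous); the thresholds, labels, and evaluation pipeline used in the study itself may differ.

```python
# Illustrative sketch only (not code from the paper): map predicted
# Pleasantness/Eventfulness values to the four circumplex quadrants.
# Assumes predictions are centred on zero, as in the ISO/TS 12913-3
# projection; quadrant labels follow the commonly used circumplex terms.
import numpy as np

def pe_to_quadrant(pleasantness: np.ndarray, eventfulness: np.ndarray) -> np.ndarray:
    """Assign each (Pleasantness, Eventfulness) pair to a circumplex quadrant."""
    quadrants = np.empty(pleasantness.shape, dtype=object)
    quadrants[(pleasantness >= 0) & (eventfulness >= 0)] = "vibrant"     # P+, E+
    quadrants[(pleasantness < 0) & (eventfulness >= 0)] = "chaotic"      # P-, E+
    quadrants[(pleasantness >= 0) & (eventfulness < 0)] = "calm"         # P+, E-
    quadrants[(pleasantness < 0) & (eventfulness < 0)] = "monotonous"    # P-, E-
    return quadrants

# Hypothetical regression outputs for four augmented soundscapes
p_pred = np.array([0.4, -0.2, 0.1, -0.6])
e_pred = np.array([0.3, 0.5, -0.7, -0.1])
print(pe_to_quadrant(p_pred, e_pred))
# -> ['vibrant' 'chaotic' 'calm' 'monotonous']
```

Classification metrics can then be computed by comparing quadrant labels derived from model predictions against quadrant labels derived from participants' actual ratings.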
author2: School of Electrical and Electronic Engineering
format: Conference or Workshop Item
author: Ooi, Kenneth; Watcharasupat, Karn N.; Lam, Bhan; Ong, Zhen-Ting; Gan, Woon-Seng
author_sort: Ooi, Kenneth
title: A benchmark comparison of perceptual models for soundscapes on a large-scale augmented soundscape dataset
publishDate: 2023
url: https://hdl.handle.net/10356/164983 ; https://www.ica2022korea.org/data/Proceedings_A14.pdf ; https://ica2022korea.org/
_version_: 1765213830558777344