Ensemble of pruned models for low-complexity acoustic scene classification

For the DCASE 2020 Challenge, the focus of Task 1B is to develop low-complexity models for classification of 3 different types of acoustic scenes, which have potential applications in resource-scarce edge devices deployed in a large-scale acoustic network. In this paper, we present the training meth...

Full description

Saved in:
Bibliographic Details
Main Authors: Ooi, Kenneth, Peksi, Santi, Gan, Woon-Seng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language:English
Published: 2021
Subjects:
Online Access:https://hdl.handle.net/10356/148327
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:For the DCASE 2020 Challenge, the focus of Task 1B is to develop low-complexity models for classification of 3 different types of acoustic scenes, which have potential applications in resource-scarce edge devices deployed in a large-scale acoustic network. In this paper, we present the training methodology for our submissions for the challenge, with the best-performing system consisting of an ensemble of VGGNet- and Inception-Net-based lightweight classification models. The subsystems in the ensemble classifier were pruned by setting low-magnitude weights periodically to zero with a polynomial decay schedule to achieve an 80% reduction in individual subsystem size. The resultant ensemble classifier outperformed the baseline model on the validation set over 10 runs and had 119758 non-zero parameters taking up 468KB of memory. This shows the efficacy of the pruning technique used. We also performed experiments to compare the performance of various data augmentation schemes, input feature representations, and model architectures in our training methodology. No external data was used, and source code for the submission can be found at https://github.com/kenowr/DCASE-2020-Task-1B.