Supervised contrastive pretrained ResNet with MixUp to enhance respiratory sound classification on imbalanced and limited dataset

Bibliographic Details
Main Authors: Hu, Jinhai; Leow, Cong Sheng; Tao, Shuailin; Goh, Wang Ling; Gao, Yuan
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/179114
https://ieeexplore.ieee.org/abstract/document/10389029
Institution: Nanyang Technological University
Description
Summary: This paper proposes a strategy that combines multiple techniques to classify paediatric respiratory sounds (PRS) from the Open-Source SJTU Paediatric Respiratory Sound Database. Inspired by recent successes in image classification, this work focuses on improving audio classification on limited and imbalanced datasets using Residual Networks (ResNet). The techniques are augmentations applied to audio features, supervised contrastive (SupCon) pretraining, and MixUp. Together, these three techniques helped reduce overfitting caused by the imbalanced dataset. To further enhance accuracy, the pre-processing and training hyperparameters were optimized through Bayesian optimization. The proposed strategy achieved training accuracies above 95% on the four tasks (11, 12, 21, and 22) of the IEEE BioCAS 2023 Grand Challenge. With this strategy, the four tasks achieved calculated scores of 0.769, 0.632, 0.662, and 0.512, respectively, on the test dataset. The total score is 0.729, including 0.1 from the runtime bonus.
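
The summary names MixUp without describing it. As a rough illustration only, the sketch below shows the general MixUp idea: two training examples and their one-hot labels are blended with a single weight drawn from a Beta distribution. It is not the authors' implementation. The function name `mixup_pair`, the `alpha` value of 0.2, and the use of flat feature lists are assumptions made for this example; the paper's actual features, ResNet model, and hyperparameters are not given in this record.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two examples and their one-hot labels with one shared weight.

    x_a, x_b: flat feature vectors of equal length (hypothetical inputs).
    y_a, y_b: one-hot label vectors of equal length.
    alpha:    Beta distribution parameter (assumed value, not from the paper).
    """
    # Mixing weight lam lies in [0, 1]; small alpha pushes it toward 0 or 1.
    lam = rng.betavariate(alpha, alpha)
    # The same weight is applied to the features and to the labels.
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y, lam


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    mixed_x, mixed_y, lam = mixup_pair(
        [1.0, 2.0], [1.0, 0.0],  # example from class 0
        [3.0, 4.0], [0.0, 1.0],  # example from class 1
        rng=rng,
    )
    print(lam, mixed_x, mixed_y)
```

Because the mixed label is a soft blend rather than a single class, a minority-class example contributes to many blended training targets, which is one commonly cited reason MixUp is used against overfitting on small or imbalanced datasets.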