DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION

In recent years, models have been developed in the field of music source separation (MSS). The current state-of-the-art models are Hybrid Transformer Demucs (HT Demucs) and Band-Split RNN (BSRNN). Recent research shows that the pre-trained HT Demucs model can separate six sources (drums, bass,...

Bibliographic Details
Main Author:	Kalang Al Qalyubi, Ken
Format:	Final Project
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/85018
Institution:	Institut Teknologi Bandung

id	id-itb.:85018
spelling	id-itb.:85018 2024-08-19T13:18:04Z DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION Kalang Al Qalyubi, Ken Indonesia Final Project MSS, BSRNN, HT Demucs, 6 stems, MoisesDB INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/85018 text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	In recent years, various models have been developed in the field of music source separation (MSS). The current state-of-the-art models are Hybrid Transformer Demucs (HT Demucs) and Band-Split RNN (BSRNN). Recent research shows that the pre-trained HT Demucs model can separate six sources (drums, bass, guitar, piano, vocals, and other) when tested on the MoisesDB dataset, but that it scores relatively low on guitar, piano, and other compared with bass, drums, and vocals, as measured by the utterance-level Signal-to-Distortion Ratio (uSDR). However, no research has yet demonstrated the performance of the BSRNN model on these six sources. This thesis investigates the performance of the BSRNN and HT Demucs models in six-source separation. To that end, BSRNN and HT Demucs models were developed for six-source separation using the MoisesDB dataset, then evaluated and analyzed to determine which performs better. Experimental results show that HT Demucs outperforms BSRNN on all sources: HT Demucs achieved average utterance-level SDR (uSDR) and chunk-level SDR (cSDR) scores of 6.26 dB and 5.88 dB respectively, while BSRNN achieved 5.52 dB and 5.38 dB. Additionally, the trained HT Demucs model outperformed the pre-trained HT Demucs model on the piano and other sources by 1 dB and 0.3 dB respectively.
format	Final Project
author	Kalang Al Qalyubi, Ken
spellingShingle	Kalang Al Qalyubi, Ken DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
author_facet	Kalang Al Qalyubi, Ken
author_sort	Kalang Al Qalyubi, Ken
title	DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
title_short	DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
title_full	DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
title_fullStr	DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
title_full_unstemmed	DEVELOPMENT OF BAND-SPLIT RNN AND HYBRID TRANSFORMER DEMUCS FOR MUSIC SOURCE SEPARATION
title_sort	development of band-split rnn and hybrid transformer demucs for music source separation
url	https://digilib.itb.ac.id/gdl/view/85018
_version_	1823657349179506688

Similar Items