HARMONIZATION OF WORD AND MELODY: INDONESIAN SONG LYRICS ALIGNMENT USING PHONEME REPRESENTATION METHOD

Song lyrics play a crucial role in music, providing deep meaning and emotion to listeners. However, aligning lyrics with the rhythm of music is a significant challenge. This study focuses on developing a model for aligning Indonesian song lyrics using artificial intelligence approaches. This st...

Bibliographic Details
Main Author:	Stefanus
Format:	Theses
Language:	Indonesia
Online Access:	https://digilib.itb.ac.id/gdl/view/85320
Institution:	Institut Teknologi Bandung

id	id-itb.:85320
spelling	id-itb.:85320; 2024-08-20T10:21:46Z; HARMONIZATION OF WORD AND MELODY: INDONESIAN SONG LYRICS ALIGNMENT USING PHONEME REPRESENTATION METHOD; Stefanus; Indonesia; Theses; lyric alignment, music rhythm, forced alignment, speech processing, Indonesian language, artificial intelligence; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/85320; text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
description	Song lyrics play a crucial role in music, conveying meaning and emotion to listeners. However, aligning lyrics with the rhythm of the music is a significant challenge. This study develops a model for aligning Indonesian song lyrics using artificial intelligence. It adopts forced alignment, a technique that places phonemes, words, or phrases onto a corresponding timeline and is widely used to align automatic speech recognition output with audio. Its application to music and to the Indonesian language, however, remains very limited. The study therefore explores how speech processing technology can be used to align lyric text with the rhythm of Indonesian songs. The research proceeds in several stages, from web scraping to collect a dataset of Indonesian songs to applying the SEMMA methodology (Sample, Explore, Modify, Model, Assess) to develop the forced alignment model. The results indicate that the proposed approach, which combines phoneme translation and transfer learning with a Hidden Markov Model - Gaussian Mixture Model (HMM-GMM), outperforms commonly used forced alignment models such as NeMo Forced Aligner (NFA) and Massively Multilingual Speech – Forced Alignment (MMS-FA). The proposed model achieved a Mean Absolute Error (MAE) of 947.86 milliseconds on average and a Segment Error Rate (SER) of 0.0016 (about 0.16%). It thus aligns Indonesian song lyrics more accurately than the NFA model (MAE = 1742.46 milliseconds, SER = 0.0740) and the MMS-FA model (MAE = 1945.82 milliseconds, SER = 0.1609).
format	Theses
author	Stefanus
spellingShingle	Stefanus HARMONIZATION OF WORD AND MELODY: INDONESIAN SONG LYRICS ALIGNMENT USING PHONEME REPRESENTATION METHOD
author_facet	Stefanus
author_sort	Stefanus
title	HARMONIZATION OF WORD AND MELODY: INDONESIAN SONG LYRICS ALIGNMENT USING PHONEME REPRESENTATION METHOD
title_sort	harmonization of word and melody: indonesian song lyrics alignment using phoneme representation method
url	https://digilib.itb.ac.id/gdl/view/85320
_version_	1822999130973143040


Similar Items