Music generation with deep learning techniques
This research paper studies the development and performance of a Text-to-Music Transformer model. The main objective is to investigate the generative potential of multimodal transformation, where textual input is converted into musical scores in MIDI format. A comprehensive literature review of existing music synthesis methods forms the basis of the study. The textual dataset is created in a novel way by using CLaMP to select the top 30 textual descriptors of the music. A pre-trained RoBERTa model and Octuple tokenizers are used to process the text and the musical scores respectively. This music transformer then uses a Fast Transformer base to infuse the textual information into the generated sequences. Embeddings, linear layers, and cross-entropy loss calculations are used for all six musical attributes, with hyperparameter tuning to promote coherent and varied musical outputs. The generated music was evaluated with a musical analysis and a user study. The results verify that the transformer model can generate music that is either melodious or expressive of the textual prompt.
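The record states that cross-entropy losses are computed for all six musical attributes of the Octuple representation (which encodes each note as a tuple of attributes such as bar, position, pitch, duration, velocity, and instrument). A minimal pure-Python sketch of averaging per-attribute cross-entropy for one generation step is shown below; the function names and the toy logits are hypothetical, and the actual model would compute this over batched tensors rather than plain lists.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target_index):
    """Negative log-likelihood of the target class under softmax(logits)."""
    probs = softmax(logits)
    return -math.log(probs[target_index])

def octuple_step_loss(logits_per_attribute, targets):
    """Average the per-attribute cross-entropy losses for one step.

    `logits_per_attribute` holds one logit vector per musical attribute
    (e.g. bar, position, pitch, duration, velocity, instrument);
    `targets` holds the ground-truth token index for each attribute.
    """
    losses = [cross_entropy(l, t)
              for l, t in zip(logits_per_attribute, targets)]
    return sum(losses) / len(losses)

# Toy example: six attribute heads, each with a tiny three-token vocabulary.
logits = [[2.0, 0.1, -1.0]] * 6
targets = [0] * 6
loss = octuple_step_loss(logits, targets)
```

Because each attribute has its own vocabulary and output head, averaging (or summing) the per-head losses lets a single backward pass train all six predictions jointly.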
| Main Author: | Low, Paul Solomon Si En |
|---|---|
| Other Authors: | Alexei Sourin |
| Format: | Final Year Project |
| Language: | English |
| Published: | Nanyang Technological University, 2024 |
| Subjects: | Computer and Information Science; Deep learning; Music generation; Transformers; Music to text; Neural networks |
| Online Access: | https://hdl.handle.net/10356/175113 |
| Institution: | Nanyang Technological University |
Citation: Low, P. S. S. E. (2024). Music generation with deep learning techniques. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175113

Supervisor: Alexei Sourin, School of Computer Science and Engineering (assourin@ntu.edu.sg)

Degree: Bachelor's degree

Project code: SCSE23-0041