Music generation with deep learning techniques

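This project studies various models for deep learning music generation, and investigates a novel approach to music generation that utilises image content as an input. We use Gemini, a large language model, to generate textual captions describing the image's content and emotional tone. These captions are then fed into the existing MusicGen framework, traditionally designed for text-based music generation. While our evaluation shows promise, with generated music thematically and emotionally aligned with the corresponding image, the melodic structure remains a challenge. This suggests potential limitations in using plain text captions as input for MusicGen. Our findings pave the way for further exploration of alternative representations that could directly translate image features into musical elements. This could involve delving into image processing techniques or developing specialised music generation models that handle image data more effectively. Overall, this project demonstrates the potential of image-based music generation and highlights the need for future research in this exciting area.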

Bibliographic Details
Main Author: Chong, Yi An
Other Authors: Alexei Sourin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
AI
Online Access: https://hdl.handle.net/10356/175381
Institution: Nanyang Technological University
id sg-ntu-dr.10356-175381
record_format dspace
spelling sg-ntu-dr.10356-175381 2024-04-26T15:44:59Z
Music generation with deep learning techniques
Chong, Yi An
Alexei Sourin
School of Computer Science and Engineering
assourin@ntu.edu.sg
Computer and Information Science
Engineering
AI
Deep learning
Music
Bachelor's degree
2024-04-23T23:31:19Z
2024
Final Year Project (FYP)
Chong, Y. A. (2024). Music generation with deep learning techniques. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175381
en
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Engineering
AI
Deep learning
Music
description This project studies various models for deep learning music generation, and investigates a novel approach to music generation that utilises image content as an input. We use Gemini, a large language model, to generate textual captions describing the image's content and emotional tone. These captions are then fed into the existing MusicGen framework, traditionally designed for text-based music generation. While our evaluation shows promise, with generated music thematically and emotionally aligned with the corresponding image, the melodic structure remains a challenge. This suggests potential limitations in using plain text captions as input for MusicGen. Our findings pave the way for further exploration of alternative representations that could directly translate image features into musical elements. This could involve delving into image processing techniques or developing specialised music generation models that handle image data more effectively. Overall, this project demonstrates the potential of image-based music generation and highlights the need for future research in this exciting area.
author2 Alexei Sourin
author_facet Alexei Sourin
Chong, Yi An
format Final Year Project
author Chong, Yi An
author_sort Chong, Yi An
title Music generation with deep learning techniques
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/175381
_version_ 1800916120326111232