Music generation with deep learning techniques
This project studies various models for deep learning music generation, and investigates a novel approach to music generation that utilises image content as an input. We use Gemini, a large language model, to generate textual captions describing the image's content and emotional tone. These cap...
Main Author: | Chong, Yi An |
---|---|
Other Authors: | Alexei Sourin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Engineering; AI; Deep learning; Music |
Online Access: | https://hdl.handle.net/10356/175381 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-175381 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-175381; 2024-04-26T15:44:59Z; Music generation with deep learning techniques; Chong, Yi An; Alexei Sourin; School of Computer Science and Engineering; assourin@ntu.edu.sg; Computer and Information Science Engineering AI Deep learning Music; Bachelor's degree; 2024-04-23T23:31:19Z; 2024; Final Year Project (FYP); Chong, Y. A. (2024). Music generation with deep learning techniques. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175381; en; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Computer and Information Science Engineering AI Deep learning Music |
spellingShingle | Computer and Information Science Engineering AI Deep learning Music; Chong, Yi An; Music generation with deep learning techniques |
description | This project studies various models for deep learning music generation, and investigates a novel approach to music generation that utilises image content as an input. We use Gemini, a large language model, to generate textual captions describing the image's content and emotional tone. These captions are then fed into the existing MusicGen framework, traditionally designed for text-based music generation. While our evaluation shows promise, with generated music thematically and emotionally aligned with the corresponding image, the melodic structure remains a challenge. This suggests potential limitations in using plain text captions as input for MusicGen. Our findings pave the way for further exploration of alternative representations that could directly translate image features into musical elements. This could involve delving into image processing techniques or developing specialised music generation models that handle image data more effectively. Overall, this project demonstrates the potential of image-based music generation and highlights the need for future research in this exciting area. |
author2 | Alexei Sourin |
author_facet | Alexei Sourin; Chong, Yi An |
format | Final Year Project |
author | Chong, Yi An |
author_sort | Chong, Yi An |
title | Music generation with deep learning techniques |
title_short | Music generation with deep learning techniques |
title_full | Music generation with deep learning techniques |
title_fullStr | Music generation with deep learning techniques |
title_full_unstemmed | Music generation with deep learning techniques |
title_sort | music generation with deep learning techniques |
publisher | Nanyang Technological University |
publishDate | 2024 |
url | https://hdl.handle.net/10356/175381 |
_version_ | 1800916120326111232 |
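
The description in this record outlines a two-stage pipeline: an image is captioned by a language model (Gemini in the project), and the caption is then passed as a text prompt to MusicGen. The following is a minimal Python sketch of that data flow only, not the project's code. The class `ImageToMusicPipeline`, the two stub functions, the whitespace normalisation step, and the placeholder sample rate are all hypothetical and introduced for illustration; real Gemini and MusicGen calls would take the place of the stubs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Placeholder output rate in Hz (an assumption for this sketch).
SAMPLE_RATE = 32_000

# Hypothetical interfaces; the record does not document the project's real ones.
Captioner = Callable[[bytes], str]                    # image bytes -> caption text
MusicGenerator = Callable[[str, float], List[float]]  # (prompt, seconds) -> mono samples


@dataclass
class ImageToMusicPipeline:
    """Chains an image captioner and a text-to-music generator."""

    captioner: Captioner
    generator: MusicGenerator
    duration_s: float = 8.0

    def build_prompt(self, caption: str) -> str:
        # Collapse runs of whitespace so the generator sees a single-line prompt.
        return " ".join(caption.split())

    def run(self, image_bytes: bytes) -> Tuple[str, List[float]]:
        caption = self.captioner(image_bytes)            # stage 1: image -> text
        prompt = self.build_prompt(caption)
        audio = self.generator(prompt, self.duration_s)  # stage 2: text -> audio
        return prompt, audio


def stub_captioner(image_bytes: bytes) -> str:
    # Stand-in for a Gemini call that describes content and emotional tone.
    return "A quiet beach at sunset,\n calm and nostalgic mood"


def stub_generator(prompt: str, seconds: float) -> List[float]:
    # Stand-in for MusicGen: returns silence of the requested length.
    return [0.0] * int(seconds * SAMPLE_RATE)


pipeline = ImageToMusicPipeline(stub_captioner, stub_generator, duration_s=2.0)
prompt, audio = pipeline.run(b"fake image bytes")
print(prompt)
print(len(audio) / SAMPLE_RATE, "seconds of audio")
```

Keeping both stages behind plain callables is one possible design choice: it would let the caption stage be swapped for a different image representation, which is the direction the description suggests for future work.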