Transformers as feature extractors in emotion-based music visualization
Cross-modal similarity learning revolves around the feature embeddings of the target modalities. With advancements in deep neural networks, feature extraction has become increasingly sophisticated. Convolutional Neural Networks (CNNs) and Residual Networks (ResNets) have proven to perform great...
Main Author: | Sim, Clodia Xin Ni |
---|---|
Other Authors: | Alexei Sourin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/175170 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175170 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-175170; 2024-04-26T15:41:26Z; Transformers as feature extractors in emotion-based music visualization; Sim, Clodia Xin Ni; Alexei Sourin; School of Computer Science and Engineering; assourin@ntu.edu.sg; Computer and Information Science; Bachelor's degree; 2024-04-22T07:48:23Z; 2024-04-22T07:48:23Z; 2024; Final Year Project (FYP); Sim, C. X. N. (2024). Transformers as feature extractors in emotion-based music visualization. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175170; https://hdl.handle.net/10356/175170; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
description |
Cross-modal similarity learning revolves around the feature embeddings of the target
modalities. With advancements in deep neural networks, feature extraction has become
increasingly sophisticated. Convolutional Neural Networks (CNNs) and Residual Networks
(ResNets) have proven to be effective feature extractors in both computer vision and
music analysis, both of which are crucial to music visualization. However, the
emergence of transformers raises the question of whether such networks are still the best
choice for these tasks.
This project first explores existing work on music visualization, and then studies the use of
emotion dimensions such as valence and arousal to quantify emotions. It also explores how
audio signals and spectrograms can be used to analyse the emotions evoked by a piece of
music. Ultimately, this project proposes using transformers as feature extractors, leading
to better music visualizations through cross-modal similarity learning. The
experiments conducted showed that transformers perform better than state-of-the-art
approaches. |
author2 |
Alexei Sourin |
author_facet |
Alexei Sourin; Sim, Clodia Xin Ni |
format |
Final Year Project |
author |
Sim, Clodia Xin Ni |
title |
Transformers as feature extractors in emotion-based music visualization |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175170 |