Enhancing performance in video grounding tasks through the use of attention module
This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and...
Main Author: | Do Duc Anh |
---|---|
Other Authors: | Sun Aixin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science |
Online Access: | https://hdl.handle.net/10356/181703 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181703 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181703; 2024-12-16T01:26:13Z; Enhancing performance in video grounding tasks through the use of attention module; Do Duc Anh; Sun Aixin; College of Computing and Data Science; AXSun@ntu.edu.sg; Computer and Information Science; This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and evaluated it on several widely used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos.; Bachelor's degree; 2024-12-16T01:26:13Z; 2024-12-16T01:26:13Z; 2024; Final Year Project (FYP); Do Duc Anh (2024). Enhancing performance in video grounding tasks through the use of attention module. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181703; https://hdl.handle.net/10356/181703; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science |
description |
This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and evaluated it on several widely used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos. |
author2 |
Sun Aixin |
author_facet |
Sun Aixin Do Duc Anh |
format |
Final Year Project |
author |
Do Duc Anh |
title |
Enhancing performance in video grounding tasks through the use of attention module |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181703 |
_version_ |
1819113015852662784 |