Enhancing performance in video grounding tasks through the use of attention module

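This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and evaluated it on several widely used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos.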

Bibliographic Details
Main Author:	Do Duc Anh
Other Authors:	Sun Aixin
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science
Online Access:	https://hdl.handle.net/10356/181703
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-181703
record_format	dspace
spelling	sg-ntu-dr.10356-181703 2024-12-16T01:26:13Z Do Duc Anh Sun Aixin College of Computing and Data Science AXSun@ntu.edu.sg Computer and Information Science Bachelor's degree 2024 Final Year Project (FYP) Do Duc Anh (2024). Enhancing performance in video grounding tasks through the use of attention module. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181703 en application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science
description	This report investigates improving video grounding tasks through the use of attention mechanisms, tackling the issue of sparse annotations in video datasets. Drawing inspiration from the MMN model \cite{wang2021_negative_2dmap}, we developed a modified model based on the open-source MMN codebase and evaluated it on several widely used datasets, including Charades-STA and ActivityNet Captions. Our approach shows improvements over certain benchmarks. Additionally, we conducted an in-depth analysis to assess the role of attention in enhancing the multimodal framework's ability to comprehend the complex structure of videos.
author2	Sun Aixin
format	Final Year Project
author	Do Duc Anh
title	Enhancing performance in video grounding tasks through the use of attention module
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/181703
_version_	1819113015852662784
