Enhancing performance in video grounding tasks through the use of captions
This report explores enhancing video grounding by utilizing generated captions, addressing the challenge posed by sparse annotations in video datasets. We took inspiration from the PCNet model, which uses caption-guided attention to fuse captions generated by Parallel Dynamic Video Captioni...
Main Author: Liu, Xinran
Other Authors: Sun, Aixin
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/175356
Institution: Nanyang Technological University
Similar Items
- Poster: Towards efficient spatio-temporal video grounding in pervasive mobile devices
  by: WEERAKOON MUDIYANSELAGE, Dulanga Kaveesha, et al.
  Published: (2024)
- Study of ground settlement induced by tunnelling and data enhancement for settlement monitoring using LiDAR
  by: Huang, Lan
  Published: (2024)
- A Fine-Grained Spatial-Temporal Attention Model for Video Captioning
  by: Liu, A.-A., et al.
  Published: (2021)
- Neural image and video captioning
  by: Lam, Ting En
  Published: (2024)
- Large language model enhanced with prompt-based vanilla distillation for sentence embeddings
  by: Wang, Minghao
  Published: (2024)