Enhancing performance in video grounding tasks through the use of captions
This report explores how generated captions can improve video grounding, addressing the challenge posed by sparse annotations in video datasets. We took inspiration from the PCNet model, which uses caption-guided attention to fuse the captions generated by Parallel Dynamic Video Captioni...
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175356 |
Institution: | Nanyang Technological University |