Scene understanding based on visual and text data

Bibliographic Details
Main Author: Ong, Randal Ren Tai
Other Authors: Mao Kezhi
Format: Final Year Project
Language:English
Published: 2019
Online Access:http://hdl.handle.net/10356/77701
Institution: Nanyang Technological University
Description
Summary: In this project, the concept of scene understanding with visual and text data will be applied to the task of video captioning, and its effectiveness will be evaluated. A new dataset of videos and their accompanying captions, intended as an improvement over current datasets, will be collected. This dataset will then be put through a baseline model for training and analysis. The output metrics and captions will be observed and recorded to gauge whether the metric scores correlate with human judgement and whether the generated captions are accurate. Possible reasons for the observed accuracy will also be analysed, and directions for future work proposed.