Scene understanding based on visual and text data
In this project, scene understanding based on visual and text data will be applied to the task of video captioning, and its effectiveness will be evaluated. A new dataset of videos and their accompanying captions, intended as an improvement over existing datasets, will be collected and then used to train and evaluate a baseline model. The resulting metric scores and generated captions will be observed and recorded to gauge whether the scores correlate with human judgement and how accurate the captions are. Possible reasons for the observed accuracy will also be analysed, and directions for future work proposed.
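The abstract describes scoring generated captions with automatic metrics and checking whether the numbers track human judgement. As an illustration only (the record does not name the metrics, models, or data the project used), a minimal sketch of that kind of check might pair a sentence-level BLEU score with a rank correlation against hypothetical human ratings:

```python
# Sketch: compare an automatic caption metric (BLEU, assumed here) against
# hypothetical human ratings. All captions and scores below are made up.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from scipy.stats import spearmanr

references = [
    ["a man is playing a guitar on stage"],
    ["a dog runs across a grassy field"],
    ["two people are cooking in a kitchen"],
]
candidates = [
    "a man plays a guitar on stage",
    "a cat sits on a sofa",
    "people are cooking food in a kitchen",
]
human_scores = [4.5, 1.0, 4.0]  # e.g. ratings on a 1-5 scale

# Smoothing avoids zero BLEU when higher-order n-grams do not match.
smooth = SmoothingFunction().method1
bleu_scores = [
    sentence_bleu([r.split() for r in refs], cand.split(), smoothing_function=smooth)
    for refs, cand in zip(references, candidates)
]

# Rank correlation between the automatic metric and human judgement.
rho, _ = spearmanr(bleu_scores, human_scores)
print("BLEU per caption:", [round(b, 3) for b in bleu_scores])
print(f"Spearman correlation with human judgement: {rho:.2f}")
```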
Saved in:
Main Author: | Ong, Randal Ren Tai
---|---
Other Authors: | Mao Kezhi
Format: | Final Year Project
Language: | English
Published: | 2019
Subjects: | DRNTU::Engineering::Electrical and electronic engineering::Computer hardware, software and systems
Online Access: | http://hdl.handle.net/10356/77701
Institution: | Nanyang Technological University
School: | School of Electrical and Electronic Engineering
Degree: | Bachelor of Engineering (Electrical and Electronic Engineering)
Issued: | 2019-06-04
Physical Description: | 32 p.; application/pdf
Collection: | DR-NTU (NTU Library)