Scene understanding based on heterogeneous data fusion

Solving the visual translation problem has always been a major task of artificial intelligence. The problem has advanced with the significant progress that deep neural networks have made in static image understanding (H. X. Subhashini Venugopalan 2015). When moving to dynamic scenes such as video data, the i...

Bibliographic Details
Main Author: Ren, Haosu
Other Authors: Mao Kezhi
Format: Final Year Project
Language: English
Published: 2018
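Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence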
Online Access: http://hdl.handle.net/10356/75215
Institution: Nanyang Technological University
id sg-ntu-dr.10356-75215
record_format dspace
spelling sg-ntu-dr.10356-75215 2023-07-07T16:08:39Z Scene understanding based on heterogeneous data fusion Ren, Haosu Mao Kezhi School of Electrical and Electronic Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Engineering 2018-05-30T03:55:20Z 2018-05-30T03:55:20Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/75215 en Nanyang Technological University 53 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
description Solving the visual translation problem has always been a major task of artificial intelligence. The problem has advanced with the significant progress that deep neural networks have made in static image understanding (H. X. Subhashini Venugopalan 2015). When moving to dynamic scenes such as video data, the information is enriched with not only static images but also temporal motion and acoustic signals. Effective video scene understanding will also help audit the massive volume of video uploaded today. Therefore, how to extract and fuse these heterogeneous data has become a new challenge in helping machines understand a scene. In this project, we implemented the classical video captioning network structure and discussed various approaches to fusing heterogeneous data, aiming to generate a comprehensive sentence that describes a video. Finally, we compared the different fusion methods on the descriptive sentences they generate for videos.
author2 Mao Kezhi
author_facet Mao Kezhi
Ren, Haosu
format Final Year Project
author Ren, Haosu
author_sort Ren, Haosu
title Scene understanding based on heterogeneous data fusion
publishDate 2018
url http://hdl.handle.net/10356/75215
_version_ 1772826988401131520