From an image to a text description of an image

This project presents an implementation of a search feature that allows a user to look for a particular object of interest in a video. The main idea is to train a very deep neural network architecture that outputs a sequence of words that describe an image. The network consists of a convolutional neur...

Bibliographic Details
Main Author:	Peter
Other Authors:	Chng Eng Siong
Format:	Final Year Project
Language:	English
Published:	2017
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	http://hdl.handle.net/10356/70223
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-70223
record_format	dspace
spelling	sg-ntu-dr.10356-70223 2023-03-03T20:27:54Z From an image to a text description of an image Peter Chng Eng Siong School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering This project presents an implementation of a search feature that allows a user to look for a particular object of interest in a video. The main idea is to train a very deep neural network architecture that outputs a sequence of words that describe an image. The network consists of a convolutional neural network (CNN) that learns features found in an image, and a long short-term memory (LSTM) unit that predicts the sequence of words from the learnt features of the image. This project does not address real-time object detection; instead, a video must be preprocessed before a user can search for an object that appears in it. Bachelor of Engineering (Computer Science) 2017-04-17T07:22:49Z 2017-04-17T07:22:49Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/70223 en Nanyang Technological University 59 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering
spellingShingle	DRNTU::Engineering::Computer science and engineering Peter From an image to a text description of an image
description	This project presents an implementation of a search feature that allows a user to look for a particular object of interest in a video. The main idea is to train a very deep neural network architecture that outputs a sequence of words that describe an image. The network consists of a convolutional neural network (CNN) that learns features found in an image, and a long short-term memory (LSTM) unit that predicts the sequence of words from the learnt features of the image. This project does not address real-time object detection; instead, a video must be preprocessed before a user can search for an object that appears in it.
author2	Chng Eng Siong
author_facet	Chng Eng Siong Peter
format	Final Year Project
author	Peter
author_sort	Peter
title	From an image to a text description of an image
title_short	From an image to a text description of an image
title_full	From an image to a text description of an image
title_fullStr	From an image to a text description of an image
title_full_unstemmed	From an image to a text description of an image
title_sort	from an image to a text description of an image
publishDate	2017
url	http://hdl.handle.net/10356/70223
_version_	1759858272626540544


Similar Items