Neural image and video captioning (NIVC)

Bibliographic Details
Main Author: Lee, Jeremy Kian Kiat
Other Authors: Zhang Hanwang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156511
Institution: Nanyang Technological University
Description
Summary: Generating an accurate caption for a given image is a common problem linking computer vision and natural language processing, and researchers have spent decades trying to advance the state of the art in image captioning. This paper studies various image captioning approaches aimed at achieving state-of-the-art results. The best approaches are then extracted and recombined into a single new model, in the hope of achieving a new state of the art. Furthermore, the paper proposes a sharing platform that allows users to apply the resulting prediction model in a real-world use case. Live captioning is proposed, using the InceptionV4 model to provide a description of an image. The platform takes the form of a mobile application equipped with functionality to caption an image and share the inspiration on a free platform where individuals can exchange their ideas.