Neural image and video captioning

In today’s digital age, the proliferation of visual content has underscored the critical importance of multimedia comprehension and interpretation. Video uses images and sound to convey information. This project introduces a novel approach to video captioning, leveraging the synergies between Machin...

Bibliographic Details
Main Author:	Lam, Ting En
Other Authors:	Hanwang Zhang
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science
Online Access:	https://hdl.handle.net/10356/175286
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-175286
record_format	dspace
spelling	sg-ntu-dr.10356-1752862024-04-26T15:43:34Z Neural image and video captioning Lam, Ting En Hanwang Zhang School of Computer Science and Engineering hanwangzhang@ntu.edu.sg Computer and Information Science In today’s digital age, the proliferation of visual content has underscored the critical importance of multimedia comprehension and interpretation. Video uses images and sound to convey information. This project introduces a novel approach to video captioning, leveraging the synergies between Machine Learning, Computer Vision and Natural Language Processing to bridge the gap between human and computer understanding of visual understanding by generating descriptive captions from visual content. In this project, the effectiveness of various image captioning models is evaluated to identify optimal frameworks for textual description generation. Subsequently, a video captioning model capable of generating multimodal captions for video content is developed. The proposed image and video captioning models are evaluated using standard metrics and a human evaluation study was conducted. Additionally, the models are deployed into a user-friendly application for usage. Overall, this study seeks to improve video captioning performance and foster further advancements in this field. Bachelor's degree 2024-04-22T08:35:17Z 2024-04-22T08:35:17Z 2024 Final Year Project (FYP) Lam, T. E. (2024). Neural image and video captioning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175286 https://hdl.handle.net/10356/175286 en SCSE23-0211 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science
spellingShingle	Computer and Information Science Lam, Ting En Neural image and video captioning
description	In today’s digital age, the proliferation of visual content has underscored the critical importance of multimedia comprehension and interpretation. Video uses images and sound to convey information. This project introduces a novel approach to video captioning, leveraging the synergies between Machine Learning, Computer Vision and Natural Language Processing to bridge the gap between human and computer understanding of visual content by generating descriptive captions from it. In this project, the effectiveness of various image captioning models is evaluated to identify optimal frameworks for textual description generation. Subsequently, a video captioning model capable of generating multimodal captions for video content is developed. The proposed image and video captioning models are evaluated using standard metrics, and a human evaluation study is conducted. Additionally, the models are deployed in a user-friendly application. Overall, this study seeks to improve video captioning performance and foster further advancements in this field.
author2	Hanwang Zhang
author_facet	Hanwang Zhang Lam, Ting En
format	Final Year Project
author	Lam, Ting En
author_sort	Lam, Ting En
title	Neural image and video captioning
title_short	Neural image and video captioning
title_full	Neural image and video captioning
title_fullStr	Neural image and video captioning
title_full_unstemmed	Neural image and video captioning
title_sort	neural image and video captioning
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/175286
_version_	1814047055103918080
