Neural image and video captioning (NIVC)


Bibliographic Details
Main Author: Lee, Jeremy Kian Kiat
Other Authors: Zhang Hanwang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156511
Summary: A common problem linking computer vision and natural language processing is generating an accurate caption for a given image. Researchers have spent decades trying to advance the state of the art in image captioning. In this paper, various image captioning approaches that aim at state-of-the-art results are studied. The best of these approaches are then extracted and recombined into a single new model, in the hope of achieving a new state of the art. Furthermore, this paper proposes a sharing platform that lets users apply the resulting prediction model in a real-world use case. Live captioning is proposed, using the InceptionV4 model to provide a description of an image. The platform takes the form of a mobile application and provides functionality to caption an image and share it on a free platform where individuals can exchange ideas.