Neural image and video captioning (NIVC)


Bibliographic Details
Main Author: Lee, Jeremy Kian Kiat
Other Authors: Zhang Hanwang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156511
Institution: Nanyang Technological University
Description
Abstract: A common problem linking computer vision and natural language processing is generating an accurate caption for a given image. Researchers have spent decades trying to advance the state of the art in image captioning. In this paper, various image captioning approaches aimed at state-of-the-art results are studied. The best approaches are then extracted and recombined into a single new model in the hope of achieving a new state of the art. Furthermore, this paper proposes a sharing platform that lets users apply the resulting prediction model in a real-world use case. Live captioning is proposed, using the InceptionV4 model to provide a description of an image. The platform takes the form of a mobile application equipped with functionality to caption an image and share it on a free platform where individuals can exchange their ideas.