AimigoTutor - tutoring application using multi-modal capabilities

Video captioning has been an up-and-coming research topic. Thanks to the recent advances in the performance of deep neural networks, especially with transformers, video captioning is seeing a huge potential improvement in accuracy and versatility. Most state-of-the-art video captioning models employ...

Bibliographic Details
Main Author:	Nguyen, Viet Hoang
Other Authors:	Hanwang Zhang
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science Multi-modal
Online Access:	https://hdl.handle.net/10356/175732
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-175732
record_format	dspace
spelling	sg-ntu-dr.10356-1757322024-05-10T15:40:40Z AimigoTutor - tutoring application using multi-modal capabilities Nguyen, Viet Hoang Hanwang Zhang School of Computer Science and Engineering hanwangzhang@ntu.edu.sg Computer and Information Science Multi-modal Video captioning has been an up-and-coming research topic. Thanks to the recent advances in the performance of deep neural networks, especially with transformers, video captioning is seeing a huge potential improvement in accuracy and versatility. Most state-of-the-art video captioning models employ a multi-modal approach, whereby both the visual information of the video frames and the audio information of the video are used to extract the semantic meaning of the video. This project will explore the capability of multi-modal video captioning in a much-needed context: building a video tutoring application for students, called AimigoTutor. This report will discuss the requirements, design, implementation and evaluation of the application. Bachelor's degree 2024-05-06T01:46:25Z 2024-05-06T01:46:25Z 2024 Final Year Project (FYP) Nguyen, V. H. (2024). AimigoTutor - tutoring application using multi-modal capabilities. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175732 https://hdl.handle.net/10356/175732 en SCSE23-0209 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science Multi-modal
spellingShingle	Computer and Information Science Multi-modal Nguyen, Viet Hoang AimigoTutor - tutoring application using multi-modal capabilities
description	Video captioning has been an up-and-coming research topic. Thanks to the recent advances in the performance of deep neural networks, especially with transformers, video captioning is seeing a huge potential improvement in accuracy and versatility. Most state-of-the-art video captioning models employ a multi-modal approach, whereby both the visual information of the video frames and the audio information of the video are used to extract the semantic meaning of the video. This project will explore the capability of multi-modal video captioning in a much-needed context: building a video tutoring application for students, called AimigoTutor. This report will discuss the requirements, design, implementation and evaluation of the application.
author2	Hanwang Zhang
author_facet	Hanwang Zhang Nguyen, Viet Hoang
format	Final Year Project
author	Nguyen, Viet Hoang
author_sort	Nguyen, Viet Hoang
title	AimigoTutor - tutoring application using multi-modal capabilities
title_short	AimigoTutor - tutoring application using multi-modal capabilities
title_full	AimigoTutor - tutoring application using multi-modal capabilities
title_fullStr	AimigoTutor - tutoring application using multi-modal capabilities
title_full_unstemmed	AimigoTutor - tutoring application using multi-modal capabilities
title_sort	aimigotutor - tutoring application using multi-modal capabilities
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/175732
_version_	1800916227021864960
