Multi modal video analysis with LLM for descriptive emotion and expression annotation

This project presents a novel approach to multi-modal emotion and action annotation by integrating facial expression recognition, action recognition, and audio-based emotion analysis into a unified framework. The system uses TimeSformer, OpenFace, and SpeechBrain to extract relevant features fro...

Bibliographic Details
Main Author:	Fan, Yupei
Other Authors:	Zheng Jianmin
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science; Video understanding; Large language model (LLM); Multimodal analysis; Feature extraction; Deep learning; Emotion annotation
Online Access:	https://hdl.handle.net/10356/180715

Similar Items