Multi modal video analysis with LLM for descriptive emotion and expression annotation

This project presents a novel approach to multi-modal emotion and action annotation by integrating facial expression recognition, action recognition, and audio-based emotion analysis into a unified framework. The system utilizes TimesFormer, OpenFace, and SpeechBrain to extract relevant features fro...

Bibliographic Details
Main Author:	Fan, Yupei
Other Authors:	Zheng Jianmin
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science; Video understanding; Large language model (LLM); Multimodal analysis; Feature extraction; Deep learning; Emotion annotation
Online Access:	https://hdl.handle.net/10356/180715
Institution:	Nanyang Technological University
