Multi modal video analysis with LLM for descriptive emotion and expression annotation
This project presents a novel approach to multi-modal emotion and action annotation by integrating facial expression recognition, action recognition, and audio-based emotion analysis into a unified framework. The system utilizes TimesFormer, OpenFace, and SpeechBrain to extract relevant features fro...
Main Author: | Fan, Yupei |
---|---|
Other Authors: | Zheng Jianmin |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Online Access: | https://hdl.handle.net/10356/180715 |
Institution: | Nanyang Technological University |
Similar Items
- GEO-REFERENCED VIDEO RETRIEVAL: TEXT ANNOTATION AND SIMILARITY SEARCH
  by: YIN YIFANG
  Published: (2016)
- RELATION UNDERSTANDING IN VIDEOS
  by: SHANG XINDI
  Published: (2021)
- Large language model (LLM) with retrieve-augmented generation (RAG) for legal case research
  by: Liu, Zihao
  Published: (2024)
- In-video product annotation with web information mining
  by: Li, G., et al.
  Published: (2013)
- Annotating Objects and Relations in User-Generated Videos
  by: Xindi Shang, et al.
  Published: (2020)