Multi-modal deception detection in videos

Deception detection has much significance as it has many real-world applications. This project focuses on the verbal and visual modalities for deception detection from videos. The experiments were conducted on the most widely used dataset: Real-Life Trial. In the project, text information, visual in...

Full description

Saved in:
Bibliographic Details
Main Author: Jin, Yibing
Other Authors: Alex Chichung Kot
Format: Final Year Project
Language:English
Published: Nanyang Technological University 2022
Subjects:
Online Access:https://hdl.handle.net/10356/158188
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Deception detection has much significance as it has many real-world applications. This project focuses on the verbal and visual modalities for deception detection from videos. The experiments were conducted on the most widely used dataset: Real-Life Trial. In the project, text information, visual information, and multimodal cues were considered individually for detecting deception. Moreover, this project used both machine learning and deep learning methods to obtain the best performance on this task. For verbal feature extraction, TF-IDF, N-Grams, and LIWC were used to transfer the text into vectors. These vectors were processed by SVM, Naïve Bayes, Random Forest, and RNN. For visual feature extraction, facial action features and gaze direction features were extracted by OpenFace. The visual features are learned by a machine learning method, SVM. This paper also used a hybrid classification model based on CNN and GRU neural networks. For the multimodal machine learning method, the features from two different modalities were concatenated together after extraction, and then the extracted features were put into SVM for classification. The experimental results suggest that the hybrid model of CNN and GRU performs best among all the methods when only information from one modality is used, and using SVM outperforms other models when two modalities are used.