EEG-video emotion-based summarization: Learning with EEG auxiliary signals

Video summarization is the process of selecting a subset of informative keyframes to expedite storytelling with limited loss of information. In this paper, we propose an EEG-Video Emotion-based Summarization (EVES) model based on a multimodal deep reinforcement learning (DRL) architecture that leverages neural signals to learn visual interestingness and thereby produce quantitatively and qualitatively better video summaries. As such, EVES learns from multimodal signals rather than expensive human annotations. Furthermore, to ensure temporal alignment and to minimize the modality gap between the visual and EEG modalities, we introduce a Time Synchronization Module (TSM) that uses an attention mechanism to transform the EEG representations onto the visual representation space. We evaluate the performance of EVES on the TVSum and SumMe datasets. Based on rank order statistics benchmarks, the experimental results show that EVES outperforms unsupervised models and narrows the performance gap with supervised models. Furthermore, human evaluation scores show that EVES receives a higher rating than the state-of-the-art DRL model DR-DSN by 11.4% on coherency of the content and 7.4% on emotion-evoking content. Thus, our work demonstrates the potential of EVES in selecting interesting content that is both coherent and emotion-evoking.
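The abstract describes the TSM as an attention mechanism that transforms EEG representations onto the visual representation space. The following is a minimal cross-modal attention sketch in PyTorch, assuming visual features act as queries and EEG features as keys and values; the class name, feature dimensions, and this query/key assignment are illustrative assumptions, not details taken from the paper.

# Illustrative sketch of cross-modal attention in the spirit of the Time
# Synchronization Module (TSM) described in the abstract. All names and
# dimensions are assumptions for illustration, not the authors' implementation.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, visual_dim=1024, eeg_dim=310, d_model=256, n_heads=4):
        super().__init__()
        # Project both modalities into a shared attention space (assumed sizes).
        self.visual_proj = nn.Linear(visual_dim, d_model)
        self.eeg_proj = nn.Linear(eeg_dim, d_model)
        # Visual frames act as queries; EEG segments act as keys/values, so the
        # output is an EEG-derived representation aligned per video frame.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, visual, eeg):
        # visual: (batch, n_frames, visual_dim); eeg: (batch, n_segments, eeg_dim)
        q = self.visual_proj(visual)
        kv = self.eeg_proj(eeg)
        aligned_eeg, _ = self.attn(q, kv, kv)  # (batch, n_frames, d_model)
        return aligned_eeg

# Toy usage with random tensors standing in for frame and EEG features.
if __name__ == "__main__":
    tsm = CrossModalAttention()
    frames = torch.randn(2, 30, 1024)  # 30 video frames per clip
    eeg = torch.randn(2, 50, 310)      # 50 EEG segments per clip
    print(tsm(frames, eeg).shape)      # torch.Size([2, 30, 256])

With this arrangement the module outputs one EEG-derived vector per video frame, which keeps the two streams temporally aligned at the frame level, consistent with the alignment goal the abstract attributes to the TSM.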

Bibliographic Details
Main Authors: LEW, Wai-Cheong L., WANG, Di, ANG, Kai-Keng, LIM, Joo-Hwee, QUEK, Chai, TAN, Ah-hwee
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects: Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/7567
DOI: 10.1109/TAFFC.2022.3208259
Institution: Singapore Management University