Towards unbiased visual emotion recognition via causal intervention

Although much progress has been made in visual emotion recognition, researchers have realized that modern deep networks tend to exploit dataset characteristics to learn spurious statistical associations between the input and the target. Such dataset characteristics are usually treated as dataset bias, which damages the robustness and generalization performance of these recognition systems. In this work, we scrutinize this problem from the perspective of causal inference, where such a dataset characteristic is termed a confounder that misleads the system into learning the spurious correlation. To alleviate the negative effects brought by dataset bias, we propose a novel Interventional Emotion Recognition Network (IERN) to achieve backdoor adjustment, a fundamental deconfounding technique in causal inference. Specifically, IERN starts by disentangling the dataset-related context feature from the actual emotion feature, where the former forms the confounder. The emotion feature is then forced to see each confounder stratum equally before being fed into the classifier. A series of designed tests validate the efficacy of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms state-of-the-art approaches for unbiased visual emotion recognition.
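
For context, the backdoor adjustment referenced in the abstract is a standard identity in causal inference. A minimal statement in LaTeX, using X for the input image, Y for the emotion label, and Z for the dataset-context confounder (notation chosen here for illustration; the paper's own formulation may differ):

% Backdoor adjustment: intervening on X severs the confounding path Z -> X,
% so the causal effect is recovered by averaging over the strata of Z.
\[
  P\bigl(Y \mid \mathrm{do}(X)\bigr) \;=\; \sum_{z} P(Y \mid X,\, Z = z)\, P(Z = z)
\]
% Each stratum z is weighted by its prior P(Z = z) rather than the
% input-dependent P(Z = z | X), which removes the spurious correlation.

Because the stratum weights no longer depend on the input, this is the sense in which the emotion feature is made to "see each confounder stratum equally" before classification.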

Bibliographic Details
Main Authors: Chen, Yuedong, Yang, Xu, Cham, Tat-Jen, Cai, Jianfei
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Causal Intervention; Backdoor Adjustment
Online Access: https://hdl.handle.net/10356/172660
Institution: Nanyang Technological University
id sg-ntu-dr.10356-172660
record_format dspace
spelling sg-ntu-dr.10356-172660 2023-12-19T05:10:38Z
title Towards unbiased visual emotion recognition via causal intervention
authors Chen, Yuedong; Yang, Xu; Cham, Tat-Jen; Cai, Jianfei
affiliation School of Computer Science and Engineering
conference 30th ACM International Conference on Multimedia (MM 2022)
subjects Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Causal Intervention; Backdoor Adjustment
funding This study is supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from Singapore Telecommunications Limited (Singtel), through the Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU). This research is also partially supported by a FIT Start-up Grant.
date_accessioned 2023-12-19T05:10:38Z
date_available 2023-12-19T05:10:38Z
date_issued 2022
type Conference Paper
citation Chen, Y., Yang, X., Cham, T. & Cai, J. (2022). Towards unbiased visual emotion recognition via causal intervention. 30th ACM International Conference on Multimedia (MM 2022), October 2022, 60-69. https://dx.doi.org/10.1145/3503161.3547936
isbn 9781450392037
uri https://hdl.handle.net/10356/172660
doi 10.1145/3503161.3547936
scopus 2-s2.0-85148333769
conference_date October 2022
pages 60-69
language en
grant IAF-ICP
rights © 2022 Association for Computing Machinery. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Causal Intervention
Backdoor Adjustment
description Although much progress has been made in visual emotion recognition, researchers have realized that modern deep networks tend to exploit dataset characteristics to learn spurious statistical associations between the input and the target. Such dataset characteristics are usually treated as dataset bias, which damages the robustness and generalization performance of these recognition systems. In this work, we scrutinize this problem from the perspective of causal inference, where such a dataset characteristic is termed a confounder that misleads the system into learning the spurious correlation. To alleviate the negative effects brought by dataset bias, we propose a novel Interventional Emotion Recognition Network (IERN) to achieve backdoor adjustment, a fundamental deconfounding technique in causal inference. Specifically, IERN starts by disentangling the dataset-related context feature from the actual emotion feature, where the former forms the confounder. The emotion feature is then forced to see each confounder stratum equally before being fed into the classifier. A series of designed tests validate the efficacy of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms state-of-the-art approaches for unbiased visual emotion recognition.
author2 School of Computer Science and Engineering
format Conference or Workshop Item
author Chen, Yuedong
Yang, Xu
Cham, Tat-Jen
Cai, Jianfei
author_sort Chen, Yuedong
title Towards unbiased visual emotion recognition via causal intervention
publishDate 2023
url https://hdl.handle.net/10356/172660
_version_ 1787136722810699776