TRRNet : tiered relation reasoning for compositional visual question answering

Compositional visual question answering requires reasoning over both semantic and geometry object relations. We propose a novel tiered reasoning method that dynamically selects object level candidates based on language representations and generates robust pairwise relations within the selected candi...

Bibliographic Details
Main Authors: Yang, Xiaofeng, Lin, Guosheng, Lv, Fengmao, Liu, Fayao
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Visual Question Answering; Visual Reasoning
Online Access: https://hdl.handle.net/10356/144262
Institution: Nanyang Technological University
id sg-ntu-dr.10356-144262
record_format dspace
Conference: European Conference on Computer Vision (ECCV) 2020
Funders: AI Singapore; Ministry of Education (MOE); National Research Foundation (NRF)
Version: Accepted version
Acknowledgement: This research was supported by the National Research Foundation Singapore under its AI Singapore Programme (Award Number: AISG-RP-2018-003) and the MOE Tier-1 research grants: RG28/18 (S) and RG22/19 (S). F. Lv’s participation is supported by National Natural Science Foundation of China (No. 11829101 and 11931014). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore.
Record timestamp: 2020-10-26T01:24:47Z
Type: Conference Paper
Citation: Yang, X., Lin, G., Lv, F., & Liu, F. (2020). TRRNet : tiered relation reasoning for compositional visual question answering. European Conference on Computer Vision (ECCV) 2020.
Handle: https://hdl.handle.net/10356/144262
Grant numbers: AISG-RP-2018-003; RG28/18 (S); RG22/19 (S)
Rights: © 2020 Springer Nature Switzerland AG. All rights reserved. This paper was published in European Conference on Computer Vision (ECCV) 2020 and is made available with permission of Springer Nature Switzerland AG.
File format: application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Visual Question Answering
Visual Reasoning
description Compositional visual question answering requires reasoning over both semantic and geometry object relations. We propose a novel tiered reasoning method that dynamically selects object level candidates based on language representations and generates robust pairwise relations within the selected candidate objects. The proposed tiered relation reasoning method can be compatible with the majority of the existing visual reasoning frameworks, leading to significant performance improvement with very little extra computational cost. Moreover, we propose a policy network that decides the appropriate reasoning steps based on question complexity and current reasoning status. In experiments, our model achieves state-of-the-art performance on two VQA datasets.
author2 School of Computer Science and Engineering
format Conference or Workshop Item
author Yang, Xiaofeng
Lin, Guosheng
Lv, Fengmao
Liu, Fayao
title TRRNet : tiered relation reasoning for compositional visual question answering
publishDate 2020
url https://hdl.handle.net/10356/144262