Learning to compose and reason with language tree structures for visual grounding

Grounding natural language in images, such as localizing "the black dog on the left of the tree", is one of the core problems in artificial intelligence, because it requires comprehending fine-grained language compositions. However, existing solutions merely rely on the association between the...

Bibliographic Details
Main Authors: Hong, Richang; Liu, Daqing; Mo, Xiaoyu; He, Xiangnan; Zhang, Hanwang
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/162632
Institution: Nanyang Technological University