Visual relationship detection

Current scene graph generation (SGG) models struggle to achieve accurate and effective visual relationship detection between objects in images because of severely biased training datasets. For instance, biased SGG models often predict trivial and uninformative relationships such as “on” over more descriptive ones like “running on”, or “by” instead of “walking by”. Debiasing SGG, however, presents its own challenges owing to the long-tailed biases, bounded rationality, and language or reporting biases present during training. This paper presents an SGG framework built on Total Direct Effect (TDE) analysis from causal inference and compares it against a conventional causal-effect framework based on Total Effect (TE) analysis. Both frameworks construct factual causal graphs through conventional biased training; the TDE models then apply counterfactual intervention on the trained graphs to remove the harmful biases, and the TE or TDE is computed to predict the predicates for the respective frameworks. Thorough analysis and evaluation of the proposed framework show that it outperforms conventional SGG methods in object and relationship prediction accuracy across all the relationship retrieval tasks tested. This research thereby contributes the proposed framework to the field of visual relationship detection.
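For readers unfamiliar with the causal-effect terminology above, a minimal sketch of the two quantities in standard counterfactual notation (the symbols are assumed here for illustration and are not taken from this record): let Y denote the predicate prediction, x the factual visual input, \bar{x} its wiped counterfactual input, and z = Z_x(u) the mediating context held at its factual value. Then

    TE  = Y_x(u) - Y_{\bar{x}}(u)
    TDE = Y_x(u) - Y_{\bar{x}, z}(u)

so the TDE intervenes only on the direct visual input while keeping the mediator at its factual value, subtracting the context-induced (biased) prediction from the factual one at inference time.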


Bibliographic Details
Main Author: Lee, Xavier Eugene
Other Authors: Hanwang Zhang
Format: Final Year Project
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2024
School: School of Computer Science and Engineering
Degree: Bachelor's degree
Project Code: SCSE23-0213
Subjects: Computer and Information Science
Citation: Lee, X. E. (2024). Visual relationship detection. Final Year Project (FYP), Nanyang Technological University, Singapore.
Online Access: https://hdl.handle.net/10356/175285
Institution: Nanyang Technological University