Scene graph extraction from images

Bibliographic Details
Main Author: Ng, Felix Zhen Feng
Other Authors: Liu Ziwei
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156443
Institution: Nanyang Technological University
Description
Summary: An image contains a great deal of information that higher-level systems can use for Computer Vision tasks. Most such tasks, including Image Classification and Object Detection, only require an image-level prediction or the localization of objects in the image. These outputs, however, are not sufficient for a comprehensive interpretation of everything the image contains. A generated Scene Graph can be used to convey that information. A Scene Graph is a structured representation of a scene that expresses objects and their attributes as nodes, and the relationships between objects as edges, so that a graph structure can be built. This project aims to understand Scene Graph Generation, to explore several classic methodologies by evaluating and comparing the correctness of the scene graphs they predict, and to identify the key factors that affect that correctness. The project yielded several insights. For example, prior knowledge (which can be interpreted as common sense) can greatly affect the performance of Scene Graph Generation. It was also observed that models with a better backbone generated more accurate Scene Graphs. Beyond the exploration of methodologies, a software application was developed that converts photos captured from a connected webcam into Scene Graphs.
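The structure described in the summary (objects and attributes as nodes, relationships as edges) can be illustrated with a small Python sketch. This is not the project's implementation; the class name `SceneGraph`, its methods, and the example scene are hypothetical and chosen only to make the definition concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneGraph:
    """Illustrative container (hypothetical, not from the project)."""

    # Nodes: object name -> list of attribute strings.
    objects: Dict[str, List[str]] = field(default_factory=dict)
    # Edges: directed (subject, predicate, object) triples.
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, name: str, attributes=()) -> None:
        """Add a node together with its attributes."""
        self.objects[name] = list(attributes)

    def add_relation(self, subject: str, predicate: str, obj: str) -> None:
        """Add an edge; both endpoints must already exist as nodes."""
        if subject not in self.objects or obj not in self.objects:
            raise KeyError("both endpoints must be added as objects first")
        self.relations.append((subject, predicate, obj))

    def _label(self, name: str) -> str:
        """Render a node as 'name (attr1, attr2)' or just 'name'."""
        attrs = self.objects[name]
        return f"{name} ({', '.join(attrs)})" if attrs else name

    def describe(self) -> List[str]:
        """Render each edge as a readable phrase, in insertion order."""
        return [
            f"{self._label(s)} {p} {self._label(o)}"
            for s, p, o in self.relations
        ]


# Made-up example scene, for illustration only.
graph = SceneGraph()
graph.add_object("man", ["tall"])
graph.add_object("horse", ["brown"])
graph.add_object("field")
graph.add_relation("man", "riding", "horse")
graph.add_relation("horse", "standing on", "field")

for phrase in graph.describe():
    print(phrase)
```

Storing edges as (subject, predicate, object) triples keeps the sketch simple and makes it straightforward to compare a predicted graph against a reference one triple by triple.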