Instance LSeg - exploring instance level information from visual language model


Bibliographic Details
Main Author: Lin, Zixing
Other Authors: Lin Guosheng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/171917
Institution: Nanyang Technological University
Description
Summary: This final year project explores the potential of large-scale pretrained visual language models for instance-level zero-shot computer vision tasks. Specifically, we propose Instance LSeg, a novel approach that extends the zero-shot semantic segmentation method LSeg to perform language-guided instance segmentation and grounding of natural language expressions in images. We evaluate our method on three popular referring datasets and observe that it achieves highly competitive results against published generalized visual grounding baselines.