Grounding referring expression in computer vision
This project studies the integration of language and vision in computer vision, focusing on Grounding Referring Expressions utilising the state-of-the-art GroundingDINO model. We address the topic of object identification and segmentation, emphasising zero-shot models’ ability to recognise items...
Main Author: | Yuen, Shaun Chien Wee |
---|---|
Other Authors: | Hanwang Zhang |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Computer vision; Grounding; Artificial intelligence |
Online Access: | https://hdl.handle.net/10356/174979 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-174979 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-174979; 2024-04-19T15:46:39Z; Grounding referring expression in computer vision; Yuen, Shaun Chien Wee; Hanwang Zhang; School of Computer Science and Engineering; hanwangzhang@ntu.edu.sg; Computer and Information Science; Computer vision; Grounding; Artificial intelligence; Bachelor's degree; 2024-04-19T02:11:03Z; 2024-04-19T02:11:03Z; 2024; Final Year Project (FYP); Yuen, S. C. W. (2024). Grounding referring expression in computer vision. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174979; https://hdl.handle.net/10356/174979; en; SCSE23-0212; application/pdf; Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
topic |
Computer and Information Science; Computer vision; Grounding; Artificial intelligence |
description |
This project studies the integration of language and vision in computer vision, focusing
on Grounding Referring Expressions using the state-of-the-art GroundingDINO
model. We address object identification and segmentation, emphasising
zero-shot models’ ability to recognise items outside their training sets. GroundingDINO,
an improvement on the DINO model, is central to our study because of its
capabilities in open-set object detection and natural language processing. The project
aims to create a Proof of Concept Demo Application demonstrating GroundingDINO’s
practical uses in improving human-computer interaction. Our literature review traces
the evolution of computer vision models, examines the distinguishing characteristics of
GroundingDINO, and identifies gaps in current research, especially in dynamic settings
such as real-time video analysis. The project contributes to the field by highlighting the
potential of GroundingDINO in industries ranging from surveillance to autonomous systems,
and by addressing the need for improved language-based object detection in computer
vision. |
author2 |
Hanwang Zhang |
format |
Final Year Project |
author |
Yuen, Shaun Chien Wee |
title |
Grounding referring expression in computer vision |