OW-Mamba: Mamba for open world object detection
Object detection is a fundamental task in computer vision, and recently, a more challenging variant known as open-world object detection has gained attention. This task involves not only identifying novel, unknown objects but also incrementally learning to classify...
Main Author: | Sun, Heyuan |
---|---|
Other Authors: | Yap Kim Hui |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Engineering Open world object detection |
Online Access: | https://hdl.handle.net/10356/181673 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181673 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1816732024-12-13T15:47:37Z OW-Mamba: Mamba for open world object detection Sun, Heyuan Yap Kim Hui School of Electrical and Electronic Engineering EKHYap@ntu.edu.sg Engineering Open world object detection Object detection is a fundamental task in computer vision, and recently, a more challenging variant known as open-world object detection has gained attention. This task involves not only identifying novel, unknown objects but also incrementally learning to classify them as labels become available. Two notable approaches, Open World Detection Transformer (OWDETR) and Localization and Identification Cascade Detection Transformer (CAT), have been proposed to address this challenge. However, these methods are prone to generating false unknown objects and are computationally expensive, especially with high-resolution images. Additionally, there is significant room for improvement in detecting novel objects. To overcome these limitations, we propose OW-Mamba, an enhanced approach based on CAT. Specifically, we replace the ResNet-50 backbone in CAT with VMamba-T and introduce a dual-stream decoder, which improves both localization and classification. Furthermore, we refine the pseudo-labeling process to reduce the generation of false positives. Extensive experiments show that OW-Mamba outperforms CAT in Tasks 1, 3, and 4, while also significantly reducing the time and GPU memory required. Master's degree 2024-12-12T23:01:25Z 2024-12-12T23:01:25Z 2024 Thesis-Master by Coursework Sun, H. (2024). OW-Mamba: Mamba for open world object detection. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181673 https://hdl.handle.net/10356/181673 en application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering Open world object detection |
spellingShingle |
Engineering Open world object detection Sun, Heyuan OW-Mamba: Mamba for open world object detection |
description |
Object detection is a fundamental task in computer vision, and recently, a more challenging variant known as open-world object detection has gained attention. This task involves not only identifying novel, unknown objects but also incrementally learning to classify them as labels become available. Two notable approaches, Open World Detection Transformer (OWDETR) and Localization and Identification Cascade Detection Transformer (CAT), have been proposed to address this challenge. However, these methods are prone to generating false unknown objects and are computationally expensive, especially with high-resolution images. Additionally, there is significant room for improvement in detecting novel objects.
To overcome these limitations, we propose OW-Mamba, an enhanced approach based on CAT. Specifically, we replace the ResNet-50 backbone in CAT with VMamba-T and introduce a dual-stream decoder, which improves both localization and classification. Furthermore, we refine the pseudo-labeling process to reduce the generation of false positives. Extensive experiments show that OW-Mamba outperforms CAT in Tasks 1, 3, and 4, while also significantly reducing the time and GPU memory required. |
author2 |
Yap Kim Hui |
author_facet |
Yap Kim Hui Sun, Heyuan |
format |
Thesis-Master by Coursework |
author |
Sun, Heyuan |
author_sort |
Sun, Heyuan |
title |
OW-Mamba: Mamba for open world object detection |
title_short |
OW-Mamba: Mamba for open world object detection |
title_full |
OW-Mamba: Mamba for open world object detection |
title_fullStr |
OW-Mamba: Mamba for open world object detection |
title_full_unstemmed |
OW-Mamba: Mamba for open world object detection |
title_sort |
ow-mamba: mamba for open world object detection |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181673 |
_version_ |
1819112986778796032 |