A benchmark of CNN backbones on DINO-DETR performance in object detection

Bibliographic Details
Main Author:	Liew, Zon Hur Zhen
Other Authors:	Lu Shijian
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2023
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	https://hdl.handle.net/10356/172020
Institution:	Nanyang Technological University

Description
Summary:	Recent developments in DETR-based models have significantly improved training convergence but not small object detection. This paper combines the ConvNeXt and FocalNet backbones with DINO-DETR using timm and detrex, and presents a benchmark and analysis of the resulting models' performance on MS-COCO and SODA-D. The results affirm many conclusions from the ConvNeXt and FocalNet papers while exhibiting inconsistencies for FocalNets on SODA-D. Finally, the results show encouraging performance for DINO-DETR with recent backbones on general object detection, and a need for further improvement in small object detection with DINO-DETR across all backbones. Further efforts should be made to integrate state-of-the-art features from concurrent developments to produce new benchmarks on small object detection datasets with accessible existing technology.