Comparative analysis of YOLO and transformers for pedestrian detection

Bibliographic Details
Main Author: Wong, Ying Xuan
Other Authors: Vidya Sudarshan
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175269
Institution: Nanyang Technological University
Description
Summary: This report studies and compares the performance of two state-of-the-art real-time object detectors, YOLOv8 (You Only Look Once, 8th version) and RT-DETR (Real-Time Detection Transformer), on pedestrian detection. Both models were trained and evaluated on different pedestrian datasets, including TJU-DHD-Traffic, Caltech Pedestrian, KITTI, INRIA Person and Cityscapes. The performance of integrated models combining YOLOv8 and RT-DETR was also investigated. The analyses concluded that YOLOv8 achieved faster inference than RT-DETR under limited GPU resources. The integrated models achieved speed comparable to YOLOv8, with accuracy comparable to or surpassing that of the RT-DETR models, highlighting the feasibility of integrating both detectors. Future work could vary the integrated models to attain optimal results, and could tune and experiment with larger batch sizes to support a more comprehensive comparison.