Enhanced multi-task learning architecture for detecting pedestrian at far distance

Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians positioned far from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantl...

Bibliographic Details
Main Authors: Zhou, Chengju, Wu, Meiqing, Lam, Siew-Kei
Other Authors: College of Computing and Data Science
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Multi-task learning; Pedestrian detection
Online Access: https://hdl.handle.net/10356/178581
Institution: Nanyang Technological University
id sg-ntu-dr.10356-178581
record_format dspace
spelling sg-ntu-dr.10356-178581 2024-06-28T00:38:36Z Enhanced multi-task learning architecture for detecting pedestrian at far distance Zhou, Chengju Wu, Meiqing Lam, Siew-Kei College of Computing and Data Science School of Computer Science and Engineering Computer and Information Science Multi-task learning Pedestrian detection Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians positioned far from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantly faster than state-of-the-art methods. The proposed framework incorporates semantic segmentation to confidence modules for the RPN (Region Proposal Network) head and the R-FCN (Region-based Fully Convolutional Networks) head, and a cascaded R-FCN head. The semantic segmentation confidence is extracted and used as auxiliary prior knowledge for classification in RPN proposal selection and R-FCN head prediction. Finally, the cascaded R-FCN head progressively refines the pedestrian prediction accuracy with negligible computational overhead. The proposed framework is also capable of maintaining high detection performance on down-sampled input images, which leads to a further reduction in overall computational complexity. Experimental results on the CityPersons and MOT17Det datasets show that the proposed framework achieves competitive detection performance with about 3× speedup over state-of-the-art methods. Ministry of Education (MOE) National Research Foundation (NRF) This work was supported in part by the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) Program with the Technical University of Munich at TUMCREATE; and in part by the Ministry of Education, Singapore, under its Academic Research Fund Tier 1, under Grant RG78/21.
2024-06-28T00:38:36Z 2024-06-28T00:38:36Z 2022 Journal Article Zhou, C., Wu, M. & Lam, S. (2022). Enhanced multi-task learning architecture for detecting pedestrian at far distance. IEEE Transactions on Intelligent Transportation Systems, 23(9), 15588-15604. https://dx.doi.org/10.1109/TITS.2022.3142445 1524-9050 https://hdl.handle.net/10356/178581 10.1109/TITS.2022.3142445 2-s2.0-85123757148 9 23 15588 15604 en RG78/21 IEEE Transactions on Intelligent Transportation Systems © 2022 IEEE. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Multi-task learning
Pedestrian detection
spellingShingle Computer and Information Science
Multi-task learning
Pedestrian detection
Zhou, Chengju
Wu, Meiqing
Lam, Siew-Kei
Enhanced multi-task learning architecture for detecting pedestrian at far distance
description Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians positioned far from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantly faster than state-of-the-art methods. The proposed framework incorporates semantic segmentation to confidence modules for the RPN (Region Proposal Network) head and the R-FCN (Region-based Fully Convolutional Networks) head, and a cascaded R-FCN head. The semantic segmentation confidence is extracted and used as auxiliary prior knowledge for classification in RPN proposal selection and R-FCN head prediction. Finally, the cascaded R-FCN head progressively refines the pedestrian prediction accuracy with negligible computational overhead. The proposed framework is also capable of maintaining high detection performance on down-sampled input images, which leads to a further reduction in overall computational complexity. Experimental results on the CityPersons and MOT17Det datasets show that the proposed framework achieves competitive detection performance with about 3× speedup over state-of-the-art methods.
author2 College of Computing and Data Science
author_facet College of Computing and Data Science
Zhou, Chengju
Wu, Meiqing
Lam, Siew-Kei
format Article
author Zhou, Chengju
Wu, Meiqing
Lam, Siew-Kei
author_sort Zhou, Chengju
title Enhanced multi-task learning architecture for detecting pedestrian at far distance
title_sort enhanced multi-task learning architecture for detecting pedestrian at far distance
publishDate 2024
url https://hdl.handle.net/10356/178581
_version_ 1814047104740360192
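
The description above says that semantic segmentation confidence serves as auxiliary prior knowledge when RPN proposals are selected. The following is a minimal sketch of that general idea only, not the authors' method: this record does not state how the two signals are combined, so the fusion rule (objectness multiplied by the mean segmentation confidence inside the box) and the names `mean_confidence` and `rescore_proposals` are assumptions made for illustration.

```python
from typing import List, Tuple

# (x1, y1, x2, y2) in pixel coordinates; x2 and y2 are exclusive.
Box = Tuple[int, int, int, int]


def mean_confidence(seg_conf: List[List[float]], box: Box) -> float:
    """Average the per-pixel pedestrian confidence inside a box."""
    x1, y1, x2, y2 = box
    values = [seg_conf[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(values) / len(values) if values else 0.0


def rescore_proposals(
    boxes: List[Box],
    objectness: List[float],
    seg_conf: List[List[float]],
    top_k: int,
) -> List[Tuple[Box, float]]:
    """Fuse objectness with segmentation confidence and keep the top_k boxes.

    ASSUMPTION: the plain product used here is illustrative; the record
    does not specify the fusion rule used in the paper.
    """
    fused = [
        (box, score * mean_confidence(seg_conf, box))
        for box, score in zip(boxes, objectness)
    ]
    # Highest fused score first.
    fused.sort(key=lambda item: item[1], reverse=True)
    return fused[:top_k]


# Toy 4x4 confidence map: high pedestrian confidence in the left half.
seg_conf = [
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.1, 0.1],
]
boxes = [(0, 0, 2, 4), (2, 0, 4, 4)]
objectness = [0.6, 0.7]  # the right-hand box has the higher raw score

selected = rescore_proposals(boxes, objectness, seg_conf, top_k=1)
print(selected)
```

In this toy input the right-hand box has the higher raw objectness, yet the left-hand box is kept after fusion because the segmentation map supports it. That is the kind of effect an auxiliary prior is meant to have on proposals for small, distant pedestrians.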