Enhanced multi-task learning architecture for detecting pedestrian at far distance
Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians who are positioned far from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantl...
Main Authors: | Zhou, Chengju; Wu, Meiqing; Lam, Siew-Kei |
---|---|
Other Authors: | College of Computing and Data Science |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; Multi-task learning; Pedestrian detection |
Online Access: | https://hdl.handle.net/10356/178581 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-178581 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1785812024-06-28T00:38:36Z Enhanced multi-task learning architecture for detecting pedestrian at far distance Zhou, Chengju Wu, Meiqing Lam, Siew-Kei College of Computing and Data Science School of Computer Science and Engineering Computer and Information Science Multi-task learning Pedestrian detection Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians who are positioned at far distance from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantly faster than state-of-the-art methods. The proposed framework incorporates semantic segmentation to confidence modules for RPN (Region Proposal Network) head and R-FCN (Region-based Fully Convolutional Networks) head, and a cascaded R-FCN head. The semantic segmentation confidence is extracted and utilized as auxiliary classification prior knowledge for RPN proposal selection and R-FCN head prediction. Finally, the cascaded R-FCN head progressively refine the pedestrian prediction accuracy with negligible computation overhead. The proposed framework is also capable of maintaining high detection performance on down-sampled input images, which leads to further reduction in overall computational complexity. Experiment results on CityPersons and MOT17Det datasets show that the proposed framework achieves competitive detection performance with about 3× speedup over state-of-the-art methods. Ministry of Education (MOE) National Research Foundation (NRF) This work was supported in part by the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) Program with the Technical University of Munich at TUMCREATE; and in part by the Ministry of Education, Singapore, under its Academic Research Fund Tier 1, under Grant RG78/21. 2024-06-28T00:38:36Z 2024-06-28T00:38:36Z 2022 Journal Article Zhou, C., Wu, M. 
& Lam, S. (2022). Enhanced multi-task learning architecture for detecting pedestrian at far distance. IEEE Transactions On Intelligent Transportation Systems, 23(9), 15588-15604. https://dx.doi.org/10.1109/TITS.2022.3142445 1524-9050 https://hdl.handle.net/10356/178581 10.1109/TITS.2022.3142445 2-s2.0-85123757148 9 23 15588 15604 en RG78/21 IEEE Transactions on Intelligent Transportation Systems © 2022 IEEE. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science Multi-task learning Pedestrian detection |
spellingShingle |
Computer and Information Science Multi-task learning Pedestrian detection Zhou, Chengju Wu, Meiqing Lam, Siew-Kei Enhanced multi-task learning architecture for detecting pedestrian at far distance |
description |
Existing pedestrian detection methods suffer from performance degradation in the presence of small-scale pedestrians who are positioned far from the camera. We present a pedestrian detection framework that is not only robust to small- and large-scale pedestrians, but is also significantly faster than state-of-the-art methods. The proposed framework incorporates semantic segmentation to confidence modules for the RPN (Region Proposal Network) head and the R-FCN (Region-based Fully Convolutional Networks) head, and a cascaded R-FCN head. The semantic segmentation confidence is extracted and used as auxiliary classification prior knowledge for RPN proposal selection and R-FCN head prediction. Finally, the cascaded R-FCN head progressively refines the pedestrian prediction accuracy with negligible computational overhead. The proposed framework is also capable of maintaining high detection performance on down-sampled input images, which leads to a further reduction in overall computational complexity. Experimental results on the CityPersons and MOT17Det datasets show that the proposed framework achieves competitive detection performance with about 3× speedup over state-of-the-art methods. |
author2 |
College of Computing and Data Science |
author_facet |
College of Computing and Data Science Zhou, Chengju Wu, Meiqing Lam, Siew-Kei |
format |
Article |
author |
Zhou, Chengju Wu, Meiqing Lam, Siew-Kei |
author_sort |
Zhou, Chengju |
title |
Enhanced multi-task learning architecture for detecting pedestrian at far distance |
title_short |
Enhanced multi-task learning architecture for detecting pedestrian at far distance |
title_full |
Enhanced multi-task learning architecture for detecting pedestrian at far distance |
title_fullStr |
Enhanced multi-task learning architecture for detecting pedestrian at far distance |
title_full_unstemmed |
Enhanced multi-task learning architecture for detecting pedestrian at far distance |
title_sort |
enhanced multi-task learning architecture for detecting pedestrian at far distance |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/178581 |
_version_ |
1814047104740360192 |