A unified multi-task learning architecture for fast and accurate pedestrian detection

We present a unified multi-task learning architecture for fast and accurate pedestrian detection. Unlike existing methods, which often focus on either a new loss function or a new architecture, we propose an improved multi-task convolutional neural network learning architecture to effectively and...

Bibliographic Details
Main Authors:	Zhou, Chengju; Wu, Meiqing; Lam, Siew-Kei
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2021
Subjects:	Engineering::Computer science and engineering; Multi-Task Learning; Pedestrian Detection; Semantic Segmentation; Feature Aggregation
Online Access:	https://hdl.handle.net/10356/147488
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-147488
record_format	dspace
spelling	sg-ntu-dr.10356-147488; 2024-08-05T02:04:46Z; 2021-12-08T12:43:14Z; Journal Article; Accepted version; application/pdf; en. Citation: Zhou, C., Wu, M. & Lam, S. (2020). A unified multi-task learning architecture for fast and accurate pedestrian detection. IEEE Transactions on Intelligent Transportation Systems, 23(2), 982-996. https://dx.doi.org/10.1109/TITS.2020.3019390. ISSN: 1524-9050. Handle: https://hdl.handle.net/10356/147488. DOI: 10.1109/TITS.2020.3019390. Funding: National Research Foundation (NRF). This work was supported in part by the National Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) Program with the Technical University of Munich at TUMCREATE. Rights: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TITS.2020.3019390.
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering; Multi-Task Learning; Pedestrian Detection; Semantic Segmentation; Feature Aggregation
description	We present a unified multi-task learning architecture for fast and accurate pedestrian detection. Unlike existing methods, which often focus on either a new loss function or a new architecture, we propose an improved multi-task convolutional neural network learning architecture that effectively and efficiently fuses the tasks of pedestrian detection and semantic segmentation. To achieve this, we integrate a lightweight semantic segmentation branch into the Faster R-CNN detection framework, enabling end-to-end hard parameter sharing that boosts detection performance while maintaining computational efficiency, as follows. First, a Semantic Segmentation to Feature Module (SS2FM) refines the convolutional features in the RPN stage by integrating the features generated by the semantic segmentation branch. Second, a Semantic Segmentation to Confidence Module (SS2CM) refines the classification confidence in the RPN stage by fusing it with the semantic segmentation confidence. We also introduce an effective anchor matching point transform to alleviate feature misalignment for heavily occluded pedestrians. The proposed architecture lends itself well to more robust pedestrian detection in diverse scenarios with negligible computational overhead. In addition, it achieves high detection performance with low-resolution input images, which significantly reduces computational complexity. Experimental results on the CityPersons and Caltech datasets show that our method is the fastest among all state-of-the-art pedestrian detection methods while exhibiting competitive detection performance.
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering; Zhou, Chengju; Wu, Meiqing; Lam, Siew-Kei
format	Article
author	Zhou, Chengju; Wu, Meiqing; Lam, Siew-Kei
author_sort	Zhou, Chengju
title	A unified multi-task learning architecture for fast and accurate pedestrian detection
title_sort	unified multi-task learning architecture for fast and accurate pedestrian detection
publishDate	2021
url	https://hdl.handle.net/10356/147488
_version_	1814047210345594880
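
The description above says that SS2CM refines the RPN classification confidence by fusing it with the semantic segmentation confidence, but this record does not state the fusion rule. The following is a minimal sketch of one possible reading, assuming a per-cell weighted average of two same-sized confidence maps. The function name `fuse_confidence`, the `weight` parameter, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # a 2-D confidence map with values in [0, 1]


def fuse_confidence(rpn_conf: Grid, seg_conf: Grid, weight: float = 0.5) -> Grid:
    """Blend two same-sized confidence maps cell by cell.

    ASSUMPTION: a convex combination is used purely for illustration;
    the catalog record does not specify how SS2CM fuses the two scores.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    same_rows = len(rpn_conf) == len(seg_conf)
    if not same_rows or any(len(r) != len(s) for r, s in zip(rpn_conf, seg_conf)):
        raise ValueError("confidence maps must have the same shape")
    # Each output cell is (1 - weight) * RPN score + weight * segmentation score.
    return [
        [(1.0 - weight) * r + weight * s for r, s in zip(rpn_row, seg_row)]
        for rpn_row, seg_row in zip(rpn_conf, seg_conf)
    ]


# Toy 2x2 maps (made-up numbers): RPN pedestrian scores and
# segmentation pedestrian probabilities at the same locations.
rpn = [[0.9, 0.2], [0.4, 0.6]]
seg = [[0.7, 0.0], [0.8, 0.6]]
fused = fuse_confidence(rpn, seg, weight=0.5)
```

In this sketch, a location that the RPN scores highly but the segmentation branch scores low is pulled down, and vice versa. A multiplicative or learned fusion would be an equally plausible reading of the description.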

Similar Items