CARF-net : CNN attention and RNN fusion network for video-based person reidentification

Video-based person reidentification is a challenging and important task in surveillance-based applications. Toward this, several shallow and deep networks have been proposed. However, the performance of existing shallow networks does not generalize well on large datasets. To improve the generalization ability, we propose a shallow end-to-end network which incorporates two-stream convolutional neural networks, discriminative visual attention, and a recurrent neural network with triplet and softmax loss to learn the spatiotemporal fusion features. To effectively use both spatial and temporal information, we apply spatial, temporal, and spatiotemporal pooling. In addition, we contribute a large dataset of airborne videos for person reidentification, named DJI01. It includes various challenging conditions, such as occlusion, illumination changes, people with similar clothes, and the same people on different days. We perform elaborate qualitative and quantitative analyses to demonstrate the robust performance of the proposed model.

Saved in:
Bibliographic Details
Main Authors: Prasad, Dilip Kumar, Kansal, Kajal, Venkata, Subramanyam, Kankanhalli, Mohan
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2019
Subjects:
Online Access:https://hdl.handle.net/10356/105466
http://hdl.handle.net/10220/48712
http://dx.doi.org/10.1117/1.JEI.28.2.023036
Institution: Nanyang Technological University
id sg-ntu-dr.10356-105466
record_format dspace
spelling sg-ntu-dr.10356-1054662019-12-06T21:51:54Z CARF-net : CNN attention and RNN fusion network for video-based person reidentification Prasad, Dilip Kumar Kansal, Kajal Venkata, Subramanyam Kankanhalli, Mohan School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Attentions Convolutional Neural Network-Recurrent Neural Network Video-based person reidentification is a challenging and important task in surveillance-based applications. Toward this, several shallow and deep networks have been proposed. However, the performance of existing shallow networks does not generalize well on large datasets. To improve the generalization ability, we propose a shallow end-to-end network which incorporates two-stream convolutional neural networks, discriminative visual attention and recurrent neural network with triplet and softmax loss to learn the spatiotemporal fusion features. To effectively use both spatial and temporal information, we apply spatial, temporal, and spatiotemporal pooling. In addition, we contribute a large dataset of airborne videos for person reidentification, named DJI01. It includes various challenging conditions, such as occlusion, illumination changes, people with similar clothes, and the same people on different days. We perform elaborate qualitative and quantitative analyses to demonstrate the robust performance of the proposed model. Published version 2019-06-13T04:06:41Z 2019-12-06T21:51:54Z 2019-06-13T04:06:41Z 2019-12-06T21:51:54Z 2019 Journal Article Kansal, K., Venkata, S., Prasad, D. K., & Kankanhalli, M. (2019). CARF-net : CNN attention and RNN fusion network for video-based person reidentification. Journal of Electronic Imaging, 28(2), 023036-. doi:10.1117/1.JEI.28.2.023036 1017-9909 https://hdl.handle.net/10356/105466 http://hdl.handle.net/10220/48712 http://dx.doi.org/10.1117/1.JEI.28.2.023036 en Journal of Electronic Imaging © 2019 SPIE and IS&T. All rights reserved. 
This paper was published in Journal of Electronic Imaging and is made available with permission of SPIE and IS&T. 13 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
Attentions
Convolutional Neural Network-Recurrent Neural Network
spellingShingle DRNTU::Engineering::Computer science and engineering
Attentions
Convolutional Neural Network-Recurrent Neural Network
Prasad, Dilip Kumar
Kansal, Kajal
Venkata, Subramanyam
Kankanhalli, Mohan
CARF-net : CNN attention and RNN fusion network for video-based person reidentification
description Video-based person reidentification is a challenging and important task in surveillance-based applications. Toward this, several shallow and deep networks have been proposed. However, the performance of existing shallow networks does not generalize well on large datasets. To improve the generalization ability, we propose a shallow end-to-end network which incorporates two-stream convolutional neural networks, discriminative visual attention and recurrent neural network with triplet and softmax loss to learn the spatiotemporal fusion features. To effectively use both spatial and temporal information, we apply spatial, temporal, and spatiotemporal pooling. In addition, we contribute a large dataset of airborne videos for person reidentification, named DJI01. It includes various challenging conditions, such as occlusion, illumination changes, people with similar clothes, and the same people on different days. We perform elaborate qualitative and quantitative analyses to demonstrate the robust performance of the proposed model.
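The abstract's three pooling schemes over per-frame CNN features (spatial, temporal, and spatiotemporal) can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the choice of max vs. mean operators, the `fuse_pooled_features` name, and the concatenation fusion below are assumptions made for illustration only.

```python
import numpy as np

def fuse_pooled_features(frame_features):
    """Fuse spatial, temporal, and spatiotemporal pooled descriptors.

    frame_features: array of shape (T, C, H, W) -- per-frame CNN feature maps
    for a tracklet of T frames with C channels over an H x W spatial grid.
    Returns a (3*C,) vector (hypothetical concatenation fusion).
    """
    # Spatial pooling: max over each frame's spatial grid, then average over time.
    spatial = frame_features.max(axis=(2, 3)).mean(axis=0)        # shape (C,)
    # Temporal pooling: average over frames first, then max over the spatial grid.
    temporal = frame_features.mean(axis=0).max(axis=(1, 2))       # shape (C,)
    # Spatiotemporal pooling: global average over both time and space.
    spatiotemporal = frame_features.mean(axis=(0, 2, 3))          # shape (C,)
    return np.concatenate([spatial, temporal, spatiotemporal])    # shape (3*C,)
```

In a full pipeline, such fused descriptors would feed the loss heads (triplet and softmax in the paper); here the sketch only shows how the three pooling views expose complementary statistics of the same feature volume.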
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Prasad, Dilip Kumar
Kansal, Kajal
Venkata, Subramanyam
Kankanhalli, Mohan
format Article
author Prasad, Dilip Kumar
Kansal, Kajal
Venkata, Subramanyam
Kankanhalli, Mohan
author_sort Prasad, Dilip Kumar
title CARF-net : CNN attention and RNN fusion network for video-based person reidentification
title_short CARF-net : CNN attention and RNN fusion network for video-based person reidentification
title_full CARF-net : CNN attention and RNN fusion network for video-based person reidentification
title_fullStr CARF-net : CNN attention and RNN fusion network for video-based person reidentification
title_full_unstemmed CARF-net : CNN attention and RNN fusion network for video-based person reidentification
title_sort carf-net : cnn attention and rnn fusion network for video-based person reidentification
publishDate 2019
url https://hdl.handle.net/10356/105466
http://hdl.handle.net/10220/48712
http://dx.doi.org/10.1117/1.JEI.28.2.023036
_version_ 1681036797994336256