Learning disentangled representation implicitly via transformer for occluded person re-identification
Person re-IDentification (re-ID) under various occlusions has been a long-standing challenge, as person images with different types of occlusions often suffer from misalignment in image matching and ranking. Most existing methods tackle this challenge by aligning spatial features of body parts according to external semantic cues or feature similarities, but this alignment approach is complicated and sensitive to noise. We design DRL-Net, a disentangled representation learning network that handles occluded re-ID without requiring strict person image alignment or any additional supervision. Leveraging transformer architectures, DRL-Net achieves alignment-free re-ID via global reasoning over the local features of occluded person images. It measures image similarity by automatically disentangling the representations of undefined semantic components, e.g., human body parts or obstacles, under the guidance of semantic-preference object queries in the transformer. In addition, we design a decorrelation constraint in the transformer decoder and impose it over the object queries for better focus on different semantic components. To further eliminate interference from occlusions, we design a contrast feature learning technique (CFL) for better separation of occlusion features and discriminative ID features. Extensive experiments over occluded and holistic re-ID benchmarks show that DRL-Net achieves superior re-ID performance consistently and outperforms the state-of-the-art by large margins on occluded re-ID datasets.
Main Authors: | Jia, Mengxi; Cheng, Xinhua; Lu, Shijian; Zhang, Jian |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | Engineering::Computer science and engineering; Person Re-Identification; Representation Learning |
Online Access: | https://hdl.handle.net/10356/162960 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-162960 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1629602022-11-14T01:42:43Z Learning disentangled representation implicitly via transformer for occluded person re-identification Jia, Mengxi Cheng, Xinhua Lu, Shijian Zhang, Jian School of Computer Science and Engineering Engineering::Computer science and engineering Person Re-Identification Representation Learning Person re-IDentification (re-ID) under various occlusions has been a long-standing challenge, as person images with different types of occlusions often suffer from misalignment in image matching and ranking. Most existing methods tackle this challenge by aligning spatial features of body parts according to external semantic cues or feature similarities, but this alignment approach is complicated and sensitive to noise. We design DRL-Net, a disentangled representation learning network that handles occluded re-ID without requiring strict person image alignment or any additional supervision. Leveraging transformer architectures, DRL-Net achieves alignment-free re-ID via global reasoning over the local features of occluded person images. It measures image similarity by automatically disentangling the representations of undefined semantic components, e.g., human body parts or obstacles, under the guidance of semantic-preference object queries in the transformer. In addition, we design a decorrelation constraint in the transformer decoder and impose it over the object queries for better focus on different semantic components. To further eliminate interference from occlusions, we design a contrast feature learning technique (CFL) for better separation of occlusion features and discriminative ID features. Extensive experiments over occluded and holistic re-ID benchmarks show that DRL-Net achieves superior re-ID performance consistently and outperforms the state-of-the-art by large margins on occluded re-ID datasets. This work was supported in part by the Shenzhen Fundamental Research Program (No. GXWD20201231165807007-20200807164903001).
2022-11-14T01:42:42Z 2022-11-14T01:42:42Z 2022 Journal Article Jia, M., Cheng, X., Lu, S. & Zhang, J. (2022). Learning disentangled representation implicitly via transformer for occluded person re-identification. IEEE Transactions On Multimedia, 3141267-. https://dx.doi.org/10.1109/TMM.2022.3141267 1520-9210 https://hdl.handle.net/10356/162960 10.1109/TMM.2022.3141267 2-s2.0-85122878986 3141267 en IEEE Transactions on Multimedia © 2021 IEEE. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Person Re-Identification Representation Learning |
spellingShingle |
Engineering::Computer science and engineering Person Re-Identification Representation Learning Jia, Mengxi Cheng, Xinhua Lu, Shijian Zhang, Jian Learning disentangled representation implicitly via transformer for occluded person re-identification |
description |
Person re-IDentification (re-ID) under various occlusions has been a long-standing challenge, as person images with different types of occlusions often suffer from misalignment in image matching and ranking. Most existing methods tackle this challenge by aligning spatial features of body parts according to external semantic cues or feature similarities, but this alignment approach is complicated and sensitive to noise. We design DRL-Net, a disentangled representation learning network that handles occluded re-ID without requiring strict person image alignment or any additional supervision. Leveraging transformer architectures, DRL-Net achieves alignment-free re-ID via global reasoning over the local features of occluded person images. It measures image similarity by automatically disentangling the representations of undefined semantic components, e.g., human body parts or obstacles, under the guidance of semantic-preference object queries in the transformer. In addition, we design a decorrelation constraint in the transformer decoder and impose it over the object queries for better focus on different semantic components. To further eliminate interference from occlusions, we design a contrast feature learning technique (CFL) for better separation of occlusion features and discriminative ID features. Extensive experiments over occluded and holistic re-ID benchmarks show that DRL-Net achieves superior re-ID performance consistently and outperforms the state-of-the-art by large margins on occluded re-ID datasets. |
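The abstract mentions a decorrelation constraint imposed over the transformer's object queries, but this record does not spell out its formulation. As a rough illustration only, one plausible form of such a constraint penalizes the pairwise cosine similarity among object query embeddings, encouraging each query to attend to a different semantic component. The function name and array shapes below are assumptions for illustration, not the paper's actual implementation:

```python
import numpy as np

def decorrelation_penalty(queries):
    """Hypothetical sketch of a decorrelation constraint over object queries.

    queries: (N, D) array of N object query embeddings of dimension D.
    Returns the mean squared off-diagonal cosine similarity: 0 when all
    queries are mutually orthogonal, approaching 1 when they collapse
    onto the same direction.
    """
    # Unit-normalize each query embedding along the feature dimension.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    sim = q @ q.T                    # (N, N) pairwise cosine similarities
    n = sim.shape[0]
    off_diag = sim - np.eye(n)       # zero out the self-similarity diagonal
    # Average over the n*(n-1) off-diagonal pairs.
    return np.sum(off_diag ** 2) / (n * (n - 1))
```

Minimizing such a term alongside the re-ID objective would push the query embeddings apart, which matches the stated goal of making different queries focus on different semantic components (body parts vs. obstacles).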
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Jia, Mengxi Cheng, Xinhua Lu, Shijian Zhang, Jian |
format |
Article |
author |
Jia, Mengxi Cheng, Xinhua Lu, Shijian Zhang, Jian |
author_sort |
Jia, Mengxi |
title |
Learning disentangled representation implicitly via transformer for occluded person re-identification |
title_sort |
learning disentangled representation implicitly via transformer for occluded person re-identification |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/162960 |
_version_ |
1751548546070347776 |