Transformer-based domain generalization of person re-identification

Person re-identification (Re-ID) uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. The purpose of domain generalizable (DG) person Re-ID is to train a robust person Re-ID model with strong generalizability that can...

Bibliographic Details
Main Author: Li, Yiming
Other Authors: Yap Kim Hui
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Electrical and electronic engineering
Online Access: https://hdl.handle.net/10356/158940
Institution: Nanyang Technological University
id sg-ntu-dr.10356-158940
record_format dspace
spelling sg-ntu-dr.10356-158940 2023-07-04T15:36:48Z Transformer-based domain generalization of person re-identification Li, Yiming Yap Kim Hui School of Electrical and Electronic Engineering Agency for Science, Technology and Research (A*STAR) Schaeffler Hub for Advanced REsearch (SHARE) Lab EKHYap@ntu.edu.sg Engineering::Electrical and electronic engineering Person re-identification (Re-ID) uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. Domain generalizable (DG) person Re-ID aims to train a robust model that generalizes well and achieves relatively high accuracy on unseen datasets. Although some CNN-based models achieve high accuracy in cross-domain evaluations, there is still considerable room for improvement. The original Transformer [11] has been widely used in natural language processing since 2017; it uses a self-attention mechanism to update embeddings. In computer vision, several Transformer-based methods have been proposed to extract long-range correlations. However, the matching ability of Transformer-based DG Re-ID has not yet been studied. This dissertation proposes a pipeline that combines a CNN-based backbone feature extractor with a Transformer-based encoder-decoder module to address domain generalization in person re-identification. Pre-processing and post-processing techniques such as reranking, BNNeck and temporal lift are used to achieve higher accuracy. Ablation studies on the parameters of these modules are conducted, and the results and future prospects are discussed. Master of Science (Communications Engineering) 2022-06-02T12:06:18Z 2022-06-02T12:06:18Z 2022 Thesis-Master by Coursework Li, Y. (2022). Transformer-based domain generalization of person re-identification. Master's thesis, Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/158940 https://hdl.handle.net/10356/158940 en ICP1900093 application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
spellingShingle Engineering::Electrical and electronic engineering
Li, Yiming
Transformer-based domain generalization of person re-identification
description Person re-identification (Re-ID) uses computer vision to determine whether a specific pedestrian appears in an image or video sequence. Domain generalizable (DG) person Re-ID aims to train a robust model that generalizes well and achieves relatively high accuracy on unseen datasets. Although some CNN-based models achieve high accuracy in cross-domain evaluations, there is still considerable room for improvement. The original Transformer [11] has been widely used in natural language processing since 2017; it uses a self-attention mechanism to update embeddings. In computer vision, several Transformer-based methods have been proposed to extract long-range correlations. However, the matching ability of Transformer-based DG Re-ID has not yet been studied. This dissertation proposes a pipeline that combines a CNN-based backbone feature extractor with a Transformer-based encoder-decoder module to address domain generalization in person re-identification. Pre-processing and post-processing techniques such as reranking, BNNeck and temporal lift are used to achieve higher accuracy. Ablation studies on the parameters of these modules are conducted, and the results and future prospects are discussed.
author2 Yap Kim Hui
author_facet Yap Kim Hui
Li, Yiming
format Thesis-Master by Coursework
author Li, Yiming
author_sort Li, Yiming
title Transformer-based domain generalization of person re-identification
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/158940
_version_ 1772828039774732288