Person re-identification based on large vision-language model

Person re-identification (ReID) is a computer vision task that seeks to accurately recognize and match individuals across disjoint camera views, despite variations in viewpoint and illumination. With the development of large vision-language models and the substantial demand...

Bibliographic Details
Main Author:	Ding, Songyu
Other Authors:	Alex Chichung Kot
Format:	Thesis-Master by Coursework
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering; Person re-identification; Large vision-language model; Pedestrian retrieval
Online Access:	https://hdl.handle.net/10356/176052

id	sg-ntu-dr.10356-176052
record_format	dspace
spelling	sg-ntu-dr.10356-176052; 2024-05-17T15:48:56Z; School of Electrical and Electronic Engineering; EACKOT@ntu.edu.sg; Master's degree; 2024-05-13T08:16:17Z; 2024; Thesis-Master by Coursework; Ding, S. (2024). Person re-identification based on large vision-language model. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176052; en; application/pdf; Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering; Person re-identification; Large vision-language model; Pedestrian retrieval
description	Person re-identification (ReID) is a computer vision task that seeks to accurately recognize and match individuals across disjoint camera views, despite variations in viewpoint and illumination. With the development of large vision-language models and the substantial demand in the surveillance sector, research on ReID with text descriptions has attracted significantly more interest. Because language plays different roles in the ReID task, we categorize ReID based on large vision-language models into Language Assist Image Person ReID (LAIPR) and Language Based Image Person ReID (LBIPR). The LAIPR task primarily leverages the inherent content generation capability of large models to provide additional semantic information about images, aiding more accurate matching and identification of individuals across different datasets. We first review basic ReID systems and then analyze in depth the specific implementation and effectiveness of the LAIPR task. Language plays a crucial role in providing descriptive clues that help in understanding and matching individual identities, especially when visual cues alone may not be sufficient. As a result, research attention has recently shifted towards the LBIPR task, which poses more formidable challenges. We synthesize prominent methodologies in the LBIPR domain and compare their performance. Finally, we discuss underexplored areas that warrant further investigation.
author2	Alex Chichung Kot
format	Thesis-Master by Coursework
author	Ding, Songyu
title	Person re-identification based on large vision-language model
publisher	Nanyang Technological University
publishDate	2024
url	https://hdl.handle.net/10356/176052
