Person re-identification based on large vision-language model
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/176052 |
Institution: | Nanyang Technological University |
Summary: | Person re-identification (ReID) is a computer vision task that seeks to recognize and match individuals across disjoint camera views despite variations in viewpoint and illumination. With the development of large vision-language models and substantial demand from the surveillance sector, research on ReID with text descriptions has attracted significantly more interest. Because language plays different roles in the ReID task, we categorize ReID based on large vision-language models into Language Assist Image Person ReID (LAIPR) and Language Based Image Person ReID (LBIPR). The LAIPR task primarily leverages the inherent content generation capability of large models to provide additional semantic information about images, aiding more accurate matching and identification of individuals across different datasets. We first review basic ReID systems and then conduct an in-depth analysis of the specific implementation and effectiveness of the LAIPR task. Language also provides descriptive clues that help in understanding and matching individual identities, especially where visual cues alone may not be sufficient. As a result, research attention has recently shifted towards the LBIPR task, which poses more formidable challenges. We synthesize prominent methodologies in the LBIPR domain and compare their performance. Finally, we discuss underexplored areas that warrant further investigation. |