Deep learning for multiple object tracking

Bibliographic Details
Main Author: Gao, Junjie
Other Authors: Lin Zhiping
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/161058
Institution: Nanyang Technological University
Description
Summary: The past several years have seen rapid development in multiple object tracking, object detection, and re-identification. Most work focuses on pedestrian body tracking with a one-shot anchor-free structure, and little work has been conducted on face tracking. The main reason is that pedestrian tracking is usually conducted on surveillance video with limited resolution, where human faces are often not clear enough to be distinguished from one another. As a result, body tracking has become an important alternative when face recognition fails. This dissertation focuses on face tracking when the video resolution is high enough. Human face detectors usually adopt an anchor-based design to obtain better performance. However, a one-shot anchor-based detector performs badly on the re-identification task because of serious network fuzziness. The proposed face detection network applies an anchor-free structure to the face detector, and its performance is only slightly worse than that of state-of-the-art anchor-based face detectors. The traditional method for training the re-identification branch appends a fully connected layer after the extracted ID feature to perform ID classification. This work combines that traditional strategy with the idea of metric learning to improve the robustness of the learned identity information.
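The summary's last two sentences describe training the re-identification branch with an ID classification head combined with metric learning. The following is a minimal sketch of one way such a combined loss could look, not the thesis's actual implementation. The choice of triplet loss as the metric-learning term, the `margin` and `metric_weight` parameters, and all function names are assumptions made for illustration; the record does not specify them.

```python
import math

def linear(features, weights, biases):
    """Fully connected layer: produces one logit per identity class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Assumed metric-learning term: same-ID pairs should be closer
    than different-ID pairs by at least `margin`."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(0.0, gap)

def combined_id_loss(anchor, positive, negative, label,
                     weights, biases, metric_weight=1.0, margin=0.3):
    """Classification loss on the anchor's ID feature plus a weighted
    metric-learning loss on the (anchor, positive, negative) triplet.
    `metric_weight` is a hypothetical balancing factor."""
    cls = cross_entropy(linear(anchor, weights, biases), label)
    met = triplet_loss(anchor, positive, negative, margin)
    return cls + metric_weight * met

# Toy usage: 2-D ID features, two identities (values are made up).
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.0, 0.0]
anchor, positive, negative = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
loss = combined_id_loss(anchor, positive, negative, 0, weights, biases)
```

In this sketch the classification term encourages ID features to be linearly separable by identity, while the metric term constrains distances between features directly, which is one common motivation for combining the two.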