Visual recognition using artificial intelligence (person detection and tracking using artificial intelligence)
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/157722 |
Institution: | Nanyang Technological University |
Summary: | The motive of this report is to produce a robust real-time CCTV monitoring aid to assist
security guards, using only input video footage, an object detection machine learning model,
and a tracking algorithm. The objective is to investigate and determine which
object detection machine learning model and tracking algorithm are best for the task.
The report presents the precision, recall, and F1 score of different versions of the state-of-the-art object detection model family, You Only Look Once (YOLO), and the precision,
track identification swap rates, new track identification rates, and speed of the state-of-the-art multiple object-tracking algorithm family, Simple Online and Realtime Tracking
(SORT).
Different object detection machine learning models were trained and validated on 60,796
images of people at different angles, with different backgrounds and fields of view, and in different scenarios. Once
the best object detector was determined, it was combined with different tracking algorithms
and benchmarked on a labelled testing dataset consisting of 697 images/frames.
The results show that, in terms of detection, YOLOv5-XLarge had the best
detection performance while YOLOv5-Nano had the fastest speed. However, for a real-time
CCTV monitoring system, a balance between detection performance and speed is key;
hence YOLOv5-Small is the most suitable for the task. In terms of tracking, Deep SORT with
osnet_x0_5 has the best average precision while SORT has the best speed. Although SORT is
the fastest, the speed of Deep SORT with osnet_x0_5 is comparable to that of SORT, and it offers much
higher precision and more consistent tracking. Hence Deep SORT with osnet_x0_5
is chosen as the tracking algorithm.
Hence, to produce a CCTV monitoring system to aid security guards, object detection from
the YOLOv5-Small machine learning model would be combined with tracking using the Deep
SORT with osnet_x0_5 tracking algorithm. |
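
The summary compares detectors by precision, recall, and F1 score. As a minimal sketch of how these three metrics are related, the following Python function computes them from counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical and are not taken from the report; how the report matched detections to ground-truth boxes is not stated here.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from detection counts.

    tp: detections that match a labelled person
    fp: detections with no matching labelled person
    fn: labelled persons that were not detected
    """
    # Precision: fraction of detections that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of labelled persons that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which makes it a convenient single number when weighing detection quality against speed.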