Object counting using machine learning

Bibliographic Details
Main Author: Tang, Haihan
Other Authors: Lin Zhiping
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/162531
Institution: Nanyang Technological University
Description
Summary: This thesis proposes two networks to improve the accuracy of multi-modal crowd count estimation: a three-stream adaptive fusion network (TAFNet) and a scale-aware self-differential attention network (SDANet). TAFNet adaptively extracts optical and thermal information and fuses the two, making multi-modal information fusion more effective. SDANet uses multi-scale features to estimate the density map and predict the crowd count, which addresses the scale variation problem of crowds. Several novel modules are proposed to highlight scale information and avoid information redundancy. Experiments on the RGBT-CC benchmark show that the proposed methods are effective for RGB-T crowd counting compared with state-of-the-art methods. Experiments on the ShanghaitechRGBD benchmark demonstrate that the proposed networks are also capable of RGB-D crowd counting. In addition, the estimated density maps are of high quality and close to the ground-truth density maps.
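The summary refers to estimating a density map and predicting the crowd count from it. The relationship between the two can be illustrated with a minimal sketch in plain Python. It assumes the common crowd-counting convention of placing one unit-mass Gaussian at each annotated head position, which this record does not confirm as the thesis's own procedure. The function names, kernel size, and sigma are hypothetical, and the sketch involves neither TAFNet nor SDANet.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(points, height, width, size=5, sigma=1.0):
    """Build a density map from head annotations given as (row, col) pairs.

    Assumption: every point lies inside the image, and each person
    contributes a total mass of exactly 1 to the map.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in points:
        # Keep only kernel cells that fall inside the image, then
        # renormalize so a person near the border still sums to 1.
        cells = []
        for dy in range(size):
            for dx in range(size):
                rr, cc = r + dy - half, c + dx - half
                if 0 <= rr < height and 0 <= cc < width:
                    cells.append((rr, cc, kernel[dy][dx]))
        mass = sum(w for _, _, w in cells)
        for rr, cc, w in cells:
            dmap[rr][cc] += w / mass
    return dmap


def count_from_density(dmap):
    """The predicted crowd count is the sum (integral) of the density map."""
    return sum(sum(row) for row in dmap)


# Hypothetical annotations: three heads in a 12 x 16 image.
heads = [(2, 3), (0, 0), (10, 15)]
dmap = density_map(heads, height=12, width=16)
print(count_from_density(dmap))
```

Under these assumptions, summing the map recovers the number of annotated people. A counting network is typically trained to regress such a map, and its predicted count is obtained by the same summation.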