Object counting using machine learning

Bibliographic Details
Main Author: Tang, Haihan
Other Authors: Lin Zhiping
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/162531
Institution: Nanyang Technological University
Description
Summary: In this thesis, to improve the accuracy of multi-modal crowd count estimation, a three-stream adaptive fusion network (TAFNet) and a scale-aware self-differential attention network (SDANet) are proposed. TAFNet adaptively extracts and fuses optical information with thermal information, increasing the effectiveness of multi-modal information fusion. SDANet uses multi-scale features to estimate the density map and predict the crowd count, which addresses the scale variation problem of crowds. Several novel modules are proposed to highlight scale information and avoid information redundancy. Experiments on the RGBT-CC benchmark show the effectiveness of the proposed methods for RGB-T crowd counting compared with state-of-the-art methods. Experiments on the ShanghaiTechRGBD benchmark demonstrate that the proposed networks are also capable of RGB-D crowd counting. In addition, the estimated density maps are of high quality and close to the ground-truth density maps.