AV-FDTI: audio-visual fusion for drone threat identification

In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusi...

Bibliographic Details
Main Authors: Yang, Yizhuo, Yuan, Shenghai, Yang, Jianfei, Nguyen, Thien Hoang, Cao, Muqing, Nguyen, Thien-Minh, Wang, Han, Xie, Lihua
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2024
Subjects: Engineering; Audio-visual fusion; Anti-UAV
Online Access: https://hdl.handle.net/10356/181381
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181381
record_format dspace
spelling sg-ntu-dr.10356-1813812024-11-29T15:41:44Z AV-FDTI: audio-visual fusion for drone threat identification Yang, Yizhuo Yuan, Shenghai Yang, Jianfei Nguyen, Thien Hoang Cao, Muqing Nguyen, Thien-Minh Wang, Han Xie, Lihua School of Electrical and Electronic Engineering Engineering Audio-visual fusion Anti-UAV In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusion of audio and omnidirectional camera feature inputs, providing a comprehensive solution to enhance the precision and resilience of drone classification and 3D localization. Specifically, AV-FDTI employs a CRNN network to capture vital temporal dynamics within the audio domain and utilizes a pretrained ResNet50 model for image feature extraction. Furthermore, we adopt a visual information entropy and cross-attention-based mechanism to enhance the fusion of visual and audio data. Notably, our system is trained based on automated Leica tracking annotations, offering accurate ground truth data with millimeter-level accuracy. Comprehensive comparative evaluations demonstrate the superiority of our solution over the existing systems. In our commitment to advancing this field, we will release this work as open-source code and wearable AV-FDTI design, contributing valuable resources to the research community. Agency for Science, Technology and Research (A*STAR) National Research Foundation (NRF) Published version This research is supported by the National Research Foundation, Singapore, under its Medium-Sized Center for Advanced Robotics Technology Innovation (CARTIN) and under project WP5 within the Delta-NTU Corporate Lab with funding support from A*STAR under its IAF-ICP program (Grant no: I2201E0013) and Delta Electronics Inc. 
2024-11-27T06:39:38Z 2024-11-27T06:39:38Z 2024 Journal Article Yang, Y., Yuan, S., Yang, J., Nguyen, T. H., Cao, M., Nguyen, T., Wang, H. & Xie, L. (2024). AV-FDTI: audio-visual fusion for drone threat identification. Journal of Automation and Intelligence, 3(3), 144-151. https://dx.doi.org/10.1016/j.jai.2024.06.002 2949-8554 https://hdl.handle.net/10356/181381 10.1016/j.jai.2024.06.002 2-s2.0-85199535899 3 3 144 151 en I2201E0013 WP5 Journal of Automation and Intelligence © 2024 The Authors. Published by Elsevier B.V. on behalf of KeAi Communications Co. Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering
Audio-visual fusion
Anti-UAV
spellingShingle Engineering
Audio-visual fusion
Anti-UAV
Yang, Yizhuo
Yuan, Shenghai
Yang, Jianfei
Nguyen, Thien Hoang
Cao, Muqing
Nguyen, Thien-Minh
Wang, Han
Xie, Lihua
AV-FDTI: audio-visual fusion for drone threat identification
description In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which have the potential to transport harmful payloads or cause significant damage, we present AV-FDTI, an innovative Audio-Visual Fusion system designed for Drone Threat Identification. AV-FDTI leverages the fusion of audio and omnidirectional camera feature inputs, providing a comprehensive solution to enhance the precision and resilience of drone classification and 3D localization. Specifically, AV-FDTI employs a CRNN network to capture vital temporal dynamics within the audio domain and utilizes a pretrained ResNet50 model for image feature extraction. Furthermore, we adopt a visual information entropy and cross-attention-based mechanism to enhance the fusion of visual and audio data. Notably, our system is trained based on automated Leica tracking annotations, offering accurate ground truth data with millimeter-level accuracy. Comprehensive comparative evaluations demonstrate the superiority of our solution over the existing systems. In our commitment to advancing this field, we will release this work as open-source code and wearable AV-FDTI design, contributing valuable resources to the research community.
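The abstract mentions a visual information entropy term used to improve audio-visual fusion, but this record does not give its formulation. As a rough illustration only, the sketch below computes the Shannon entropy of a grayscale intensity histogram and uses it as a scalar gate between two feature vectors. The function names (`image_entropy`, `visual_weight`, `fuse`) are hypothetical, and the scalar gate is a deliberately simplified stand-in for the cross-attention mechanism named in the abstract, not the authors' method.

```python
import math
from collections import Counter


def image_entropy(pixels):
    """Shannon entropy in bits of a grayscale image given as a flat list of ints."""
    if not pixels:
        return 0.0
    total = len(pixels)
    counts = Counter(pixels)
    # H = sum over intensity values of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def visual_weight(pixels, levels=256):
    """Entropy scaled to [0, 1] by the maximum possible entropy, log2(levels)."""
    return image_entropy(pixels) / math.log2(levels)


def fuse(audio_feat, visual_feat, w_visual):
    """Convex combination of two equal-length feature vectors.

    Assumption: a scalar gate replaces cross-attention purely for illustration.
    """
    if len(audio_feat) != len(visual_feat):
        raise ValueError("feature vectors must have the same length")
    return [(1.0 - w_visual) * a + w_visual * v
            for a, v in zip(audio_feat, visual_feat)]


if __name__ == "__main__":
    flat_frame = [0] * 16                # uniform frame, carries no visual detail
    busy_frame = [0, 1, 2, 3] * 4        # four equally likely intensities
    for frame in (flat_frame, busy_frame):
        w = visual_weight(frame, levels=4)
        print(w, fuse([1.0, 1.0], [3.0, 3.0], w))
```

In this toy setup a featureless frame (for example, a dark scene) gets a visual weight of zero, so the fused vector falls back to the audio features, while a frame with more varied intensities shifts the weight toward the visual features.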
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Yang, Yizhuo
Yuan, Shenghai
Yang, Jianfei
Nguyen, Thien Hoang
Cao, Muqing
Nguyen, Thien-Minh
Wang, Han
Xie, Lihua
format Article
author Yang, Yizhuo
Yuan, Shenghai
Yang, Jianfei
Nguyen, Thien Hoang
Cao, Muqing
Nguyen, Thien-Minh
Wang, Han
Xie, Lihua
author_sort Yang, Yizhuo
title AV-FDTI: audio-visual fusion for drone threat identification
publishDate 2024
url https://hdl.handle.net/10356/181381
_version_ 1819112970005774336