Sensor fusion for autonomous mobile robot

Bibliographic Details
Main Author: Yu, Zhuochen
Other Authors: Andy Khong W H
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Online Access:https://hdl.handle.net/10356/181815
Institution: Nanyang Technological University
Description
Summary: The sparsity of point clouds and the lack of sufficient semantic information present significant challenges for existing LiDAR-only 3D detection methods, particularly in robotic applications that demand high accuracy and efficiency. To address point cloud sparsity, recent approaches have explored the conversion of RGB images into virtual points through depth completion, enabling fusion with LiDAR data. While these methods improve point cloud density, they often introduce substantial computational overhead due to the high density of generated virtual points, and they do not fully exploit the rich semantic information from images. In this work, VKIFNet is introduced as an efficient multi-modal feature fusion framework designed to enhance 3D perception for robotic systems by integrating virtual key instances with LiDAR points across multiple stages. VKIFNet incorporates three core modules. First, SKIS (Semantic Key Instance Selection) is presented, which filters and preserves only essential virtual key instances while leveraging semantic information from virtual points. This approach significantly reduces computational demands and allows critical image-derived features to be retained in 3D space, which is crucial for efficient robotic operation. The second module is a new fusion technique called VIFF (Virtual Instance Focused Fusion), which performs multi-level fusion of virtual key instances and LiDAR data in both BEV (Bird's-Eye View) and 3D space. This fusion method enhances spatial awareness and ensures that both LiDAR and image-derived features contribute to a more robust understanding of the environment. Lastly, VIRA (Virtual-Instance-to-Real Attention) is introduced as a lightweight attention mechanism that uses the features of relevant LiDAR points to refine the virtual key instances with minimal computational overhead, optimizing the model for real-time robotic applications. VKIFNet demonstrates substantial improvements in detection performance on the KITTI and JRDB datasets, showcasing its potential for high-precision 3D perception in robotics.
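
Since this record does not include the thesis implementation, the short Python sketch below only illustrates how two of the ideas described in the summary might look in code: a SKIS-style step that keeps the top-k virtual points by semantic confidence, and a VIRA-style cross-attention step in which virtual key instances query real LiDAR point features. All names, shapes, and hyper-parameters here (skis_select, VIRABlock, k=2048, dim=64) are illustrative assumptions, not the author's code.

# Hypothetical sketch of two VKIFNet ideas from the abstract.
# Every interface and constant below is an assumption for illustration.
import torch
import torch.nn as nn


def skis_select(virtual_points, semantic_scores, k=2048):
    """SKIS-style selection (assumed): keep only the k virtual points
    with the highest semantic confidence, discarding the dense rest.

    virtual_points:  (N, 3+C) image-derived points with appended features
    semantic_scores: (N,)     per-point semantic confidence from the image
    """
    k = min(k, virtual_points.shape[0])
    _, idx = torch.topk(semantic_scores, k)   # indices of key instances
    return virtual_points[idx]                # (k, 3+C) sparse key instances


class VIRABlock(nn.Module):
    """VIRA-style refinement (assumed): lightweight cross-attention in which
    each virtual key instance queries real LiDAR point features."""

    def __init__(self, dim=64, heads=1):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, virtual_feats, lidar_feats):
        # virtual_feats: (B, Kv, dim) queries from virtual key instances
        # lidar_feats:   (B, Kl, dim) keys/values from real LiDAR points
        refined, _ = self.attn(virtual_feats, lidar_feats, lidar_feats)
        return self.norm(virtual_feats + refined)  # residual refinement


if __name__ == "__main__":
    # Toy shapes only; a real pipeline would also voxelize and fuse the
    # two modalities in BEV and 3D space (the VIFF stage).
    vpts = torch.randn(10000, 3 + 64)          # dense virtual points
    scores = torch.rand(10000)                 # semantic confidences
    key = skis_select(vpts, scores, k=2048)    # sparse key instances

    vira = VIRABlock(dim=64)
    virtual_feats = key[:, 3:].unsqueeze(0)    # (1, 2048, 64)
    lidar_feats = torch.randn(1, 4096, 64)     # encoded LiDAR features
    out = vira(virtual_feats, lidar_feats)
    print(out.shape)                           # torch.Size([1, 2048, 64])

The VIFF-style multi-level BEV/3D fusion is omitted from the sketch because the summary does not specify its internal structure.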