Comparison between AI local execution and cloud offloading for AIoT


Saved in:
Bibliographic Details
Main Author: Quek, Wei Quan
Other Authors: Tan Rui
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects:
Online Access: https://hdl.handle.net/10356/181127
Institution: Nanyang Technological University
Description
Abstract: The rapid evolution of Artificial Intelligence (AI) and the Internet of Things (IoT) has led to the development of AIoT (Artificial Intelligence of Things), where AI empowers IoT devices with intelligent processing and autonomous decision-making capabilities. While AIoT systems drive innovation across industries like healthcare, smart homes, and autonomous vehicles, they face significant challenges related to computational resources, energy efficiency, and latency. To address these limitations, AI offloading to the cloud has been a common solution, although it introduces concerns like network latency, data privacy risks, and energy consumption. This research investigates whether running a reduced AI model locally on resource-constrained IoT devices can achieve performance levels comparable to cloud-based AI offloading. Specifically, the study aims to determine the minimum memory threshold required for local execution to match the speed and accuracy of cloud inference. Machine learning models such as MobileNetV2, EfficientNetV2B2, and DenseNet121 were employed for image classification across different datasets, using various quantization techniques, including dynamic range, full integer, and INT16 quantization. The results show that local inference with smaller models, like MobileNet, generally outperforms cloud inference in terms of latency with minor differences in accuracy. For larger models, like DenseNet, compressed versions for local and cloud inference achieved similar speeds and accuracy, though the uncompressed base model performed better in the cloud. The study also highlights the varying effects of compression techniques, with Dynamic Range Quantization offering the smallest model size but impacting latency significantly on MobileNet. In conclusion, as model size and complexity grow, cloud inference may become more advantageous, and thorough validation is essential before deploying compressed models in real-world applications.
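The abstract names dynamic range quantization as the technique that yields the smallest model size. As a rough illustration of the idea behind it (not the thesis's actual TFLite pipeline), the sketch below quantizes float32 weights to int8 with a symmetric per-tensor scale, which is the essence of how dynamic-range schemes shrink weight storage 4x; all function names here are illustrative.

```python
import numpy as np

def quantize_dynamic_range(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization sketch.

    The scale maps the largest absolute weight to 127, so every
    weight is stored in one byte instead of four.
    """
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)
q, s = quantize_dynamic_range(w)
w_hat = dequantize(q, s)
# Storage drops from 4 bytes to 1 byte per weight; the reconstruction
# error per weight is at most half a quantization step (s / 2).
```

The accuracy/latency trade-offs the study reports arise because this rounding error accumulates through the network, and because int8 kernels may or may not be faster than float32 ones on a given device.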