Comparison between AI local execution and cloud offloading for AIoT

Bibliographic Details
Main Author: Quek, Wei Quan
Other Authors: Tan Rui
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/181127
Institution: Nanyang Technological University
Description
Summary: The rapid evolution of Artificial Intelligence (AI) and the Internet of Things (IoT) has led to the development of AIoT (Artificial Intelligence of Things), where AI empowers IoT devices with intelligent processing and autonomous decision-making capabilities. While AIoT systems drive innovation across industries such as healthcare, smart homes, and autonomous vehicles, they face significant challenges related to computational resources, energy efficiency, and latency. To address these limitations, offloading AI workloads to the cloud has been a common solution, although it introduces concerns such as network latency, data privacy risks, and energy consumption. This research investigates whether running a reduced AI model locally on resource-constrained IoT devices can achieve performance comparable to cloud-based AI offloading. Specifically, the study aims to determine the minimum memory threshold at which local execution matches the speed and accuracy of cloud inference. Machine learning models such as MobileNetV2, EfficientNetV2B2, and DenseNet121 were employed for image classification across different datasets, using various quantization techniques, including dynamic range, full integer, and INT16 quantization. The results show that local inference with smaller models, such as MobileNet, generally outperforms cloud inference in latency with only minor differences in accuracy. For larger models, such as DenseNet, compressed versions achieved similar speeds and accuracy whether run locally or in the cloud, though the uncompressed base model performed better in the cloud. The study also highlights the varying effects of compression techniques: Dynamic Range Quantization produced the smallest model size but significantly impacted MobileNet's latency. In conclusion, as model size and complexity grow, cloud inference may become more advantageous, and thorough validation is essential before deploying compressed models in real-world applications.
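The dynamic range quantization mentioned above can be illustrated with a minimal sketch of per-tensor INT8 weight quantization, the core idea behind that technique: float weights are mapped to 8-bit integers with a single scale factor, trading a small loss in precision for a roughly 4x reduction in weight storage. The function names here (`quantize_int8`, `dequantize`) are illustrative, not a real TensorFlow Lite API.

```python
def quantize_int8(weights):
    """Map float weights to int8 [-127, 127] using one per-tensor scale.

    This is the symmetric scheme used for weights in dynamic range
    quantization: scale = max(|w|) / 127, then round each weight.
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight lies within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```

In a real deployment the quantized weights are stored on the device and dequantized (or used directly in integer kernels) at inference time, which is why model size shrinks while accuracy changes only slightly, as the study's results reflect.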