Comparison between AI local execution and cloud offloading for AIoT
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/181127
Institution: Nanyang Technological University
Summary: The rapid evolution of Artificial Intelligence (AI) and the Internet of Things (IoT) has led to the development of AIoT (Artificial Intelligence of Things), where AI empowers IoT devices with intelligent processing and autonomous decision-making capabilities. While AIoT systems drive innovation across industries such as healthcare, smart homes, and autonomous vehicles, they face significant challenges related to computational resources, energy efficiency, and latency. To address these limitations, offloading AI workloads to the cloud has been a common solution, although it introduces concerns such as network latency, data privacy risks, and energy consumption.

This research investigates whether running a reduced AI model locally on resource-constrained IoT devices can achieve performance comparable to cloud-based AI offloading. Specifically, the study aims to determine the minimum memory threshold required for local execution to match the speed and accuracy of cloud inference. Machine learning models such as MobileNetV2, EfficientNetV2B2, and DenseNet121 were employed for image classification across different datasets, using various quantization techniques, including dynamic range, full integer, and INT16 quantization.

The results show that local inference with smaller models, such as MobileNet, generally outperforms cloud inference in terms of latency, with only minor differences in accuracy. For larger models, such as DenseNet, compressed versions achieved similar speed and accuracy whether run locally or in the cloud, though the uncompressed base model performed better in the cloud. The study also highlights the varying effects of compression techniques, with Dynamic Range Quantization offering the smallest model size but significantly impacting latency on MobileNet. In conclusion, as model size and complexity grow, cloud inference may become more advantageous, and thorough validation is essential before deploying compressed models in real-world applications.
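As a rough illustration of the compression step the abstract describes, dynamic range quantization can be applied with the TensorFlow Lite converter as sketched below. This is not code from the project: the tiny stand-in model is an assumption for brevity (the study used MobileNetV2, EfficientNetV2B2, and DenseNet121), and full integer or INT16 quantization would require additional converter settings not shown here.

```python
import tensorflow as tf

# Small stand-in model (assumption; the project used MobileNetV2,
# EfficientNetV2B2, and DenseNet121 for image classification).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(512,)),
    tf.keras.layers.Dense(1000, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Baseline: plain float32 TFLite conversion.
float_converter = tf.lite.TFLiteConverter.from_keras_model(model)
float_tflite = float_converter.convert()

# Dynamic range quantization: weights are stored as 8-bit integers
# while activations remain float at inference time.
quant_converter = tf.lite.TFLiteConverter.from_keras_model(model)
quant_converter.optimizations = [tf.lite.Optimize.DEFAULT]
quant_tflite = quant_converter.convert()

# The quantized flatbuffer is substantially smaller than the float one.
print(len(float_tflite), len(quant_tflite))
```

As the abstract notes, the size reduction comes with a latency trade-off on some models, so measuring both accuracy and inference time on the target device remains essential before deployment.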