Machine learning for industrial IOT
Within trusted silos, data sharing may be permitted. The trade-off between running FedAvg and data sharing is largely unexplored in various contexts. This paper’s goal is to perform an exhaustive search to explore the best option in improving communication efficiency by maximizing inference accurac...
Main Author: | Yeow, Brandon Wei Liang |
---|---|
Other Authors: | Anupam Chattopadhyay |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
Online Access: | https://hdl.handle.net/10356/166064 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-166064 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-166064 2023-04-21T15:39:06Z Machine learning for industrial IOT Yeow, Brandon Wei Liang Anupam Chattopadhyay School of Computer Science and Engineering anupam@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence Bachelor of Engineering (Computer Science) 2023-04-19T04:47:26Z 2023 Final Year Project (FYP) Yeow, B. W. L. (2023). Machine learning for industrial IOT. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166064 en SCSE22-0023 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence |
description |
Within trusted silos, data sharing may be permitted. The trade-off between running FedAvg and sharing data is largely unexplored across contexts. This paper performs an exhaustive search for the option that best improves communication efficiency, that is, maximizes inference accuracy while reducing communication costs. Various data set distributions across clients are induced through data augmentation techniques and sharding by labels. The contexts tested include client skew, client count, i.i.d. clients, pathological non-i.i.d. clients, non-pathological non-i.i.d. clients, various stages of training through pre-trained backbones, network size through varied backbones, and synthetic data generation. We conclude that running successive rounds of FedAvg is key, but that sharing data yields higher accuracy at each epoch in almost all contexts, at the cost of higher bandwidth use and longer local training times. Quantity skew impacts models only if there is too little or too much data. This paper also explores data-level privacy techniques using generative models such as Variational Autoencoders. The visual quality of generated images has little impact on the increase in accuracy. Synthetic data, whether regenerated or sampled, shows a significant improvement over simply sharing data. The insights are consolidated in a table that prescribes the best decision to take in various scenarios. This paper concludes by proposing an algorithm for peer-to-peer Federated Learning in which clients search for peers up to the th degree and perform the best actions with those peers under bandwidth constraints. The algorithm preferentially runs FedAvg to reduce bandwidth cost unless the peer has historically not provided significant improvement. Data is shared when there is little improvement left for FedAvg to achieve and sufficient bandwidth is available. |
author2 |
Anupam Chattopadhyay |
author_facet |
Anupam Chattopadhyay Yeow, Brandon Wei Liang |
format |
Final Year Project |
author |
Yeow, Brandon Wei Liang |
title |
Machine learning for industrial IOT |
publisher |
Nanyang Technological University |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/166064 |
_version_ |
1764208173566132224 |
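The description above names two mechanisms that a short sketch may make more concrete: the FedAvg aggregation step and the per-peer choice between running FedAvg and sharing data. The Python below is only an illustration under stated assumptions, not the project's implementation. `fedavg` is the standard sample-size-weighted parameter average; `choose_action`, its `min_gain` threshold, the three-round history window, and the `"skip"` fallback are hypothetical names and values introduced here for illustration.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[List[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


def choose_action(past_fedavg_gains: Sequence[float],
                  remaining_bandwidth: float,
                  data_share_cost: float,
                  min_gain: float = 0.01) -> str:
    """Pick an action for one peer: 'fedavg', 'share_data', or 'skip'.

    Assumption: 'significant improvement' is modelled as the mean accuracy
    gain over the last three FedAvg rounds with this peer being at least
    min_gain. The threshold and window are hypothetical.
    """
    # No history yet: prefer FedAvg, the cheaper option in bandwidth.
    if not past_fedavg_gains:
        return "fedavg"
    recent = list(past_fedavg_gains)[-3:]
    if sum(recent) / len(recent) >= min_gain:
        return "fedavg"
    # FedAvg has stalled: share data only if the bandwidth budget allows it.
    if remaining_bandwidth >= data_share_cost:
        return "share_data"
    # Hypothetical fallback when neither option looks worthwhile.
    return "skip"
```

For example, `fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])` weights the second client three times as heavily as the first, and `choose_action([0.001, 0.002, 0.0], 10, 5)` falls through to data sharing because the recent FedAvg gains are below the assumed threshold and the bandwidth budget covers the transfer.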