Optimization for efficient data communication in distributed machine training system

Bibliographic Details
Main Author: Gan, Hsien Yan
Other Authors: Wen Yonggang
Format: Final Year Project
Language: English
Published: 2017
Online Access:http://hdl.handle.net/10356/72923
Institution: Nanyang Technological University
Description
Summary: The rising popularity of deep learning has caused the complexity and scale of machine learning models to grow exponentially, but this growth is constrained by hardware processing speed. To address the issue, several machine learning frameworks support distributed training across multiple nodes. Compared with interprocess communication, data exchange between nodes is slow, with high latency and high overhead. When a network link is shared among multiple nodes, the limited bandwidth becomes an even greater bottleneck. This project minimizes the data flow between nodes by adding a data filter and Snappy compression: the filter removes unnecessary data transfers, while Snappy compresses the remaining data to reduce bandwidth consumption. This implementation reduces inter-node data flow to 8 percent of its original volume and training time to 76 percent. Because of the low bandwidth requirement, distributing the system across different geographical areas and onto hardware such as a mobile laptop becomes feasible.
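As a rough illustration of the filter-then-compress approach the summary describes, the sketch below drops near-zero entries from a parameter update and Snappy-compresses the survivors before they would be sent to another node. The threshold value, the use of the python-snappy and NumPy libraries, and the helper names are assumptions for illustration; the record does not detail the project's actual filter or its framework integration.

```python
# Minimal sketch of filter-then-compress for inter-node gradient
# exchange, assuming python-snappy and NumPy are available.
# THRESHOLD, dtypes, and function names are illustrative only.
import numpy as np
import snappy

THRESHOLD = 1e-3  # hypothetical cutoff for "unnecessary" updates

def pack_update(grad: np.ndarray) -> bytes:
    """Drop near-zero entries, then Snappy-compress the rest."""
    idx = np.nonzero(np.abs(grad) > THRESHOLD)[0]             # filter step
    payload = np.concatenate([idx.astype(np.float64),          # indices
                              grad[idx].astype(np.float64)])   # values
    return snappy.compress(payload.tobytes())                  # compress step

def unpack_update(blob: bytes, size: int) -> np.ndarray:
    """Inverse: decompress and scatter values back into a dense vector."""
    payload = np.frombuffer(snappy.decompress(blob), dtype=np.float64)
    half = payload.size // 2
    idx, vals = payload[:half].astype(np.int64), payload[half:]
    grad = np.zeros(size)
    grad[idx] = vals
    return grad

# Usage: a mostly-zero update shrinks far below its dense wire size.
g = np.zeros(10_000)
g[::500] = 0.5
blob = pack_update(g)
print(len(blob), "bytes on the wire vs", g.nbytes, "dense")
assert np.allclose(unpack_update(blob, g.size), g)
```

Filtering before compressing is what makes the two steps compound: the sparse, repetitive payload left after filtering is exactly the kind of input a fast byte-level compressor like Snappy handles well.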