Distributed machine learning on IaaS clouds
Main Authors: | , |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2018 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/4832 |
Institution: | Singapore Management University |
Summary: | Training complex machine learning (ML) models with large datasets requires powerful computing infrastructure, which is costly to acquire and maintain. As a result, ML researchers turn to the cloud for on-demand and elastic resource provisioning. Two issues have arisen from this trend: 1) if not configured properly, training ML models on the cloud can incur significant cost and time, and 2) many ML researchers focus on model and algorithm development, so they may not have the time or skills to deal with system setup, resource selection and configuration. In this work, we propose and implement FC²: a web service for fast, convenient and cost-effective distributed ML model training over public cloud resources. Central to the effectiveness of FC² is its ability to recommend an appropriate resource configuration, in terms of cost and execution time, for a given ML training task. Extensive experiments with real-world deep neural network models and datasets demonstrate the effectiveness of our solution. |