Big data analytics
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/78353 |
Summary: | With the ease of access to connected devices and online services, data of a wide variety are constantly being collected by various service providers. These data can be used for trend-finding and the prediction of future values, outcomes that matter for optimizing a variety of industrial, commercial and even consumer processes. With the widespread availability of highly capable computing systems and programming tools, resource-intensive tasks like the implementation of predictive machine learning are now possible at low cost for a determined user. The objective of this project is to produce a machine learning-based process capable of predicting a numerical output from a set of mixed-type input data. This process is implemented with open-source programming tools. The project also seeks to estimate the relative importance of the different data features. In this project, we have developed a machine-learning process capable of predicting a numerical output with up to 0.77 explained variance. The process encompasses the entire data analysis procedure, from data importation through data pre-processing and hyperparameter optimization to prediction. The process developed shows that a functional and reasonably accurate data analysis model can be produced using open-source software. Using a variety of machine-learning algorithms, the project also shows the relative accuracy of the different models and the time each takes to produce a predicted output. |
---|
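
The summary reports model quality as "explained variance" (up to 0.77). As a reading aid, the sketch below computes that metric under its standard definition, 1 - Var(y - y_hat) / Var(y), using only the Python standard library. The function name `explained_variance` and the sample values are assumptions made for illustration; they are not taken from the project, and the record does not say which tool the author used to compute the score.

```python
from statistics import pvariance


def explained_variance(y_true, y_pred):
    """Return 1 - Var(y_true - y_pred) / Var(y_true).

    Illustrative helper (hypothetical name), not the project's own code.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Residuals are the per-sample prediction errors.
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    total_var = pvariance(y_true)
    if total_var == 0:
        raise ValueError("explained variance is undefined for constant targets")
    # Fraction of the target's variance that is not left in the residuals.
    return 1.0 - pvariance(residuals) / total_var


# Hypothetical values for illustration only (not the project's data).
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 6.5, 9.5]
score = explained_variance(y_true, y_pred)
```

A score of 1.0 means the residuals have no variance at all, and a score of 0.77 means about 77% of the variance in the target is accounted for by the predictions. Because the metric uses the variance of the residuals, a constant offset in the predictions does not lower it.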