GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA

Regression analysis is used to learn the relationship between predictor and response variables. Gaussian process regression is a Bayesian, non-parametric approach to regression analysis. Gaussian process regression is widely known to have a grea...

Bibliographic Details
Main Author: Kevin
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/38842
Institution: Institut Teknologi Bandung
id id-itb.:38842
spelling id-itb.:38842 2019-06-18T15:01:26Z GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA Kevin Indonesia Final Project Regression, Gaussian Process, Stochastic Variational Inference, Variational Learning, Stochastic Optimization. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/38842 text
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Regression analysis is used to learn the relationship between predictor and response variables. Gaussian process regression is a Bayesian, non-parametric approach to regression analysis, and it is widely known to perform well in capturing the relationship between predictors and responses. However, the major drawback of Gaussian process regression is its O(n^3) computational complexity for n training points. One way to reduce this complexity is the variational approach, known as VAR-SPGP regression, which has O(nm^2) computational complexity with m < n inducing points. This is still not enough to handle big data. SVI-GP regression, a further development of VAR-SPGP regression, reduces the computational complexity to O(m^3). In this paper, full Gaussian process, VAR-SPGP, and SVI-GP regression are explained in sequence. The effect of each covariance function’s hyperparameters is also discussed, although the implementation uses only the exponentiated quadratic kernel. The implementation of gradient-based optimization, as well as stochastic optimization (Adadelta) for SVI-GP, is also discussed. In the simulation results, VAR-SPGP and SVI-GP perform well and succeed in reducing computation time and memory use. To validate the model performance and the approach used, the model is then tested on the California Housing (1990) and Beijing Housing (2011 to 2017) datasets. The results show that SVI-GP regression approximates full Gaussian process regression quite well on these problems.
format Final Project
author Kevin
title GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA
url https://digilib.itb.ac.id/gdl/view/38842
_version_ 1821997616292954112
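
As a rough illustration of the O(n^3) cost mentioned in the description, the following is a minimal sketch of exact (full) Gaussian process regression with an exponentiated quadratic kernel, assuming NumPy. It is not taken from the thesis: the function names (exp_quad_kernel, gp_predict), the hyperparameter values, and the synthetic sine data are all assumptions chosen for illustration.

```python
import numpy as np


def exp_quad_kernel(xa, xb, variance=1.0, lengthscale=1.0):
    """Exponentiated quadratic kernel:
    k(x, x') = variance * exp(-|x - x'|^2 / (2 * lengthscale^2))."""
    sq_dist = (
        np.sum(xa**2, axis=1)[:, None]
        + np.sum(xb**2, axis=1)[None, :]
        - 2.0 * xa @ xb.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predict(x_train, y_train, x_test, noise=0.1, variance=1.0, lengthscale=1.0):
    """Posterior mean and latent variance of an exact GP at x_test.

    Hyperparameters are fixed here (an assumption); in practice they
    would be optimised, e.g. by maximising the marginal likelihood.
    """
    n = x_train.shape[0]
    k_nn = exp_quad_kernel(x_train, x_train, variance, lengthscale)
    k_sn = exp_quad_kernel(x_test, x_train, variance, lengthscale)
    # Cholesky factorisation of the n x n matrix: this is the O(n^3) step.
    chol = np.linalg.cholesky(k_nn + noise**2 * np.eye(n))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_sn @ alpha
    v = np.linalg.solve(chol, k_sn.T)
    var = variance - np.sum(v**2, axis=0)
    return mean, var


# Synthetic data (hypothetical): noisy observations of sin(x).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(50, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(50)
x_test = np.linspace(-3, 3, 20)[:, None]

mean, var = gp_predict(x_train, y_train, x_test)
```

The Cholesky factorisation of the n x n kernel matrix is the step that sparse approximations such as VAR-SPGP and SVI-GP are designed to avoid, by working with m x m matrices instead.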