GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA
Regression analysis is used to learn the relationship between predictor and response variables. Gaussian process regression is a non-parametric form of regression analysis based on the Bayesian method. Gaussian process regression is widely known to have a grea...
Main Author: | Kevin |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/38842 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:38842 |
---|---|
spelling |
id-itb.:38842 2019-06-18T15:01:26Z GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA Kevin Indonesia Final Project Regression, Gaussian Process, Stochastic Variational Inference, Variational Learning, Stochastic Optimization. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/38842 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Regression analysis is used to learn the relationship between predictor and
response variables. Gaussian process regression is a non-parametric form of
regression analysis based on the Bayesian method, and it is widely known to
perform well in capturing the relationship between predictors and responses.
However, the major pain point of the full Gaussian process is its O(n^3)
computational complexity, where n is the number of training points. One way to
reduce this complexity is a variational approach, called VAR-SPGP regression,
which has O(nm^2) computational complexity, where m < n. However, this is still
not enough to handle big data. A further development of VAR-SPGP regression is
SVI-GP regression, which reduces the computational complexity to O(m^3). In this
paper, full Gaussian process, VAR-SPGP, and SVI-GP regression are explained in
sequence. The effect of each covariance function's hyperparameters is then
discussed, although the implementation uses only the exponential quadratic
kernel. The implementation of gradient-based optimization, as well as stochastic
optimization (Adadelta) for SVI-GP, is also discussed. In the simulation
results, VAR-SPGP and SVI-GP perform well and succeed in reducing computational
time and memory. Next, to validate the model performance and the approach used,
the model is tested on the California Housing (1990) and Beijing Housing (2011
to 2017) data sets. The results show that SVI-GP regression approximates full
Gaussian process regression quite well on these problems. |
format |
Final Project |
author |
Kevin |
title |
GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA |
url |
https://digilib.itb.ac.id/gdl/view/38842 |
_version_ |
1821997616292954112 |
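
The description above names the full Gaussian process with an exponential quadratic kernel as the O(n^3) baseline. The following is a minimal sketch of that baseline, not the author's implementation: the function names `exp_quad_kernel` and `gp_predict`, the zero prior mean, the hyperparameter defaults, and the synthetic sine data are all assumptions made for illustration. The Cholesky factorization of the n x n covariance matrix is the step responsible for the cubic cost.

```python
import numpy as np


def exp_quad_kernel(xa, xb, variance=1.0, lengthscale=1.0):
    """Exponential quadratic covariance between two sets of 1-D inputs."""
    sq_dist = (xa[:, None] - xb[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-2, **kern):
    """Posterior mean and latent variance of a zero-mean full GP (assumed setup)."""
    n = x_train.shape[0]
    k_nn = exp_quad_kernel(x_train, x_train, **kern) + noise_var * np.eye(n)
    k_sn = exp_quad_kernel(x_test, x_train, **kern)

    # Cholesky factorization of the n x n matrix: the O(n^3) bottleneck.
    chol = np.linalg.cholesky(k_nn)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))

    mean = k_sn @ alpha
    v = np.linalg.solve(chol, k_sn.T)
    prior_var = exp_quad_kernel(x_test, x_test, **kern).diagonal()
    var = prior_var - np.sum(v ** 2, axis=0)
    return mean, var


# Synthetic data (hypothetical): noisy samples of sin(x).
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 30)
y_train = np.sin(x_train) + 0.1 * rng.standard_normal(30)
x_test = np.array([1.0, 2.5, 4.0])

mean, var = gp_predict(x_train, y_train, x_test)
print(mean, var)
```

Sparse methods such as VAR-SPGP and SVI-GP avoid factorizing the full n x n matrix by working with m < n inducing points, which is where the O(nm^2) and O(m^3) costs quoted in the description come from.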