GAUSSIAN PROCESS REGRESSION MODEL FOR BIG DATA

Bibliographic Details
Main Author: Kevin
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/38842
Institution: Institut Teknologi Bandung
Description
Summary: Regression analysis is used to learn the relationship between predictor variables and response variables. Gaussian process regression is a non-parametric regression method based on the Bayesian approach, and it is widely known to perform well in capturing the relationship between predictors and responses. However, the major drawback of the Gaussian process is its computational complexity of O(n^3), where n is the number of data points. One way to reduce this complexity is the variational approach, known as VAR-SPGP regression, which has O(nm^2) computational complexity where m < n. However, this is still not enough to handle large data sets. A further development of VAR-SPGP regression is SVI-GP regression, which reduces the computational complexity to O(m^3). In this paper, full Gaussian process, VAR-SPGP, and SVI-GP regression are explained in sequence. The effect of each covariance function's hyperparameters is then discussed, although the implementation uses only the exponential quadratic kernel. The implementation of gradient-based optimization, as well as stochastic optimization (Adadelta) for SVI-GP, is also discussed. In the simulation results, VAR-SPGP and SVI-GP perform well and succeed in reducing computation time and memory use. To validate the model performance and the approach used, the models are then tested on the California Housing (1990) and Beijing Housing (2011 to 2017) data sets. The results show that SVI-GP regression approximates full Gaussian process regression quite well on these problems.
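
To make the O(n^3) cost mentioned in the summary concrete, the following is a minimal sketch of full Gaussian process regression with an exponential quadratic kernel, written with NumPy. It is not the author's implementation: the function names (exp_quad_kernel, gp_predict), the hyperparameter values, and the synthetic sine data are all assumptions chosen for illustration. The Cholesky factorization of the n-by-n covariance matrix is the step that scales cubically with n.

```python
import numpy as np


def exp_quad_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Exponential quadratic (RBF) kernel between the rows of a and b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, lengthscale=1.0, variance=1.0):
    """Posterior mean and latent variance of a zero-mean full GP (illustrative)."""
    K = exp_quad_kernel(x_train, x_train, lengthscale, variance)
    K[np.diag_indices_from(K)] += noise  # add observation noise variance
    L = np.linalg.cholesky(K)            # the O(n^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    K_s = exp_quad_kernel(x_train, x_test, lengthscale, variance)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = variance - np.sum(v**2, axis=0)  # prior variance minus explained part
    return mean, var


# Synthetic data (an assumption, not one of the data sets named in the summary).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(50, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(50)
x_test = np.linspace(-3, 3, 20)[:, None]

mean, var = gp_predict(x_train, y_train, x_test)
```

Sparse approaches such as VAR-SPGP and SVI-GP avoid factorizing the full n-by-n matrix by working with a smaller m-by-m matrix instead, which is where the reduced complexities quoted in the summary come from.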