PARAMETRIC GAUSSIAN PROCESS REGRESSION FOR BIG DATA
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/81467 |
Institution: | Institut Teknologi Bandung |
Summary: | The rapid growth in data volume poses a challenge for mathematical models,
which become increasingly complex as the data grows. This study proposes
parametric Gaussian process regression as a solution: inducing points, taken as the centroids
of data clusters, stand in for the full data set and reduce the amount of data the model
must process without sacrificing performance. The study aims to explore the use of
centroids as representations of the observational data, validate the model through
simulation, and apply it to synthetic and real data, specifically the average temperature
from 12 weather stations on Java Island. The results indicate that the proposed model
outperforms standard resampling methods by up to 30% in consistency and accuracy,
as measured by the standardized mean squared error (SMSE), and that it captures external
factors through kernel combinations. In conclusion, using inducing points turns
Gaussian process regression into a parametric model whose computational cost remains
stable as the observational data grows. Further exploration with other kernels and other
observational data is recommended. |
---|
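The idea summarized above, replacing the full data set with cluster centroids that act as inducing points and scoring predictions with SMSE, can be illustrated with a minimal sketch. This is not the thesis's implementation: the subset-of-regressors predictive mean is used here as a stand-in approximation, the k-means routine is a plain Lloyd iteration, the kernel hyperparameters are fixed by hand rather than learned, and the data is a synthetic sine curve. All function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def kmeans_centroids(x, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def inducing_point_mean(x_train, y_train, z, x_test, noise=0.1, jitter=1e-6):
    """Subset-of-regressors predictive mean with inducing inputs z.

    The linear system is only k x k, so its size does not grow with the
    number of training points (assumed approximation, not the thesis's).
    """
    k_uu = rbf_kernel(z, z)
    k_uf = rbf_kernel(z, x_train)
    k_su = rbf_kernel(x_test, z)
    a = noise**2 * k_uu + k_uf @ k_uf.T + jitter * np.eye(len(z))
    return k_su @ np.linalg.solve(a, k_uf @ y_train)

def smse(y_true, y_pred):
    """Mean squared error divided by the variance of the true targets."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

# Synthetic data (assumed for illustration): noisy sine on [0, 10].
rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 10.0, size=(2000, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(2000)
x_test = rng.uniform(0.0, 10.0, size=(200, 1))
y_test = np.sin(x_test[:, 0]) + 0.1 * rng.standard_normal(200)

z = kmeans_centroids(x_train, k=15)  # 15 centroids stand in for 2000 points
y_pred = inducing_point_mean(x_train, y_train, z, x_test)
score = smse(y_test, y_pred)
print(f"SMSE with {len(z)} inducing points: {score:.4f}")
```

An SMSE near 0 means the predictions explain most of the variance in the test targets, while a value near 1 is no better than predicting the mean. In this sketch the number of centroids `k` sets the size of the linear system, which is why the cost of the solve stays fixed as more training points are added.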