PARAMETRIC GAUSSIAN PROCESS REGRESSION FOR BIG DATA


Bibliographic Details
Main Author: Rizky Kosasih, Kahfi
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/81467
Institution: Institut Teknologi Bandung
Description
Summary: The rapid growth in data volume presents challenges for mathematical models, which become increasingly complex as the data grows. This study proposes parametric Gaussian process regression as a solution, using inducing points, namely the centroids of data clusters, to reduce the amount of data without sacrificing performance. The study aims to explore the potential of centroids as representations of the observed data, validate the model through simulation, and apply it to synthetic and real data, specifically the average temperature from 12 weather stations on Java Island. The results indicate that the proposed model outperforms standard resampling methods by up to 30% in consistency and accuracy, as measured by standardized mean squared error (SMSE), and effectively captures external factors through kernel combinations. In conclusion, employing inducing points turns Gaussian process regression into a parametric model whose computational cost remains stable as the observational data grows. Further exploration with other kernels and other observational data is recommended.
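The idea in the summary can be illustrated with a minimal sketch. The record does not state the exact formulation used in the study, so everything below is an assumption made for illustration: the subset-of-regressors approximation, the squared-exponential kernel, plain k-means for the centroids, the synthetic sine data, and all function names and hyperparameter values. SMSE is computed as the mean squared error divided by the variance of the test targets.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between a (n, d) and b (m, d)."""
    sq_dist = (np.sum(a**2, axis=1)[:, None]
               + np.sum(b**2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def kmeans_centroids(x, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns k centroids of x (n, d)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def predict_inducing_gp(x, y, x_new, z, noise_var=0.01, **kern):
    """Predictive mean of a subset-of-regressors GP with inducing inputs z.

    Only an (m, m) system is solved, where m = len(z), so the cost grows
    linearly with the number of observations n and cubically only with m.
    """
    k_zz = rbf_kernel(z, z, **kern)
    k_zx = rbf_kernel(z, x, **kern)
    # Small jitter on the diagonal for numerical stability.
    a = noise_var * k_zz + k_zx @ k_zx.T + 1e-6 * np.eye(len(z))
    weights = np.linalg.solve(a, k_zx @ y)
    return rbf_kernel(x_new, z, **kern) @ weights

def smse(y_true, y_pred):
    """Standardized mean squared error: MSE divided by the target variance."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

# Synthetic example (hypothetical data, not the temperature data of the study).
rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 10.0, size=(2000, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(2000)
x_test = rng.uniform(0.0, 10.0, size=(500, 1))
y_test = np.sin(x_test[:, 0]) + 0.1 * rng.standard_normal(500)

z = kmeans_centroids(x_train, k=15)  # 15 centroids stand in for 2000 points
y_pred = predict_inducing_gp(x_train, y_train, x_test, z, lengthscale=1.0)
print("SMSE:", smse(y_test, y_pred))
```

In this sketch the number of centroids is fixed in advance, so the size of the linear system does not change as more observations arrive. That is one way to read the summary's statement that the model becomes parametric.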