BAYESIAN LINEAR REGRESSION FOR PREDICTING ITB ALUMNI’S INCOME (CASE STUDY OF ITB TRACER STUDY 2021)
Main Author: | |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/65209 |
Institution: | Institut Teknologi Bandung |
Summary: | Every year, the Directorate of Student Affairs of ITB conducts a Tracer Study
survey of ITB graduates with the aim of providing recommendations to ITB for
improving its education system and quality. The survey data can be processed
using a predictive model with a machine learning approach, one application of
which is predicting ITB alumni’s salaries as a career-building milestone after
completing their education at ITB. This Final Project aims to build a Multiple
Linear Regression model to predict the salary of ITB graduates using selected
columns of the 2021 ITB Tracer Study data. The data used in this study consisted
of 1815 observations of fourteen predictor variables, with salary as the response
variable. To simplify the modeling process and to fulfill the model assumptions,
data preprocessing was conducted, which reduced the data to 1803 observations
consisting of eight predictor variables and one response variable. The Multiple
Linear Regression model was then solved using a Bayesian statistical approach to
approximate the posterior distribution of the model parameters. The
Metropolis-Hastings algorithm is used as the numerical method; it requires the
prior distribution of the parameters and the likelihood of the data as input.
The posterior distribution obtained has a mean that is quite close to the
parameter estimates obtained using the Ordinary Least Squares method, a
frequentist statistical approach. Samples can be drawn from the posterior
distribution to make predictions when a new data set is given. Prediction can be
repeated using different sets of parameters obtained from sampling, so that a
histogram of predicted values can be generated, which is more informative than
the single prediction given by the Ordinary Least Squares method. The closeness
of the prediction mean to the true value varies. |
---|
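
The workflow described in the summary (Metropolis-Hastings sampling of the regression parameters, comparison of the posterior mean with the Ordinary Least Squares estimate, and a histogram of predictions for a new observation) can be illustrated with a minimal sketch. This is not the project's own implementation. The data are synthetic stand-ins rather than the Tracer Study data, and the priors, the random-walk proposal, the step size, the burn-in length, and all names (`log_posterior`, `metropolis_hastings`, `x_new`) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the Tracer Study data): intercept + 2 predictors.
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=n)


def log_posterior(theta):
    """Unnormalised log posterior for theta = (beta_0, beta_1, beta_2, log sigma)."""
    beta, log_sigma = theta[:-1], theta[-1]
    resid = y - X @ beta
    # Gaussian likelihood of the data given beta and sigma.
    log_lik = -n * log_sigma - 0.5 * np.sum(resid**2) / np.exp(2.0 * log_sigma)
    # Assumed priors: beta_j ~ N(0, 10^2), log sigma ~ N(0, 2^2).
    log_prior = -0.5 * np.sum(beta**2) / 10.0**2 - 0.5 * log_sigma**2 / 2.0**2
    return log_lik + log_prior


def metropolis_hastings(n_iter=20000, step=0.03):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    theta = np.zeros(X.shape[1] + 1)
    current = log_posterior(theta)
    samples = np.empty((n_iter, theta.size))
    accepted = 0
    for i in range(n_iter):
        proposal = theta + rng.normal(scale=step, size=theta.size)
        candidate = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio), computed in log space.
        if np.log(rng.uniform()) < candidate - current:
            theta, current = proposal, candidate
            accepted += 1
        samples[i] = theta
    return samples, accepted / n_iter


samples, accept_rate = metropolis_hastings()
kept = samples[5000:]  # discard an assumed burn-in of 5000 iterations

# Frequentist reference: Ordinary Least Squares estimate of beta.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
post_mean_beta = kept[:, :-1].mean(axis=0)

# Posterior predictive draws for one hypothetical new observation:
# one prediction per sampled parameter set, plus observation noise.
x_new = np.array([1.0, 0.3, -1.2])
pred_draws = kept[:, :-1] @ x_new + rng.normal(size=len(kept)) * np.exp(kept[:, -1])
counts, bin_edges = np.histogram(pred_draws, bins=30)

print("acceptance rate:", round(accept_rate, 3))
print("posterior mean of beta:", post_mean_beta)
print("OLS estimate of beta:  ", beta_ols)
print("predictive mean and std:", pred_draws.mean(), pred_draws.std())
```

Sampling `log sigma` rather than `sigma` keeps the proposal symmetric without any positivity constraint. The arrays `counts` and `bin_edges` hold the histogram of predicted values, which shows the spread of plausible outcomes where Ordinary Least Squares would return a single number.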