Robust Random Regression Imputation method for missing data in the presence of outliers
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2013 |
Online Access: | http://psasir.upm.edu.my/id/eprint/49818/1/FS%202013%2042RR.pdf http://psasir.upm.edu.my/id/eprint/49818/ |
Institution: | Universiti Putra Malaysia |
Summary: | The Ordinary Least Squares (OLS) estimator is the best regression estimator when all of its assumptions are met. However, missing data and outliers can distort OLS estimation and increase the variability of the parameter estimates. The main focus of this research is to provide remedial measures for missing data in regression in the presence of outliers. In regression analysis, the dependent variable (Y) is modelled as a function of the independent variable (X), so outliers and missing values can occur in both the X and Y directions. When values are missing in the Y direction, it is very common to use OLS-based Random Regression Imputation (RRI). RRI appears to work well when the data contain no outliers, but it performs poorly when outliers are present, because it relies on OLS, which is strongly affected by outliers. We therefore modify the OLS-based RRI by incorporating the robust MM-estimator, which is less affected by outliers, to obtain the Robust Random Regression Imputation (RRRI) method. The proposed method is compared with some well-known methods of estimating missing data, and the results indicate that RRRI outperforms the existing methods in the presence of outliers. Since outliers and missing data can occur in both directions, we also consider the situation in which observations are missing in the explanatory variable X. Here the Dummy Variable (DV) approach is one of the best approaches for handling the missing values, but it too performs poorly in the presence of outliers. As an alternative, a Robust Inverse Regression Technique is proposed to obtain better estimates. Real-data examples and Monte Carlo simulation studies show that the proposed robust methods perform better than the classical methods. |
---|
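The idea of random regression imputation for missing Y values, as described in the summary, can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a single predictor, marks missing responses with `None`, and draws the random residual from a normal distribution. The robust variant uses a Huber M-estimate fitted by iteratively reweighted least squares as a simpler stand-in for the MM-estimator named in the summary. The function names `fit_line`, `fit_huber` and `impute_y` and the synthetic data are hypothetical and chosen only for illustration.

```python
import random
import statistics


def fit_line(x, y, w=None):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    w = list(w) if w is not None else [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def fit_huber(x, y, c=1.345, iters=50):
    """Huber M-estimate by iteratively reweighted least squares.

    Assumption: used here as a stand-in for the MM-estimator.
    Returns (a, b, scale); scale is the MAD-based residual scale
    computed in the last iteration.
    """
    a, b = fit_line(x, y)  # start from the OLS fit
    scale = 0.0
    for _ in range(iters):
        res = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        med = statistics.median(res)
        mad = statistics.median([abs(r - med) for r in res])
        scale = max(1.4826 * mad, 1e-12)  # guard against a zero scale
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * scale else c * scale / abs(r) for r in res]
        a, b = fit_line(x, y, w)
    return a, b, scale


def impute_y(x, y, robust=False, seed=0):
    """Replace None entries of y with fitted value + random normal residual."""
    rng = random.Random(seed)
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    xo = [p[0] for p in obs]
    yo = [p[1] for p in obs]
    if robust:
        a, b, sigma = fit_huber(xo, yo)
    else:
        a, b = fit_line(xo, yo)
        res = [yi - (a + b * xi) for xi, yi in obs]
        sigma = (sum(r * r for r in res) / (len(res) - 2)) ** 0.5
    return [yi if yi is not None else a + b * xi + rng.gauss(0.0, sigma)
            for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Synthetic data, made up for illustration: y = 2 + 3x + noise,
    # with four contaminated responses and three missing responses.
    rng = random.Random(1)
    x = [0.25 * i for i in range(40)]
    y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.5) for xi in x]
    for i in range(36, 40):
        y[i] += 40.0
    for i in (5, 20, 30):
        y[i] = None
    print("OLS slope:   ", fit_line(*zip(*[(a, b) for a, b in zip(x, y) if b is not None]))[1])
    print("robust slope:", fit_huber(*zip(*[(a, b) for a, b in zip(x, y) if b is not None]))[1])
    print("imputed (OLS-based):", [impute_y(x, y)[i] for i in (5, 20, 30)])
    print("imputed (robust):   ", [impute_y(x, y, robust=True)[i] for i in (5, 20, 30)])
```

In this sketch the outliers inflate both the OLS slope and the OLS residual standard deviation, so the OLS-based imputations are drawn around a distorted line with excessive noise. The robust fit downweights the contaminated points and uses a MAD-based scale for the random residual, which is one plausible design choice rather than the one documented in the thesis.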