Model-building with multiply imputed data

Model selection is well known to introduce additional uncertainty, which can be more severe in the presence of missing data. Model averaging is an alternative to model selection that is intended to overcome the under-estimation of standard errors that results from model selection....

Bibliographic Details
Main Authors:	Pillay, Khuneswari Gopal; H. McColl, John
Format:	Book Section
Language:	English
Published:	Penerbit UTHM 2018
Subjects:	Q350-390 Information theory
Online Access:	http://eprints.uthm.edu.my/6943/1/C1549_0b9151676729c6eba06fc2274989e56c.pdf http://eprints.uthm.edu.my/6943/
Institution:	Universiti Tun Hussein Onn Malaysia

id	my.uthm.eprints.6943
record_format	eprints
spelling	Pillay, Khuneswari Gopal and H. McColl, John (2018) Model-building with multiply imputed data. In: A Letter on Applications of Mathematics and Statistics. Penerbit UTHM, pp. 29-51. ISBN 978-967-2216-06-3. PeerReviewed. Record timestamp: 2022-04-17T07:10:58Z
institution	Universiti Tun Hussein Onn Malaysia
building	UTHM Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Tun Hussein Onn Malaysia
content_source	UTHM Institutional Repository
url_provider	http://eprints.uthm.edu.my/
language	English
topic	Q350-390 Information theory
description	Model selection is well known to introduce additional uncertainty, which can be more severe in the presence of missing data. Model averaging is an alternative to model selection that is intended to overcome the under-estimation of standard errors that results from model selection. Model selection and model averaging were evaluated on multiply-imputed data sets in terms of model selection and prediction. Three model selection approaches (RR, STACK and M-STACK), and model averaging under three model-building strategies (non-overlapping variable sets, inclusive and restrictive), were used to combine results from multiply-imputed data sets in a basic Monte Carlo simulation study on linear and generalized linear models. The results showed that the STACK method performs better than RR and M-STACK in terms of model selection and prediction, whereas model averaging performs slightly better than STACK in terms of prediction. The inclusive and restrictive strategies perform better for prediction, but non-overlapping variable sets perform better for model selection. In conclusion, researchers analysing data with missing values should use STACK (with non-overlapping variable sets) to determine which variables to include when making predictions, but use model averaging (with a restrictive strategy) for the prediction itself.
format	Book Section
author	Pillay, Khuneswari Gopal; H. McColl, John
author_sort	Pillay, Khuneswari Gopal
title	Model-building with multiply imputed data
title_sort	model-building with multiply imputed data
publisher	Penerbit UTHM
publishDate	2018
url	http://eprints.uthm.edu.my/6943/1/C1549_0b9151676729c6eba06fc2274989e56c.pdf http://eprints.uthm.edu.my/6943/

Similar Items