Model-building with multiply imputed data
Model selection is well known to introduce additional uncertainty, which can be more severe in the presence of missing data. Model averaging is an alternative to model selection that is intended to overcome the underestimation of standard errors that results from model selection...
Main Authors: | Pillay, Khuneswari Gopal; H. McColl, John |
---|---|
Format: | Book Section |
Language: | English |
Published: | Penerbit UTHM 2018 |
Subjects: | Q350-390 Information theory |
Online Access: | http://eprints.uthm.edu.my/6943/1/C1549_0b9151676729c6eba06fc2274989e56c.pdf http://eprints.uthm.edu.my/6943/ |
Institution: | Universiti Tun Hussein Onn Malaysia |
id |
my.uthm.eprints.6943 |
---|---|
record_format |
eprints |
spelling |
my.uthm.eprints.6943 2022-04-17T07:10:58Z http://eprints.uthm.edu.my/6943/ Pillay, Khuneswari Gopal and H. McColl, John (2018) Model-building with multiply imputed data. In: A Letter on Applications of Mathematics and Statistics. Penerbit UTHM, pp. 29-51. ISBN 978-967-2216-06-3. Book Section, PeerReviewed, text, en. |
institution |
Universiti Tun Hussein Onn Malaysia |
building |
UTHM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Tun Hussein Onn Malaysia |
content_source |
UTHM Institutional Repository |
url_provider |
http://eprints.uthm.edu.my/ |
language |
English |
topic |
Q350-390 Information theory |
description |
Model selection is well known to introduce additional
uncertainty, which can be more severe in the presence of missing
data. Model averaging is an alternative to model selection that is
intended to overcome the underestimation of standard errors that
results from model selection. Model selection and model
averaging were assessed on multiply-imputed data sets in terms of
model selection and prediction. Three model selection
approaches (RR, STACK and M-STACK) and model averaging,
each used to combine results from multiply-imputed data sets
under three model-building strategies (non-overlapping variable
sets, inclusive and restrictive), were compared in a basic Monte
Carlo simulation study on linear and generalized linear models. The
results showed that the STACK method performs better than RR
and M-STACK in terms of model selection and prediction, whereas
model averaging performs slightly better than STACK in terms of
prediction. The inclusive and restrictive strategies perform better in
terms of prediction, but the non-overlapping variable sets strategy
performs better for model selection. In conclusion, when analysing
data with missing values, researchers should use STACK (with
non-overlapping variable sets) to determine which variables to
include, but model averaging (with a restrictive strategy) to make
predictions. |
format |
Book Section |
author |
Pillay, Khuneswari Gopal; H. McColl, John |
title |
Model-building with multiply imputed data |
publisher |
Penerbit UTHM |
publishDate |
2018 |
url |
http://eprints.uthm.edu.my/6943/1/C1549_0b9151676729c6eba06fc2274989e56c.pdf http://eprints.uthm.edu.my/6943/ |
_version_ |
1738581555301842944 |
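
The description above refers to combining results from multiply-imputed data sets, but the record does not expand the label RR. Assuming RR refers to Rubin's rules, the standard way of pooling an estimate across imputed data sets, the following minimal Python sketch shows how a single coefficient might be pooled. The function name `pool_rubin` and the example numbers are illustrative assumptions and are not taken from the chapter.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: point estimates of the parameter, one per imputed data set
    variances: squared standard errors of those estimates, in the same order
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)      # pooled point estimate
    u_bar = mean(variances)      # average within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total, sqrt(total)


# Hypothetical coefficient estimates from m = 3 imputed data sets.
q_bar, total, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(q_bar, total, se)
```

The pooled standard error includes the between-imputation term, so it is larger than the average within-imputation standard error whenever the estimates differ across imputed data sets. This sketch covers only the pooling step; it does not implement the STACK, M-STACK or model averaging procedures compared in the chapter.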