Two-layer EM algorithm for ALD mixture regression models: A new solution to composite quantile regression

Bibliographic Details
Main Authors: Wang, Shangshan, Xiang, Liming
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2017
Subjects:
Online Access: https://hdl.handle.net/10356/83377
http://hdl.handle.net/10220/43534
Institution: Nanyang Technological University
Description
Summary: We advocate linear regression in which the error term is modeled by a finite mixture of asymmetric Laplace distributions (ALDs). The model extends the flexibility of linear regression to account for heterogeneity in the data, and it allows us to establish the equivalence between maximum likelihood estimation of the model parameters and the composite quantile regression (CQR) estimation developed by Zou and Yuan (Ann. Stat. 36:1108–1126, 2008), providing a new likelihood-based solution to CQR. In particular, we develop a computationally efficient estimation procedure via a two-layer EM algorithm: the first-layer EM algorithm incorporates the missing information from the component memberships of the mixture model, and it nests the second-layer EM in its M-step to accommodate the latent variables involved in the location-scale mixture representation of the ALD. An appealing feature of the proposed algorithm is that the parameter updates in each iteration are obtained explicitly in closed form, instead of resorting to numerical optimization as in existing work, which can significantly reduce computational complexity. We evaluate the performance of the method through simulation studies and illustrate its usefulness by analyzing a gene expression dataset.
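
The building blocks named in the summary can be illustrated with a minimal Python sketch. It is not the authors' two-layer EM algorithm, whose update formulas are not given in this record. It assumes the common ALD parameterization with density tau(1 - tau)/sigma * exp(-rho_tau(u/sigma)) and the standard exponential-normal mixture representation of the ALD; the paper's own parameterization may differ. All function names (`check_loss`, `ald_logpdf`, `sample_ald`, `cqr_loss`) are hypothetical and chosen only for this illustration.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def ald_logpdf(u, tau, sigma=1.0):
    """Log-density of a zero-location ALD (assumed parameterization).

    For fixed sigma, the negative log-likelihood equals the check loss
    divided by sigma, plus a constant that does not depend on the location.
    """
    return np.log(tau * (1.0 - tau) / sigma) - check_loss(u, tau) / sigma


def sample_ald(n, tau, sigma=1.0, rng=None):
    """Draw ALD errors via the location-scale mixture representation.

    eps = sigma * (theta * z + psi * sqrt(z) * u), with z ~ Exp(1) and
    u ~ N(0, 1); z plays the role of the latent variable.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = (1.0 - 2.0 * tau) / (tau * (1.0 - tau))
    psi = np.sqrt(2.0 / (tau * (1.0 - tau)))
    z = rng.exponential(1.0, size=n)
    u = rng.standard_normal(n)
    return sigma * (theta * z + psi * np.sqrt(z) * u)


def cqr_loss(y, X, beta, intercepts, taus):
    """Composite quantile loss: one shared slope vector beta and one
    intercept b_k per quantile level tau_k, summed over all levels."""
    resid = y - X @ beta
    return sum(check_loss(resid - b, t).sum() for b, t in zip(intercepts, taus))


# The tau-quantile of a zero-location ALD is 0, so the share of negative
# draws should be close to tau.
eps = sample_ald(200_000, tau=0.25, rng=np.random.default_rng(0))
print("share of negative draws:", np.mean(eps < 0))
```

With the scale held fixed, maximizing `ald_logpdf` over the location is the same as minimizing `check_loss`, which is the single-component identity behind likelihood-based views of quantile regression; `cqr_loss` sums that loss over several quantile levels with a common slope.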