Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data

In multivariate data, outliers are difficult to detect, especially when the dimension of the data increases. Mahalanobis distance (MD) has been one of the classical methods to detect outliers for multivariate data. However, the classical mean and covariance matrix in MD suffered from masking and swamp...

Bibliographic Details
Main Authors: Sharifah Sakinah, Syed Abd Mutalib; Siti Zanariah, Satari; Wan Nur Syahidah, Wan Yusoff
Format: Article
Language: English
Published: Universiti Teknologi MARA (UiTM) 2021
Subjects: QA Mathematics
Online Access: http://umpir.ump.edu.my/id/eprint/32427/8/Comparison%20of%20Robust%20Estimators.pdf
http://umpir.ump.edu.my/id/eprint/32427/
https://ejournal.um.edu.my/index.php/JOSMA/article/view/32399
Institution: Universiti Malaysia Pahang
id my.ump.umpir.32427
record_format eprints
spelling my.ump.umpir.32427 2021-10-28T08:21:35Z http://umpir.ump.edu.my/id/eprint/32427/ QA Mathematics Universiti Teknologi MARA (UiTM) 2021-10-15 Article PeerReviewed pdf en http://umpir.ump.edu.my/id/eprint/32427/8/Comparison%20of%20Robust%20Estimators.pdf
Sharifah Sakinah, Syed Abd Mutalib and Siti Zanariah, Satari and Wan Nur Syahidah, Wan Yusoff (2021) Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data. Journal of Statistical Modeling and Analytics, 3 (3). pp. 36-64. ISSN 2180-3102 https://ejournal.um.edu.my/index.php/JOSMA/article/view/32399
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA Mathematics
description In multivariate data, outliers are difficult to detect, especially as the dimension of the data increases. Mahalanobis distance (MD) is one of the classical methods for detecting outliers in multivariate data. However, the classical mean and covariance matrix used in MD suffer from masking and swamping effects when the data contain outliers. Because of this problem, many studies use a robust estimator in place of the classical estimators of the mean and covariance matrix. In this study, the performance of five robust estimators, namely Fast Minimum Covariance Determinant (FMCD), Minimum Vector Variance (MVV), Covariance Matrix Equality (CME), Index Set Equality (ISE), and Test on Covariance (TOC), is investigated and compared. FMCD is widely used and is regarded as among the best robust estimators. However, FMCD still falls short under certain conditions. MVV, CME, ISE and TOC are innovations on FMCD: these four robust estimators improve the last step of the FMCD algorithm. Hence, the objective of this study is to examine the performance of these five estimators in detecting outliers in multivariate data, with particular attention to TOC, the most recent of them. Simulation studies are conducted for two outlier scenarios under various conditions. Three performance measures, pout, pmask and pswamp, are used to assess the robust estimators. TOC is found to give better performance in pswamp for most conditions, and better results for pout and pmask for certain conditions.
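The masking effect mentioned in the description can be illustrated with a small sketch. It is not taken from the article: the toy data, the function names, and the 0.975 chi-square cutoff are all assumptions chosen for illustration. The "reference" estimates are computed from the points known to be clean, which is an oracle stand-in for what a robust estimator such as FMCD aims to recover, not an implementation of FMCD or any of the other four estimators.

```python
import math

def mean_and_cov(points):
    """Classical sample mean and 2x2 covariance matrix of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def squared_mahalanobis(point, center, cov):
    """Squared MD: (p - c)^T S^{-1} (p - c), using the closed-form 2x2 inverse."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

def chi2_cutoff_2d(prob=0.975):
    """Chi-square quantile for 2 degrees of freedom (exact closed form)."""
    return -2.0 * math.log(1.0 - prob)

def flag_outliers(points, center, cov, prob=0.975):
    """Indices of points whose squared MD exceeds the chi-square cutoff."""
    cutoff = chi2_cutoff_2d(prob)
    return [i for i, p in enumerate(points)
            if squared_mahalanobis(p, center, cov) > cutoff]

# Hypothetical toy data: 9 clean points on a grid plus 3 clustered outliers.
clean = [(float(x), float(y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
outliers = [(6.0, 6.0)] * 3
data = clean + outliers

# Classical estimates are computed from ALL points, outliers included.
classical_flags = flag_outliers(data, *mean_and_cov(data))

# Oracle stand-in for a robust estimate: use only the known-clean points.
reference_flags = flag_outliers(data, *mean_and_cov(clean))

print(classical_flags)  # [] : the outliers inflate the covariance and hide themselves
print(reference_flags)  # [9, 10, 11]
```

With the contaminated mean and covariance, the three outliers have a squared distance of about 2.65, well under the cutoff of about 7.38, so none is flagged. With the estimates from the clean points their squared distance is 96 and all three are flagged.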
format Article
author Sharifah Sakinah, Syed Abd Mutalib
Siti Zanariah, Satari
Wan Nur Syahidah, Wan Yusoff
title Comparison of Robust Estimators’ Performance for Detecting Outliers in Multivariate Data
publisher Universiti Teknologi MARA (UiTM)
publishDate 2021
_version_ 1715189886554734592