Outlier detection

Outlier detection aims to capture or identify uncommon events or instances. This technique has been widely used in applications such as fraud detection, image processing and bioinformatics. Because of its diverse usage, outlier detection has emerged as a vibrant research topic in the fields of data...

Bibliographic Details
Main Author: Li, Shukai
Other Authors: Ng Wee Keong
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2013
Subjects: DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
Online Access: http://hdl.handle.net/10356/52515
Institution: Nanyang Technological University
id sg-ntu-dr.10356-52515
record_format dspace
spelling sg-ntu-dr.10356-52515 2023-03-04T00:37:59Z Outlier detection Li, Shukai Ng Wee Keong School of Computer Engineering Centre for Computational Intelligence AWKNG@ntu.edu.sg DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition Doctor of Philosophy 2013-05-15T03:29:44Z 2013-05-15T03:29:44Z 2013 2013 Thesis-Doctor of Philosophy http://hdl.handle.net/10356/52515 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). 137 p. application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Pattern recognition
description Outlier detection aims to capture or identify uncommon events or instances. This technique has been widely used in applications such as fraud detection, image processing and bioinformatics. Because of its diverse usage, outlier detection has emerged as a vibrant research topic in the fields of data mining, machine learning and statistics. In this thesis, we investigate four different kinds of outlier detection problems. Amongst them, unsupervised outlier detection has been the most popular, while relative outlier detection has attracted increasing attention in recent years. Thus, our research focuses on these two classes of outlier detection problems. Unsupervised outlier detection methods are used when there are no labeled patterns. For problems of this kind, we propose a Maximum Margin Criterion to segregate the unknown outliers from the normal patterns in a given set of samples. However, the corresponding learning task is formulated as a Mixed Integer Programming (MIP) problem, which is computationally hard. To address this issue, we adopt a recently developed label generating technique to efficiently solve a convex relaxation of the MIP problem for outlier detection. Specifically, we propose an effective successive approximation procedure to find a largely violated labeling vector for separating the outliers from the normal patterns. The convergence of this procedure is also established. Subsequently, a set of largely violated labeling vectors is combined via multiple kernel learning methods to robustly detect the outliers. To further enhance the efficacy of our outlier detector, we also explore the use of the Maximum Volume Criterion to measure the quality of separation between the outliers and the normal patterns. This criterion can be easily incorporated into our proposed model by introducing an additional regularization term. These efforts culminate in two novel outlier detection models, named Maximum Margin Outlier Detection (MMOD) and Maximum Volume Outlier Detection (MVOD), respectively.
author2 Ng Wee Keong
format Thesis-Doctor of Philosophy
author Li, Shukai
title Outlier detection
publisher Nanyang Technological University
publishDate 2013
url http://hdl.handle.net/10356/52515