An optimized self organizing map for cluster ambiguity detection

The Self Organizing Map (SOM), proposed by T. Kohonen (1982), has been widely used in industrial applications such as pattern recognition, biological modelling, data compression, signal processing and data mining (T. Kohonen, 1997; M.N.M. Sap and E. Mohebi, 2008a, 2008b, 2008c). It is an unsupervise...

Bibliographic Details
Main Authors: Mohebi, Ehsan; Md. Sap, Mohd. Noor
Format: Book Section
Published: Penerbit UTM 2008
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/16788/
Institution: Universiti Teknologi Malaysia
id my.utm.16788
record_format eprints
spelling my.utm.16788 2017-02-05T04:33:07Z http://eprints.utm.my/id/eprint/16788/ An optimized self organizing map for cluster ambiguity detection Mohebi, Ehsan Md. Sap, Mohd. Noor QA75 Electronic computers. Computer science Penerbit UTM 2008 Book Section PeerReviewed Mohebi, Ehsan and Md. Sap, Mohd. Noor (2008) An optimized self organizing map for cluster ambiguity detection. In: Advances in image processing and pattern recognition: algorithms & practice, Vol. II. Penerbit UTM, Johor, 217-240. ISBN 978-983-52-0618-4
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
spellingShingle QA75 Electronic computers. Computer science
Mohebi, Ehsan
Md. Sap, Mohd. Noor
An optimized self organizing map for cluster ambiguity detection
description The Self Organizing Map (SOM), proposed by T. Kohonen (1982), has been widely used in industrial applications such as pattern recognition, biological modelling, data compression, signal processing and data mining (T. Kohonen, 1997; M.N.M. Sap and E. Mohebi, 2008a, 2008b, 2008c). It is an unsupervised and nonparametric neural network approach. The success of the SOM algorithm lies in its simplicity, which makes it easy to understand, simulate and use in many applications. The basic SOM consists of neurons, usually arranged in a two-dimensional structure, such that there are neighbourhood relations among the neurons. After training is complete, each neuron is attached to a feature vector of the same dimension as the input space. By assigning each input vector to the neuron with the nearest feature vector, the SOM divides the input space into regions (clusters) whose points share a common nearest feature vector. This process can be considered as performing vector quantization (VQ) (R.M. Gray, 1984). In addition, because of the neighbourhood relations contributed by the inter-connections among neurons, the SOM exhibits another important property: topology preservation. Clustering algorithms attempt to organize unlabeled input vectors into clusters such that vectors within a cluster are more similar to each other than to vectors belonging to different clusters (N. R. Pal, et al., 1993). Clustering methods are of five types: hierarchical, partitioning, density-based, grid-based and model-based clustering (J. Han and M. Kamber, 2000). Rough set theory employs two thresholds, an upper and a lower, in the clustering process, which results in rough clusters. This technique can also be defined incrementally, i.e. the number of clusters is not predefined by users. In this chapter, a new two-level clustering algorithm is proposed.
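The vector quantization step described above, in which each input vector is assigned to the neuron with the nearest feature vector, can be illustrated with a minimal sketch. It assumes Euclidean distance and an already trained map; the names `nearest_neuron`, `quantize` and `codebook`, and all numeric values, are hypothetical and do not come from the chapter.

```python
import math

def nearest_neuron(x, codebook):
    """Return the index of the neuron whose feature vector is closest to x."""
    return min(range(len(codebook)), key=lambda i: math.dist(x, codebook[i]))

def quantize(inputs, codebook):
    """Group input indices by nearest neuron: one region per neuron."""
    regions = {i: [] for i in range(len(codebook))}
    for idx, x in enumerate(inputs):
        regions[nearest_neuron(x, codebook)].append(idx)
    return regions

# Hypothetical feature vectors of a trained 3-neuron map (illustrative only).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
inputs = [(0.1, -0.2), (0.9, 1.2), (3.8, 4.1), (0.2, 0.1)]

regions = quantize(inputs, codebook)
print(regions)
```

Each key of `regions` is a neuron index and each value lists the inputs that fall in that neuron's region of the input space. Training of the map and the neighbourhood relations are not shown here.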
The idea is that the first level trains the SOM neural network on the data, and the second level is a rough set based incremental clustering approach (S. Ashraf, et al., 2006), which is applied to the output of the SOM and requires only a single scan of the neurons. The optimal number of clusters can be found by rough set theory, which groups the given neurons into a set of overlapping clusters (and thereby clusters the mapped data accordingly). The overlapped neurons are then assigned to the true clusters they belong to by applying a simulated annealing algorithm, which has been adopted to minimize the uncertainty that comes from some clustering operations. In our previous work (M.N.M. Sap and E. Mohebi, 2008a), the hybrid SOM and rough set approach was applied to catch the overlapped data only; the experiment results show that the proposed algorithm (SA-Rough SOM) outperforms the previous one. This chapter is organized as follows: in section 2, the basics of the SOM algorithm are outlined. Incremental clustering and rough set theory are described in section 3. In section 4, the essence of simulated annealing is described. The proposed algorithm is presented in section 5. Section 6 is dedicated to experiment results, section 7 provides a brief conclusion, and future work and an outline of the chapter summary are described in section 8.
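The second level described above (a single scan over the neurons that yields overlapping clusters without a predefined cluster count) might look roughly like the following sketch. This record does not give the chapter's actual rules, so everything here is an assumption made for illustration: the Euclidean distance, the seed-based clusters, the names `rough_incremental_clusters`, `overlapped`, `lower_t` and `upper_t`, and the sample values. The simulated annealing step that reassigns overlapped neurons is not shown.

```python
import math

def rough_incremental_clusters(neurons, lower_t, upper_t):
    """Single scan over neurons; each cluster keeps a lower and an upper set.

    Assumed rule (not taken from the chapter): a neuron farther than upper_t
    from every cluster seed starts a new cluster; a neuron within upper_t of
    one or more seeds joins their upper sets; it also joins the lower set
    only if exactly one seed is near and it lies within lower_t of that seed.
    """
    clusters = []  # each cluster: {"seed": vector, "lower": [idx], "upper": [idx]}
    for idx, w in enumerate(neurons):
        near = [c for c in clusters if math.dist(w, c["seed"]) <= upper_t]
        if not near:
            clusters.append({"seed": w, "lower": [idx], "upper": [idx]})
            continue
        for c in near:
            c["upper"].append(idx)
        if len(near) == 1 and math.dist(w, near[0]["seed"]) <= lower_t:
            near[0]["lower"].append(idx)
    return clusters

def overlapped(clusters):
    """Neuron indices that sit in some upper set but in no lower set."""
    in_lower = {i for c in clusters for i in c["lower"]}
    in_upper = {i for c in clusters for i in c["upper"]}
    return sorted(in_upper - in_lower)

# Hypothetical neuron feature vectors (illustrative only).
neurons = [(0.0, 0.0), (0.5, 0.0), (5.0, 0.0), (5.4, 0.0), (2.6, 0.0)]
clusters = rough_incremental_clusters(neurons, lower_t=1.0, upper_t=3.0)
print(len(clusters), overlapped(clusters))
```

In this sketch the number of clusters emerges from the scan rather than being supplied, and the neurons returned by `overlapped` are the ambiguous ones that a later optimization step, such as the simulated annealing mentioned in the abstract, would have to assign to a single cluster.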
format Book Section
author Mohebi, Ehsan
Md. Sap, Mohd. Noor
author_facet Mohebi, Ehsan
Md. Sap, Mohd. Noor
author_sort Mohebi, Ehsan
title An optimized self organizing map for cluster ambiguity detection
title_short An optimized self organizing map for cluster ambiguity detection
title_full An optimized self organizing map for cluster ambiguity detection
title_fullStr An optimized self organizing map for cluster ambiguity detection
title_full_unstemmed An optimized self organizing map for cluster ambiguity detection
title_sort optimized self organizing map for cluster ambiguity detection
publisher Penerbit UTM
publishDate 2008
url http://eprints.utm.my/id/eprint/16788/
_version_ 1643646660918640640