| An unsupervised learning algorithm for robust clustering and estimating the feasible number of clusters | |
|---|---|
| DOI | |
| Title | An unsupervised learning algorithm for robust clustering and estimating the feasible number of clusters |
| Creator | Ureerat Wattanachon |
| Contributor | Chidchanok Lursinsap |
| Publisher | Chulalongkorn University |
| Publication Year | 2549 BE (2006 CE) |
| Keyword | Cluster analysis, Learning -- Computer simulation |
| Abstract | Data clustering is a discovery process that groups a set of data so that points in the same cluster are as similar as possible and points in different clusters are as dissimilar as possible. Existing clustering algorithms, such as single-link clustering, k-means, CURE, and CSM, are designed to find clusters based on pre-defined parameters specified by users. These algorithms can break down if the choice of parameters is incorrect with respect to the data set being clustered. Most of these algorithms work very well for compact and hyperspherical clusters. In this dissertation, a new hybrid clustering algorithm called “Self-Partition and Self-Merging” (SPSM) is proposed. The SPSM algorithm partitions the input data set into several subclusters in the first phase, and then removes the noisy data and the noisy subclusters in the second phase. In the third phase, the dense subclusters are continuously merged to form larger clusters based on inter-distance and intra-distance criteria. The experimental results show that the SPSM algorithm handles noisy data sets very efficiently. Moreover, it clusters data sets of arbitrary shapes and different densities very efficiently, tolerates noise, and provides better clustering results than the existing clustering algorithms. The computational complexity of the SPSM algorithm is O(N²), where N is the number of data points. |
| ISBN | 9741434251 |
| URL Website | cuir.car.chula.ac.th |
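
The three phases named in the abstract (partition, noise removal, merging) can be illustrated with a minimal Python sketch. This is not the dissertation's SPSM procedure, which the record does not specify. Every concrete choice here is an assumption made for illustration: farthest-first seeding stands in for the partition step, a minimum subcluster size stands in for the noise criterion, and the merge rule compares the smallest gap between two subclusters against their mean nearest-neighbour distance. The function names and the parameters `k`, `min_size`, and `alpha` are hypothetical.

```python
import math
from itertools import combinations

def partition(points, k):
    """Phase 1 stand-in: farthest-first seeding, then nearest-seed assignment."""
    seeds = [points[0]]
    while len(seeds) < k:
        # Next seed is the point farthest from all seeds chosen so far.
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    groups = [[] for _ in seeds]
    for p in points:
        nearest = min(range(len(seeds)), key=lambda i: math.dist(p, seeds[i]))
        groups[nearest].append(p)
    return [g for g in groups if g]

def remove_noise(subclusters, min_size):
    """Phase 2 stand-in: treat subclusters with too few points as noise."""
    return [g for g in subclusters if len(g) >= min_size]

def intra_distance(group):
    """Mean nearest-neighbour distance inside one subcluster (assumed definition)."""
    if len(group) < 2:
        return 0.0
    total = sum(min(math.dist(p, q) for q in group if q is not p) for p in group)
    return total / len(group)

def inter_distance(a, b):
    """Smallest point-to-point gap between two subclusters (assumed definition)."""
    return min(math.dist(p, q) for p in a for q in b)

def merge(subclusters, alpha):
    """Phase 3 stand-in: merge pairs whose gap is within alpha times their intra-distance."""
    clusters = [list(g) for g in subclusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            limit = alpha * max(intra_distance(clusters[i]), intra_distance(clusters[j]))
            if inter_distance(clusters[i], clusters[j]) <= limit:
                clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
                merged = True
                break
    return clusters

def spsm_sketch(points, k=6, min_size=3, alpha=1.5):
    """Run the three illustrative phases in order."""
    return merge(remove_noise(partition(points, k), min_size), alpha)

# Toy data: two 5x5 unit grids far apart, plus one distant outlier.
blob_a = [(float(x), float(y)) for x in range(5) for y in range(5)]
blob_b = [(x + 20.0, y) for x, y in blob_a]
outlier = (100.0, 100.0)
clusters = spsm_sketch(blob_a + blob_b + [outlier])
print(sorted(len(c) for c in clusters))  # [25, 25]
```

On this toy input the outlier ends up alone in its own subcluster and is discarded, and the remaining subclusters merge back into the two grids. The pairwise distance computations make the sketch at least quadratic in the number of points; it is not tuned to match the O(N²) bound stated in the abstract.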