
Showing papers by "Shu-Chuan Chu" published in 2002


Book Chapter
04 Sep 2002
TL;DR: A novel and efficient approach is proposed to reduce the computational complexity of such k-medoids-based algorithms by using previous medoid index, triangular inequality elimination criteria and partial distance search.
Abstract: Clustering in data mining is a discovery process that groups similar objects into the same cluster. Various clustering algorithms have been designed to fit various requirements and constraints of application. In this paper, we study several k-medoids-based algorithms including the PAM, CLARA and CLARANS algorithms. A novel and efficient approach is proposed to reduce the computational complexity of such k-medoids-based algorithms by using previous medoid index, triangular inequality elimination criteria and partial distance search. Experimental results based on elliptic, curve and Gauss-Markov databases demonstrate that the proposed algorithm applied to CLARANS may reduce the number of distance calculations by 67% to 92% while retaining the same average distance per object. In terms of the running time, the proposed algorithm may reduce computation time by 38% to 65% compared with the CLARANS algorithm.
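Two of the techniques named in the abstract above, triangular inequality elimination and partial distance search, can be illustrated with a minimal sketch of a nearest-medoid search. This is not the authors' implementation: the function names, the use of Euclidean distance, and the precomputed medoid-to-medoid table `medoid_dist` are assumptions made for illustration only.

```python
import math

def partial_distance(x, m, best_sq):
    """Squared Euclidean distance with early exit (partial distance search).

    Returns None as soon as the running sum reaches best_sq, because the
    candidate can then no longer beat the current best.
    """
    total = 0.0
    for a, b in zip(x, m):
        total += (a - b) ** 2
        if total >= best_sq:
            return None
    return total

def nearest_medoid(x, medoids, medoid_dist):
    """Return (index, distance) of the medoid closest to x.

    medoid_dist[i][j] is an assumed precomputed table of Euclidean
    distances between medoids i and j.
    """
    best = 0
    best_sq = sum((a - b) ** 2 for a, b in zip(x, medoids[0]))
    best_d = math.sqrt(best_sq)
    for j in range(1, len(medoids)):
        # Triangle inequality: d(x, m_j) >= d(m_best, m_j) - d(x, m_best).
        # If d(m_best, m_j) >= 2 * d(x, m_best), m_j cannot be closer,
        # so its distance to x is never computed.
        if medoid_dist[best][j] >= 2.0 * best_d:
            continue
        sq = partial_distance(x, medoids[j], best_sq)
        if sq is not None:
            best, best_sq, best_d = j, sq, math.sqrt(sq)
    return best, best_d

# Hypothetical data for illustration.
medoids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
medoid_dist = [[math.dist(p, q) for q in medoids] for p in medoids]
index, distance = nearest_medoid((9.0, 1.0), medoids, medoid_dist)
```

The elimination test costs one table lookup per candidate, so it pays off when distance evaluations dominate the running time, as they do in high-dimensional data.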

20 citations


Proceedings Article
28 Oct 2002
TL;DR: The hybrid search approach combines the previous medoid index, the utilization of memory, the criterion of triangular inequality elimination and the partial distance search for nearest neighbor search and is applied to the k-medoids-based algorithms.
Abstract: In this paper, the concept of previous medoid index is introduced. The utilization of memory for efficient medoid search is also presented. We propose a hybrid search approach for the problem of nearest neighbor search. The hybrid search approach combines the previous medoid index, the utilization of memory, the criterion of triangular inequality elimination and the partial distance search. The proposed hybrid search approach is applied to the k-medoids-based algorithms. Experimental results based on Gauss-Markov source, curve data set and elliptic clusters demonstrate that the proposed algorithm applied to the CLARANS algorithm may reduce the number of distance calculations by 88.4% to 95.2% with the same average distance per object compared with CLARANS. The proposed hybrid search approach can also be applied to nearest neighbor searching and the other clustering algorithms.
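The abstract does not define the previous medoid index. One plausible reading, assumed here, is that each object remembers the medoid it was assigned to in the previous iteration and starts its next search from that medoid. The sketch below shows why such a starting point helps when combined with triangular inequality elimination; `assign`, `dist` and the sample data are hypothetical and are not taken from the paper.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def assign(x, medoids, medoid_dist, start=0):
    """Find the nearest medoid to x, beginning with candidate `start`.

    Returns (index, distance, number of object-to-medoid distances computed).
    """
    best, best_d, count = start, dist(x, medoids[start]), 1
    for j in range(len(medoids)):
        if j == start:
            continue
        # Triangle inequality elimination: skip m_j when it cannot be closer.
        if medoid_dist[best][j] >= 2.0 * best_d:
            continue
        d = dist(x, medoids[j])
        count += 1
        if d < best_d:
            best, best_d = j, d
    return best, best_d, count

# Hypothetical data: four medoids and an object previously assigned to medoid 3.
medoids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
medoid_dist = [[dist(p, q) for q in medoids] for p in medoids]
x, previous_index = (9.5, 9.5), 3

cold = assign(x, medoids, medoid_dist, start=0)               # 4 distances computed
warm = assign(x, medoids, medoid_dist, start=previous_index)  # 1 distance computed
```

Starting from the previous medoid gives a tight initial bound, so in this example every other candidate is eliminated without a distance calculation. Both searches return the same medoid.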

14 citations


01 Jan 2002
TL;DR: Experimental results demonstrate that the proposed scheme can not only reduce computation time by more than 80% but also reduce the average distance per object compared with CLARA and CLARANS, and is also superior to MCMRS.
Abstract: Data clustering has become an important task for discovering significant patterns and characteristics in large spatial databases. The Multi-Centroid, Multi-Run Sampling Scheme (MCMRS) has been shown to be effective in improving the k-medoids-based clustering algorithms in our previous work. In this paper, a more advanced sampling scheme termed Incremental Multi-Centroid, Multi-Run Sampling Scheme (IMCMRS) is proposed for k-medoids-based clustering algorithms. Experimental results demonstrate that the proposed scheme can not only reduce computation time by more than 80% but also reduce the average distance per object compared with CLARA and CLARANS. IMCMRS is also superior to MCMRS.

12 citations



Journal Article
TL;DR: In this paper, a novel and efficient approach is proposed to reduce the computational complexity of k-medoids-based algorithms by using previous medoid index, triangular inequality elimination criteria and partial distance search.
Abstract: Clustering in data mining is a discovery process that groups similar objects into the same cluster. Various clustering algorithms have been designed to fit various requirements and constraints of application. In this paper, we study several k-medoids-based algorithms including the PAM, CLARA and CLARANS algorithms. A novel and efficient approach is proposed to reduce the computational complexity of such k-medoids-based algorithms by using previous medoid index, triangular inequality elimination criteria and partial distance search. Experimental results based on elliptic, curve and Gauss-Markov databases demonstrate that the proposed algorithm applied to CLARANS may reduce the number of distance calculations by 67% to 92% while retaining the same average distance per object. In terms of the running time, the proposed algorithm may reduce computation time by 38% to 65% compared with the CLARANS algorithm.

2 citations