
Topic

Rand index

About: The Rand index is a research topic. Over its lifetime, 630 publications have been published within this topic, receiving 20,373 citations.
Papers

Journal ArticleDOI
William M. Rand
TL;DR: This article proposes several criteria which isolate specific aspects of the performance of a method, such as its retrieval of inherent structure, its sensitivity to resampling and the stability of its results in the light of new data.
Abstract: Many intuitively appealing methods have been suggested for clustering data, however, interpretation of their results has been hindered by the lack of objective criteria. This article proposes several criteria which isolate specific aspects of the performance of a method, such as its retrieval of inherent structure, its sensitivity to resampling and the stability of its results in the light of new data. These criteria depend on a measure of similarity between two different clusterings of the same set of data; the measure essentially considers how each pair of data points is assigned in each clustering.

5,541 citations
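The pair-counting measure described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not Rand's original code; the function name `rand_index` is our own. Two clusterings agree on a pair of points if both place the pair in the same cluster, or both place it in different clusters, and the index is the fraction of agreeing pairs:

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair agrees if both clusterings put the two points in the same
    cluster, or both put them in different clusters.
    """
    assert len(labels_a) == len(labels_b)
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)
```

Note that the index depends only on which points share a cluster, so relabeling clusters (swapping cluster names) leaves it unchanged.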


Journal Article
Marjan Cugmas, Anuška Ferligoj
Abstract: Rand (1971) proposed the Rand Index to measure the stability of two partitions of one set of units. Hubert and Arabie (1985) corrected the Rand Index for chance (the Adjusted Rand Index). In this paper, we present some alternative indices. The proposed indices do not assume one set of units for two partitions: one set of units can be a subset of the other. Depending on the purpose of the comparison, the merging and splitting of clusters in the two partitions can have a different impact on the value of the indices. We therefore propose several modified Rand Indices.

1,836 citations
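The chance-corrected variant mentioned above, Hubert and Arabie's Adjusted Rand Index, can be computed from the contingency table of the two partitions using the standard formula. The sketch below is our own illustration (function names are ours): it subtracts the expected pair-count agreement under random labeling and normalizes so that a perfect match scores 1 and random labelings score around 0:

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index via the contingency-table formula."""
    n = len(labels_a)
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case: both partitions trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)
```

Unlike the unadjusted index, this can be negative when two clusterings agree less than expected by chance.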


Journal ArticleDOI
Hae-Sang Park, Chi-Hyuck Jun
TL;DR: Experimental results show that the proposed algorithm takes a significantly reduced time in computation with comparable performance against the partitioning around medoids.
Abstract: This paper proposes a new algorithm for K-medoids clustering which runs like the K-means algorithm and tests several methods for selecting initial medoids. The proposed algorithm calculates the distance matrix once and uses it for finding new medoids at every iterative step. To evaluate the proposed algorithm, we use some real and artificial data sets and compare with the results of other algorithms in terms of the adjusted Rand index. Experimental results show that the proposed algorithm takes a significantly reduced time in computation with comparable performance against the partitioning around medoids.

1,246 citations
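To illustrate the paper's central efficiency idea, computing the distance matrix once and reusing it at every iteration, here is a simplified pure-Python K-medoids sketch. The initialization and tie-breaking details are our assumptions, not the paper's exact algorithm:

```python
import math
import random

def euclidean(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Simplified K-medoids: K-means-style alternation over medoids."""
    n = len(points)
    # Distance matrix computed once up front and reused every iteration.
    D = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    medoids = random.Random(seed).sample(range(n), k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = [min(range(k), key=lambda c: D[i][medoids[c]])
                  for i in range(n)]
        # Update step: within each cluster, the new medoid is the point
        # minimizing the total distance to the rest of the cluster.
        new_medoids = []
        for c in range(k):
            cluster = [i for i in range(n) if labels[i] == c]
            # Note: this sketch does not handle clusters that become empty.
            new_medoids.append(min(cluster,
                                   key=lambda m: sum(D[m][i] for i in cluster)))
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return labels
```

The resulting labels could then be scored against a reference clustering with the adjusted Rand index, as the paper does in its evaluation.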


Proceedings ArticleDOI
Marina Meilă
07 Aug 2005
TL;DR: This paper views clusterings as elements of a lattice and gives an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unadjusted Rand index, and proves an impossibility result: there is no "sensible" criterion for comparing clusterings that is simultaneously aligned with the lattice of partitions, convexly additive, and bounded.
Abstract: This paper views clusterings as elements of a lattice. Distances between clusterings are analyzed in their relationship to the lattice. From this vantage point, we first give an axiomatic characterization of some criteria for comparing clusterings, including the variation of information and the unadjusted Rand index. Then we study other distances between partitions w.r.t. these axioms and prove an impossibility result: there is no "sensible" criterion for comparing clusterings that is simultaneously (1) aligned with the lattice of partitions, (2) convexly additive, and (3) bounded.

616 citations


Proceedings ArticleDOI
Nguyen Xuan Vinh, Julien Epps, James Bailey
14 Jun 2009
TL;DR: This paper derives the analytical formula for the expected mutual information value between a pair of clusterings, and proposes the adjusted version for several popular information theoretic based measures.
Abstract: Information theoretic based measures form a fundamental class of similarity measures for comparing clusterings, besides the classes of pair-counting based and set-matching based measures. In this paper, we discuss the necessity of correction for chance for information theoretic based measures for clustering comparison. We observe that the baseline for such measures, i.e., the average value between random partitions of a data set, does not take on a constant value, and tends to have larger variation when the ratio between the number of data points and the number of clusters is small. This effect is similar in some other non-information theoretic based measures such as the well-known Rand Index. Assuming a hypergeometric model of randomness, we derive the analytical formula for the expected mutual information value between a pair of clusterings, and then propose the adjusted version for several popular information theoretic based measures. Some examples are given to demonstrate the need for and usefulness of the adjusted measures.

614 citations
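The abstract's observation that the chance baseline is not constant also holds for the unadjusted Rand index, and is easy to reproduce empirically. The sketch below is our own illustration: it averages the Rand index over pairs of uniformly random partitions, so the mean shifts noticeably as the number of clusters changes:

```python
import random
from itertools import combinations

def rand_index(a, b):
    pairs = list(combinations(range(len(a)), 2))
    return sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs) / len(pairs)

def mean_rand_between_random_partitions(n, k, trials=200, seed=0):
    """Average Rand index between pairs of uniformly random k-labelings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = [rng.randrange(k) for _ in range(n)]
        b = [rng.randrange(k) for _ in range(n)]
        total += rand_index(a, b)
    return total / trials
```

For uniform random labelings the agreement probability per pair is roughly 1/k² + (1 - 1/k)², about 0.5 for k = 2 but about 0.82 for k = 10, which is exactly the non-constant baseline that motivates correction for chance.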


Network Information
Related Topics (5)

- Cluster analysis: 146.5K papers, 2.9M citations (83% related)
- Support vector machine: 73.6K papers, 1.7M citations (80% related)
- Feature (computer vision): 128.2K papers, 1.7M citations (78% related)
- Deep learning: 79.8K papers, 2.1M citations (78% related)
- Feature extraction: 111.8K papers, 2.1M citations (78% related)
Performance Metrics

Number of papers in the topic in previous years:

Year  Papers
2021  70
2020  64
2019  45
2018  42
2017  50
2016  52