Topic

Rand index

About: Rand index is a research topic. Over its lifetime, 630 publications have been published on this topic, receiving 20,373 citations.
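For orientation, the Rand index compares two partitions of the same objects by counting object pairs that are treated consistently (grouped together in both partitions, or separated in both); the adjusted Rand index additionally corrects this count for chance agreement. The following is a minimal, illustrative pure-Python sketch of the unadjusted index (not taken from any of the papers listed below):

```python
# Minimal sketch: the Rand index is the fraction of object pairs on which
# two labelings agree (paired together in both, or separated in both).
from itertools import combinations

def rand_index(labels_a, labels_b):
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

print(rand_index([0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 2, 2]))  # 10/15 ≈ 0.667
```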


Papers
Journal Article
TL;DR: In this paper, a comprehensive comparative study of a representative set of community detection methods is presented, in which community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure.
Abstract: Community detection is one of the most active fields in complex network analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions allowing the network structure in such cohesive subgroups to be revealed. Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand index, normalized mutual information, etc.). However, this type of comparison neglects the topological properties of the communities. In this paper, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic real-world systems, we use artificially generated realistic networks. It turns out there is no equivalence between the two approaches: a high performance does not necessarily correspond to correct topological properties, and vice versa. They can therefore be considered as complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.

135 citations
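As a concrete illustration of the partition-based evaluation this abstract refers to (the label vectors below are made up, and this is not the authors' code), both measures are available in scikit-learn once community assignments are written as flat node-to-community label vectors:

```python
# Hypothetical node-to-community labels for a 10-node network.
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

reference = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]  # reference community structure
detected  = [0, 0, 0, 1, 1, 1, 1, 2, 2, 0]  # output of some detection method

print("Adjusted Rand index:", adjusted_rand_score(reference, detected))
print("Normalized mutual info:", normalized_mutual_info_score(reference, detected))
```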

Journal Article
TL;DR: A cluster analysis of real-world financial services data revealed that using the variable-selection heuristic prior to the K-means algorithm resulted in greater cluster stability, indicating the heuristic is extremely effective at eliminating masking variables.
Abstract: One of the most vexing problems in cluster analysis is the selection and/or weighting of variables in order to include those that truly define cluster structure, while eliminating those that might mask such structure. This paper presents a variable-selection heuristic for nonhierarchical (K-means) cluster analysis based on the adjusted Rand index for measuring cluster recovery. The heuristic was subjected to Monte Carlo testing across more than 2200 datasets with known cluster structure. The results indicate the heuristic is extremely effective at eliminating masking variables. A cluster analysis of real-world financial services data revealed that using the variable-selection heuristic prior to the K-means algorithm resulted in greater cluster stability.

131 citations
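A minimal sketch in the spirit of such ARI-based variable screening is shown below. It is not the paper's exact heuristic; the function name, threshold, and synthetic data are assumptions for illustration. The idea is that variables whose single-variable partitions agree (by adjusted Rand index) with those of other variables are likely cluster-defining, while masking variables show low agreement:

```python
# Illustrative sketch only: ARI-based variable screening before K-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def screen_variables(X, n_clusters, min_mean_ari=0.2, random_state=0):
    """Keep columns whose single-variable K-means partitions agree,
    on average (by adjusted Rand index), with those of the other columns."""
    n_vars = X.shape[1]
    labels = [
        KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
        .fit_predict(X[:, [j]])
        for j in range(n_vars)
    ]
    keep = []
    for j in range(n_vars):
        agreement = [adjusted_rand_score(labels[j], labels[k])
                     for k in range(n_vars) if k != j]
        if np.mean(agreement) >= min_mean_ari:
            keep.append(j)
    return keep

# Two cluster-defining variables plus one noise ("masking") variable.
rng = np.random.default_rng(0)
informative = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
                         rng.normal(3.0, 0.3, (50, 2))])
X = np.hstack([informative, rng.normal(0.0, 1.0, (100, 1))])
print(screen_variables(X, n_clusters=2))  # typically keeps columns 0 and 1
```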

Journal Article
TL;DR: This paper generalizes many of the classical indices that have been used with outputs of crisp clustering algorithms so that they are applicable for candidate partitions of any type (i.e., crisp or soft, with soft comprising the fuzzy, probabilistic, and possibilistic cases).
Abstract: When clustering produces more than one candidate to partition a finite set of objects O, there are two approaches to validation (i.e., selection of a “best” partition, and implicitly, a best value for c, which is the number of clusters in O). First, we may use an internal index, which evaluates each partition separately. Second, we may compare pairs of candidates with each other, or with a reference partition that purports to represent the “true” cluster structure in the objects. This paper generalizes many of the classical indices that have been used with outputs of crisp clustering algorithms so that they are applicable for candidate partitions of any type (i.e., crisp or soft, with soft comprising the fuzzy, probabilistic, and possibilistic cases). Space prevents inclusion of all of the possible generalizations that can be realized this way. Here, we concentrate on the Rand index and its modifications. We compare our fuzzy-Rand index with those of Campello, Hullermeier and Rifqi, and Brouwer, and show that our extension of the Rand index is O(n), while the other three are all O(n²). Numerical examples are given to illustrate various facets of the new indices. In particular, we show that our indices can be used, even when the partitions are probabilistic or possibilistic, and that our method of generalization is valid for any index that depends only on the entries of the classical (i.e., four-pair types) contingency table for this problem.

123 citations
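One natural way to extend pair-counting indices to soft partitions, sketched below purely as an illustration (an assumed construction, not necessarily the authors' O(n) formulation), is to build a soft contingency table N = Uaᵀ·Ub from the two membership matrices and evaluate the usual adjusted-Rand formula on it; for crisp 0/1 memberships this reduces to the classical adjusted Rand index:

```python
# Hypothetical sketch: soft contingency table fed into the usual ARI formula.
import numpy as np

def soft_adjusted_rand(Ua, Ub):
    """Ua, Ub: (n_objects x n_clusters) membership matrices, rows sum to 1."""
    comb2 = lambda x: x * (x - 1) / 2.0
    N = Ua.T @ Ub                        # soft contingency table
    n = Ua.shape[0]
    sum_ij = comb2(N).sum()
    sum_a = comb2(N.sum(axis=1)).sum()   # row marginals
    sum_b = comb2(N.sum(axis=0)).sum()   # column marginals
    expected = sum_a * sum_b / comb2(n)
    max_index = 0.5 * (sum_a + sum_b)
    return (sum_ij - expected) / (max_index - expected)

# Crisp sanity check: identical hard partitions give an index of 1.
U = np.eye(3)[[0, 0, 1, 1, 2, 2]]
print(soft_adjusted_rand(U, U))          # 1.0
```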

Journal Article
TL;DR: This paper solves the key technical challenge of analytically computing the expected value and variance of generalized IT measures and proposes guidelines for using ARI and AMI as external validation indices.
Abstract: Adjusted-for-chance measures are widely used to compare partitions/clusterings of the same data set. In particular, the Adjusted Rand Index (ARI), based on pair-counting, and the Adjusted Mutual Information (AMI), based on Shannon information theory, are very popular in the clustering community. Nonetheless, it is an open problem which application scenarios best suit each measure, and guidelines in the literature for their usage are sparse, with the result that users often resort to using both. Generalized Information Theoretic (IT) measures based on the Tsallis entropy have been shown to link pair-counting and Shannon IT measures. In this paper, we aim to bridge the gap between adjustment of measures based on pair-counting and measures based on information theory. We solve the key technical challenge of analytically computing the expected value and variance of generalized IT measures. This allows us to propose adjustments of generalized IT measures, which reduce to well-known adjusted clustering comparison measures as special cases. Using the theory of generalized IT measures, we are able to propose the following guidelines for using ARI and AMI as external validation indices: ARI should be used when the reference clustering has large, equal-sized clusters; AMI should be used when the reference clustering is unbalanced and there exist small clusters.

123 citations
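A small illustration of the external-validation usage discussed above (the label vectors are made up, and scikit-learn's ARI/AMI implementations are used here rather than the paper's generalized IT framework):

```python
# Made-up labelings; per the guideline above, AMI is the more appropriate
# index here because the reference clustering is unbalanced with small clusters.
from sklearn.metrics import adjusted_rand_score, adjusted_mutual_info_score

reference = [0] * 16 + [1] * 2 + [2] * 2      # one dominant cluster, two small ones
candidate = [0] * 16 + [1] * 2 + [1] * 2      # merges the two small clusters

print("ARI:", adjusted_rand_score(reference, candidate))
print("AMI:", adjusted_mutual_info_score(reference, candidate))
```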

Journal Article
TL;DR: A comprehensive comparative study of a representative set of community detection methods, in which community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure; it turns out there is no equivalence between the two approaches.
Abstract: Community detection is one of the most active fields in complex network analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions allowing the network structure in such cohesive subgroups to be revealed. Comparative studies reported in the literature usually rely on a performance measure considering the community structure as a partition (Rand index, normalized mutual information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic real-world systems, we use artificially generated realistic networks. It turns out there is no equivalence between the two approaches: a high performance does not necessarily correspond to correct topological properties, and vice versa. They can therefore be considered as complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.

121 citations


Network Information
Related Topics (5)
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Support vector machine: 73.6K papers, 1.7M citations, 80% related
Feature (computer vision): 128.2K papers, 1.7M citations, 78% related
Deep learning: 79.8K papers, 2.1M citations, 78% related
Feature extraction: 111.8K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year	Papers
2023	8
2022	22
2021	70
2020	64
2019	45
2018	42