
Rand index

About: Rand index is a research topic. Over its lifetime, 630 publications have been published within this topic, receiving 20,373 citations.


Papers
Journal ArticleDOI
TL;DR: The present paper introduces a generalization of the Hubert and Arabie adjusted Rand index, called the Omega index, which can be applied to situations where both, one, or neither of the solutions being compared is non-disjoint.
Abstract: Cluster recovery indices are more important than ever, because of the necessity for comparing the large number of clustering procedures available today. Of the cluster recovery indices prominent in contemporary literature, the Hubert and Arabie (1985) adjustment to the Rand index (1971) has been demonstrated to have the most desirable properties (Milligan & Cooper, 1986). However, use of the Hubert and Arabie adjustment to the Rand index is limited to cluster solutions involving non-overlapping, or disjoint, clusters. The present paper introduces a generalization of the Hubert and Arabie adjusted Rand index. This generalization, called the Omega index, can be applied to situations where both, one, or neither of the solutions being compared is non-disjoint. In the special case where both solutions are disjoint, the Omega index is equivalent to the Hubert and Arabie adjusted Rand index.
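In the disjoint case described above, the Hubert-Arabie adjusted Rand index can be computed directly; a minimal sketch, assuming scikit-learn is available (the labelings below are illustrative, not from the paper):

```python
# Hubert-Arabie adjusted Rand index between two disjoint clusterings,
# assuming scikit-learn is installed; labelings are illustrative.
from sklearn.metrics import adjusted_rand_score

truth = [0, 0, 0, 1, 1, 2]   # one disjoint clustering
found = [0, 0, 1, 1, 1, 2]   # another disjoint clustering of the same objects

print(adjusted_rand_score(truth, truth))  # identical partitions score 1.0
print(adjusted_rand_score(truth, found))  # chance-corrected partial agreement
```

For overlapping (non-disjoint) solutions as handled by the Omega index, a pairwise index like this no longer applies directly, which is the gap the paper addresses.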

154 citations

Journal ArticleDOI
TL;DR: It is shown that one can calculate the Hubert-Arabie adjusted Rand index by first forming the fourfold contingency table counting the number of pairs of objects that were placed in the same cluster in both partitions.
Abstract: It is shown that one can calculate the Hubert-Arabie adjusted Rand index by first forming the fourfold contingency table counting the number of pairs of objects that were placed in the same cluster in both partitions, in the same cluster in one partition but in different clusters in the other partition, and in different clusters in both, and then computing Cohen's κ on this fourfold table.
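The recipe in the abstract can be sketched in a few lines of plain Python: build the fourfold table over object pairs, then apply Cohen's κ to it (the example labelings are illustrative):

```python
# Fourfold pair table + Cohen's kappa, as described in the abstract.
# Labelings are illustrative; kappa on this table equals the adjusted Rand index.
from itertools import combinations

def pair_table(u, v):
    """Counts of object pairs: (same/same, same/diff, diff/same, diff/diff)."""
    a = b = c = d = 0
    for i, j in combinations(range(len(u)), 2):
        same_u, same_v = u[i] == u[j], v[i] == v[j]
        if same_u and same_v:
            a += 1
        elif same_u:
            b += 1
        elif same_v:
            c += 1
        else:
            d += 1
    return a, b, c, d

def cohens_kappa(a, b, c, d):
    """Cohen's kappa on the 2x2 pair table."""
    n = a + b + c + d
    observed = (a + d) / n
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (observed - expected) / (1 - expected)

u = [0, 0, 1, 1]
v = [0, 1, 0, 1]
print(cohens_kappa(*pair_table(u, v)))  # -0.5, matching the adjusted Rand index
```

Here the two labelings agree only on pairs separated in both, so the chance-corrected agreement is negative.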

153 citations

Journal ArticleDOI
TL;DR: New criteria for estimating a clustering, which are based on the posterior expected adjusted Rand index, are proposed and are shown to possess a shrinkage property and outperform Binder's loss in a simulation study and in an application to gene expression data.
Abstract: In this paper we address the problem of obtaining a single clustering estimate ĉ based on an MCMC sample of clusterings c(1), c(2), …, c(M) from the posterior distribution of a Bayesian cluster model. Methods to derive ĉ when the number of groups K varies between the clusterings are reviewed and discussed. These include the maximum a posteriori (MAP) estimate and methods based on the posterior similarity matrix, a matrix containing the posterior probabilities that observations i and j are in the same cluster. The posterior similarity matrix is related to a commonly used loss function by Binder (1978). Minimization of the loss is shown to be equivalent to maximizing the Rand index between estimated and true clustering. We propose new criteria for estimating a clustering, which are based on the posterior expected adjusted Rand index. The criteria are shown to possess a shrinkage property and outperform Binder's loss in a simulation study and in an application to gene expression data. They also perform favorably compared to other clustering procedures.
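The posterior similarity matrix central to this approach is straightforward to build from MCMC draws; a small sketch with hypothetical draws (pure Python, not the paper's code):

```python
# Posterior similarity matrix from hypothetical MCMC clustering draws:
# entry (i, j) is the fraction of draws placing observations i and j together.
def posterior_similarity(draws):
    n_draws, n_obs = len(draws), len(draws[0])
    return [[sum(c[i] == c[j] for c in draws) / n_draws
             for j in range(n_obs)]
            for i in range(n_obs)]

draws = [[0, 0, 1],   # draw 1: observations 0 and 1 share a cluster
         [0, 1, 1]]   # draw 2: observations 1 and 2 share a cluster
psm = posterior_similarity(draws)
print(psm[0][1], psm[1][2])  # 0.5 0.5 -- each pairing appears in half the draws
```

The paper's criteria then score candidate clusterings against this matrix via the posterior expected adjusted Rand index.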

145 citations

Journal ArticleDOI
08 Mar 2017-PeerJ
TL;DR: A new binning method, BinSanity, that utilizes the clustering algorithm affinity propagation (AP) to cluster assemblies by coverage, with composition-based refinement (tetranucleotide frequency and percent GC content) to optimize bins containing multiple source organisms.
Abstract: Metagenomics has become an integral part of defining microbial diversity in various environments. Many ecosystems have characteristically low biomass and few cultured representatives. Linking potential metabolisms to phylogeny in environmental microorganisms is important for interpreting microbial community functions and the impacts these communities have on geochemical cycles. However, metagenomic studies face the computational hurdle of 'binning' contigs into phylogenetically related units or putative genomes. Binning methods have been implemented with varying approaches such as k-means clustering, Gaussian mixture models, hierarchical clustering, neural networks, and two-way clustering; however, many of these suffer from biases against low coverage/abundance organisms and closely related taxa/strains. We introduce a new binning method, BinSanity, that utilizes the clustering algorithm affinity propagation (AP) to cluster assemblies by coverage, with composition-based refinement (tetranucleotide frequency and percent GC content) to optimize bins containing multiple source organisms. This separation of composition- and coverage-based clustering reduces bias for closely related taxa. BinSanity was developed and tested on artificial metagenomes varying in size and complexity. Results indicate that BinSanity has higher precision, recall, and Adjusted Rand Index compared to five commonly implemented methods. When tested on a previously published environmental metagenome, BinSanity generated high-completion and low-redundancy bins corresponding with the published metagenome-assembled genomes.
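The clustering step alone can be sketched with off-the-shelf affinity propagation; this is not BinSanity itself, and the toy 2-D points merely stand in for contig coverage/composition vectors (assumes scikit-learn is available):

```python
# Affinity propagation on toy points standing in for contig feature vectors;
# a hedged sketch of the clustering step only, not the BinSanity pipeline.
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import adjusted_rand_score

X = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0],    # contigs from a first organism
     [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]    # contigs from a second organism
truth = [0, 0, 0, 1, 1, 1]

labels = AffinityPropagation(random_state=0).fit(X).labels_
print(adjusted_rand_score(truth, labels))   # 1.0 if the two groups separate cleanly
```

The Adjusted Rand Index used in the paper's benchmark is exactly this kind of comparison between recovered bins and the known source organisms.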

143 citations

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This work proposes a new metric called the warping error that tolerates disagreements over boundary location, penalizes topological disagreements, and can be used directly as a cost function for learning boundary detection, in a method that it is called Boundary Learning by Optimization with Topological Constraints (BLOTC).
Abstract: Recent studies have shown that machine learning can improve the accuracy of detecting object boundaries in images. In the standard approach, a boundary detector is trained by minimizing its pixel-level disagreement with human boundary tracings. This naive metric is problematic because it is overly sensitive to boundary locations. This problem is solved by metrics provided with the Berkeley Segmentation Dataset, but these can be insensitive to topological differences, such as gaps in boundaries. Furthermore, the Berkeley metrics have not been useful as cost functions for supervised learning. Using concepts from digital topology, we propose a new metric called the warping error that tolerates disagreements over boundary location, penalizes topological disagreements, and can be used directly as a cost function for learning boundary detection, in a method that we call Boundary Learning by Optimization with Topological Constraints (BLOTC). We trained boundary detectors on electron microscopic images of neurons, using both BLOTC and standard training. BLOTC produced substantially better performance on a 1.2 million pixel test set, as measured by both the warping error and the Rand index evaluated on segmentations generated from the boundary labelings. We also find our approach yields significantly better segmentation performance than either gPb-OWT-UCM or multiscale normalized cut, as well as Boosted Edge Learning trained directly on our data.
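The (unadjusted) Rand index used above to score segmentations follows directly from its definition over object pairs; a minimal sketch with illustrative pixel labelings:

```python
# Plain (unadjusted) Rand index between two labelings, from its definition:
# the fraction of object pairs on which the labelings agree. Labels are
# illustrative stand-ins for flattened segmentation labels.
from itertools import combinations

def rand_index(u, v):
    pairs = list(combinations(range(len(u)), 2))
    agree = sum((u[i] == u[j]) == (v[i] == v[j]) for i, j in pairs)
    return agree / len(pairs)

seg_a = [0, 0, 1, 1]   # e.g. segment labels of four pixels
seg_b = [0, 1, 0, 1]
print(rand_index(seg_a, seg_a))  # 1.0
print(rand_index(seg_a, seg_b))  # 0.333...: agreement on only 2 of 6 pairs
```

Evaluating it on segmentations, as in the paper, amounts to applying this to the per-pixel segment labels of the two segmentations.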

139 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
83% related
Support vector machine
73.6K papers, 1.7M citations
80% related
Feature (computer vision)
128.2K papers, 1.7M citations
78% related
Deep learning
79.8K papers, 2.1M citations
78% related
Feature extraction
111.8K papers, 2.1M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year   Papers
2023   8
2022   22
2021   70
2020   64
2019   45
2018   42