Open Access Journal Article

Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation

Francis D. Gibbons, +1 more
- 01 Oct 2002
- Vol. 12, Iss. 10, pp. 1574-1581
TLDR
Enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. At the optimal choice of cluster number, no method outperforms Euclidean distance for ratio-based measurements or Pearson distance for non-ratio-based measurements.
Abstract
We compare several commonly used expression-based gene clustering algorithms using a figure of merit based on the mutual information between cluster membership and known gene attributes. By studying various publicly available expression data sets we conclude that enrichment of clusters for biological function is, in general, highest at rather low cluster numbers. As a measure of dissimilarity between the expression patterns of two genes, no method outperforms Euclidean distance for ratio-based measurements, or Pearson distance for non-ratio-based measurements at the optimal choice of cluster number. We show the self-organized-map approach to be best for both measurement types at higher numbers of clusters. Clusters of genes derived from single- and average-linkage hierarchical clustering tend to produce worse-than-random results. [The algorithm described is available at http://llama.med.harvard.edu, under Software.]
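The figure of merit above rests on the mutual information between cluster membership and known gene attributes. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the representation of attributes as 0/1 flags per gene, and the comparison against shuffled cluster labels (reported as a z-score) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(clusters, attribute):
    """Mutual information (in bits) between two equal-length label sequences."""
    n = len(clusters)
    joint = Counter(zip(clusters, attribute))
    count_c = Counter(clusters)
    count_a = Counter(attribute)
    mi = 0.0
    for (c, a), n_ca in joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with raw counts
        mi += (n_ca / n) * math.log2(n_ca * n / (count_c[c] * count_a[a]))
    return mi


def total_mutual_information(clusters, attributes):
    """Sum the MI over all attributes.

    `attributes` is a hypothetical mapping: attribute name -> list of 0/1
    flags, one flag per gene, in the same order as `clusters`.
    """
    return sum(mutual_information(clusters, flags) for flags in attributes.values())


def z_score_vs_shuffled(clusters, attributes, n_shuffles=200, seed=0):
    """Assumed scoring scheme: compare the observed total MI with the totals
    obtained after randomly permuting the cluster labels."""
    rng = random.Random(seed)
    observed = total_mutual_information(clusters, attributes)
    shuffled = list(clusters)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(total_mutual_information(shuffled, attributes))
    mean = sum(null) / len(null)
    var = sum((x - mean) ** 2 for x in null) / (len(null) - 1)
    sd = math.sqrt(var)
    return (observed - mean) / sd if sd > 0 else float("nan")
```

With this sketch, a clustering that separates annotated from unannotated genes scores higher than one whose clusters are unrelated to the annotation, which is the sense in which clusters can be judged "enriched" for biological function.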


Citations
Journal Article

Gene regulatory network inference: Data integration in dynamic models—A review

TL;DR: This review deals with the reconstruction of gene regulatory networks (GRNs) from experimental data through computational methods, and discusses approaches that enable modelling the dynamics of gene regulatory systems.
Journal Article

Mapping the backbone of science

TL;DR: A new map representing the structure of all of science, covering both the natural and social sciences, is presented based on journal articles; biochemistry appears as the most interdisciplinary discipline in science.
Journal Article

clValid: An R Package for Cluster Validation

TL;DR: The R package clValid contains functions for validating the results of a clustering analysis, and the user can choose from nine clustering algorithms in existing R packages, including hierarchical clustering, K-means, and self-organizing maps (SOM).

References
Book

Elements of information theory

TL;DR: The author examines the role of entropy, inequality, and randomness in the design and construction of codes in a rapidly changing environment.
Journal Article

Gene Ontology: tool for the unification of biology

TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal Article

Numerical recipes

Journal Article

Cluster analysis and display of genome-wide expression patterns

TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Journal Article

Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.

TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.