Biclustering Algorithms for Biological Data Analysis: A Survey

doi:10.1109/TCBB.2004.2

Journal Article

Sara C. Madeira, +1 more

- 01 Jan 2004 -

IEEE/ACM Transactions on Computational B...

- Vol. 1, Iss: 1, pp 24-45

TLDR

In this comprehensive survey, a large number of existing approaches to biclustering are analyzed, and they are classified in accordance with the type of biclusters they can find, the patterns of biclusters that are discovered, the methods used to perform the search, the approaches used to evaluate the solution, and the target applications.

Abstract:

A large number of clustering approaches have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the results from the application of standard clustering methods to genes are limited. This limitation is imposed by the existence of a number of experimental conditions where the activity of genes is uncorrelated. A similar limitation exists when clustering of conditions is performed. For this reason, a number of algorithms that perform simultaneous clustering on the row and column dimensions of the data matrix have been proposed. The goal is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this paper, we refer to this class of algorithms as biclustering. Biclustering is also referred to in the literature as coclustering and direct clustering, among other names, and has also been used in fields such as information retrieval and data mining. In this comprehensive survey, we analyze a large number of existing approaches to biclustering, and classify them in accordance with the type of biclusters they can find, the patterns of biclusters that are discovered, the methods used to perform the search, the approaches used to evaluate the solution, and the target applications.
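
The idea of a bicluster as a submatrix can be illustrated with a small sketch. The example below is not taken from the survey: the toy matrix, the planted gene and condition indices, and the helper name mean_pairwise_correlation are all assumptions chosen for illustration. It plants a group of genes that share an expression profile on only some conditions, then compares how correlated those genes look across every condition with how correlated they look inside the submatrix.

```python
import numpy as np


def mean_pairwise_correlation(block):
    """Average Pearson correlation over all distinct row pairs of a 2-D array."""
    corr = np.corrcoef(block)
    upper = np.triu_indices_from(corr, k=1)
    return float(corr[upper].mean())


# Toy expression matrix (hypothetical data): 6 genes (rows) x 8 conditions (columns).
rng = np.random.default_rng(0)
data = rng.normal(size=(6, 8))

# Planted bicluster (illustrative choice): a subset of genes and a subset of conditions.
bicluster_genes = [0, 2, 4]
bicluster_conditions = [1, 3, 5, 6]

# Inside the bicluster each gene follows the same profile, shifted by a constant.
shared_profile = np.array([1.0, 3.0, -2.0, 0.5])
for offset, gene in enumerate(bicluster_genes):
    data[gene, bicluster_conditions] = shared_profile + offset

# The bicluster is the submatrix indexed by both subsets at once.
submatrix = data[np.ix_(bicluster_genes, bicluster_conditions)]

across_all = mean_pairwise_correlation(data[bicluster_genes, :])
within_bicluster = mean_pairwise_correlation(submatrix)

print(f"correlation over all conditions: {across_all:.3f}")
print(f"correlation within bicluster:    {within_bicluster:.3f}")
```

In this toy setting the selected genes are perfectly correlated inside the submatrix but less so once the unrelated conditions are included, which is the limitation of clustering on full rows that the abstract describes. Real biclustering algorithms must search for the row and column subsets rather than being given them.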

Citations

Book

Machine Learning: A Probabilistic Perspective

Kevin P. Murphy

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

Journal Article

Survey of clustering algorithms

Rui Xu, +1 more

- 01 May 2005 -

IEEE Transactions on Neural Networks

TL;DR: Clustering algorithms for data sets appearing in statistics, computer science, and machine learning are surveyed, and their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts, are illustrated.

Journal Article

Clustering high-dimensional data: A survey on subspace clustering, pattern-based clustering, and correlation clustering

Hans-Peter Kriegel, +2 more

- 23 Mar 2009 -

ACM Transactions on Knowledge Discovery ...

TL;DR: This survey tries to clarify the different problem definitions related to subspace clustering in general; the specific difficulties encountered in this field of research; the varying assumptions, heuristics, and intuitions forming the basis of different approaches; and how several prominent solutions tackle different problems.

Journal Article

A systematic comparison and evaluation of biclustering methods for gene expression data

Amela Prelić, +8 more

- 01 May 2006 -

Bioinformatics

TL;DR: A methodology for comparing and validating biclustering methods that includes a simple binary reference model that captures the essential features of most biclustering approaches and proposes a fast divide-and-conquer algorithm (Bimax).

Journal Article

Computational cluster validation in post-genomic data analysis

Julia Handl, +2 more

- 01 Aug 2005 -

Bioinformatics

TL;DR: In this article, the authors present a review of clustering validation techniques for post-genomic data analysis, with a particular focus on their application to postgenomic analysis of biological data.

References

Patent

Mll translocations specify a distinct gene expression profile, distinguishing a unique leukemia

Todd R. Golub, +2 more

- 17 Jul 2002 -

Nature Genetics

TL;DR: In this paper, the diagnosis of mixed lineage leukemia (MLL), acute lymphoblastic leukemia (ALL), and acute myelogenous leukemia (AML) according to the gene expression profile of a sample from an individual, as well as to methods of therapy and screening that utilize the genes identified herein as targets.

Journal Article

Gene-Expression Profiles in Hereditary Breast Cancer

Ingrid Hedenfalk, +25 more

- 22 Feb 2001 -

The New England Journal of Medicine

TL;DR: Significantly different groups of genes are expressed by breast cancers with BRCA1 mutations and breast cancers with BRCA2 mutations; the results suggest that a heritable mutation influences the gene-expression profile of the cancer.

Proceedings Article

Information-theoretic co-clustering

Inderjit S. Dhillon, +2 more

TL;DR: This work presents an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages and demonstrates that the algorithm works well in practice, especially in the presence of sparsity and high-dimensionality.

Journal Article

An Information-Intensive Approach to the Molecular Pharmacology of Cancer

John N. Weinstein, +20 more

- 17 Jan 1997 -

Science

TL;DR: Information is being used to search for candidate anticancer drugs that are not dependent on intact p53 suppressor gene function for their activity, and it remains to be seen how effective this information-intensive strategy will be at generating new clinically active agents.

Journal Article

Direct Clustering of a Data Matrix

J. A. Hartigan

- 01 Mar 1972 -

Journal of the American Statistical Asso...

TL;DR: This article presents a model, and a technique, for clustering cases and variables simultaneously and the principal advantage in this approach is the direct interpretation of the clusters on the data.
