Maximal Quasi-Bicliques with Balanced Noise Tolerance: Concepts and Co-clustering Applications.

Open Access · Proceedings Article

Jinyan Li, +3 more

- pp 72-83

TLDR

The noise tolerance of maximal quasi-bicliques is improved by allowing every vertex to tolerate up to the same number, or the same percentage, of missing edges. This leads to a more natural interaction between the two vertex sets: a balanced most-versus-most adjacency.

Abstract:

The rigid all-versus-all adjacency required by a maximal biclique for its two vertex sets is extremely vulnerable to missing data. In the past, several types of quasi-bicliques have been proposed to tackle this problem; however, their noise tolerance is usually unbalanced and can be very skewed. In this paper, we improve the noise tolerance of maximal quasi-bicliques by allowing every vertex to tolerate up to the same number, or the same percentage, of missing edges. This idea leads to a more natural interaction between the two vertex sets: a balanced most-versus-most adjacency. This generalization is also non-trivial, as many large maximal quasi-biclique subgraphs do not contain any maximal bicliques. This observation implies that direct expansion from maximal bicliques may not guarantee a complete enumeration of all maximal quasi-bicliques. We present important properties of maximal quasi-bicliques, such as a bounded closure property and a fixed point property, to design efficient algorithms. Maximal quasi-bicliques are closely related to co-clustering problems such as co-clustering of documents and words, of images and features, and of stocks and financial ratios. Here, we demonstrate the usefulness of our concepts using a new application, a bioinformatics example in which the prediction of true protein interactions is investigated.
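The balanced tolerance described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formal definition, so the check below is an assumption: a pair of vertex sets is accepted when every vertex on either side is non-adjacent to at most `max_missing` vertices on the other side. The names `build_adjacency`, `missing_edges`, and `is_balanced_quasi_biclique` are hypothetical, and the sketch only tests a candidate pair; it does not check maximality or enumerate quasi-bicliques.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Adjacency = Dict[Hashable, Set[Hashable]]


def build_adjacency(edges: Iterable[Tuple[Hashable, Hashable]]) -> Adjacency:
    """Build a symmetric adjacency map from (left, right) edge pairs."""
    adj: Adjacency = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def missing_edges(vertex: Hashable, others: Set[Hashable], adj: Adjacency) -> int:
    """Count the vertices in `others` that `vertex` is NOT adjacent to."""
    neighbours = adj.get(vertex, set())
    return sum(1 for other in others if other not in neighbours)


def is_balanced_quasi_biclique(left, right, adj: Adjacency, max_missing: int) -> bool:
    """Assumed test: every vertex misses at most `max_missing` edges to the other side."""
    left, right = set(left), set(right)
    return (all(missing_edges(u, right, adj) <= max_missing for u in left)
            and all(missing_edges(v, left, adj) <= max_missing for v in right))


# Toy data: a 3x3 biclique with two edges removed in two different ways.
full = [(u, v) for u in "abc" for v in "xyz"]

# Missing edges spread over different vertices: each vertex misses at most one.
balanced = build_adjacency(e for e in full if e not in {("a", "x"), ("b", "y")})
print(is_balanced_quasi_biclique("abc", "xyz", balanced, 1))  # True

# Same number of missing edges, but both fall on vertex "a" (skewed tolerance).
skewed = build_adjacency(e for e in full if e not in {("a", "x"), ("a", "y")})
print(is_balanced_quasi_biclique("abc", "xyz", skewed, 1))  # False
```

Both toy graphs lack exactly two edges, yet only the first passes with a per-vertex budget of one, which is the contrast between balanced and skewed noise tolerance that the abstract draws. A percentage-based variant would compare each count against a fraction of the opposite side's size instead of a fixed number.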

Citations

Book Chapter

Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions

Hon Nian Chua, +2 more

TL;DR: An algorithm is developed that predicts the functions of a protein in two steps: (1) assigning a weight to each of its level-1 and level-2 neighbours by estimating its functional similarity with the protein, using the local topology of the interaction network as well as the reliability of experimental sources, and (2) scoring each function based on its weighted frequency in these neighbours.

Book Chapter

A Survey of Algorithms for Dense Subgraph Discovery

Victor E. Lee, +3 more

TL;DR: This chapter will discuss and organize the literature on this topic effectively in order to make it much more accessible to the reader.

Journal Article

A survey on enhanced subspace clustering

Kelvin Sim, +3 more

- 01 Mar 2013 -

Data Mining and Knowledge Discovery

TL;DR: This survey presents enhanced approaches to subspace clustering by discussing the problems they are solving, their cluster definitions and algorithms, and the related works in high-dimensional clustering.

Journal Article

Selecting informative subsets of sparse supermatrices increases the chance to find correct trees

Bernhard Misof, +5 more

- 03 Dec 2013 -

BMC Bioinformatics

TL;DR: Analysis of simulated and empirical data demonstrates that sparse supermatrices can be reduced on a formal basis, outperforming the commonly used simple selection of taxa and genes with high data coverage.

Proceedings Article

Efficient (α, β)-core Computation: an Index-based Approach

Boge Liu, +5 more

TL;DR: This paper presents an efficient algorithm based on a novel index such that the algorithm runs in time linear in the result size, and proves that the index requires only O(m) space, where m is the number of edges in the bipartite graph.

References

Proceedings Article

Co-clustering documents and words using bipartite spectral graph partitioning

Inderjit S. Dhillon

TL;DR: A new spectral co-clustering algorithm is presented that uses the second left and right singular vectors of an appropriately scaled word-document matrix to yield good bipartitionings, and it can be shown that the singular vectors solve a real relaxation of the NP-complete graph bipartitioning problem.

Journal Article

MIPS: a database for genomes and protein sequences

Hans-Werner Mewes, +9 more

- 01 Jan 1999 -

Nucleic Acids Research

TL;DR: This report describes the systematic and up-to-date analysis of genomes (PEDANT), a comprehensive database of the yeast genome (MYGD), a database reflecting the progress in sequencing the Arabidopsis thaliana genome (MATD), the database of assembled, annotated human EST clusters (MEST), and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume).

Proceedings Article

Information-theoretic co-clustering

Inderjit S. Dhillon, +2 more

TL;DR: This work presents an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages and demonstrates that the algorithm works well in practice, especially in the presence of sparsity and high-dimensionality.

Journal Article

Efficient mining of association rules using closed itemset lattices

Nicolas Pasquier, +3 more

- 01 Mar 1999 -

Information Systems

TL;DR: Experiments showed that Close is very efficient for mining dense and/or correlated data such as census style data, and performs reasonably well for market basket style data.

Journal Article

Topological structure analysis of the protein–protein interaction network in budding yeast

Dongbo Bu, +11 more

- 01 May 2003 -

Nucleic Acids Research

TL;DR: A spectral method derived from graph theory is introduced to uncover hidden topological structures (i.e. quasi-cliques and quasi-bipartites) of complicated protein-protein interaction networks, and the results suggest that these structures consist of biologically relevant functional groups.

Related Papers (5)

Biclustering of Expression Data

Biclustering Algorithms for Biological Data Analysis: A Survey

The maximum edge biclique problem is NP-complete

Discovering large dense subgraphs in massive graphs

Topological structure analysis of the protein–protein interaction network in budding yeast