# Conditional mutual information

About: Conditional mutual information is a research topic. Over its lifetime, 993 publications have appeared within this topic, receiving 62,781 citations.

##### Papers

TL;DR: An information theoretic measure is derived that quantifies the statistical coherence between systems evolving in time and is able to distinguish effectively driving and responding elements and to detect asymmetry in the interaction of subsystems.

Abstract: An information theoretic measure is derived that quantifies the statistical coherence between systems evolving in time. The standard time delayed mutual information fails to distinguish information that is actually exchanged from shared information due to common history and input signals. In our new approach, these influences are excluded by appropriate conditioning of transition probabilities. The resulting transfer entropy is able to distinguish effectively driving and responding elements and to detect asymmetry in the interaction of subsystems.

3,653 citations
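
The transfer entropy described in this abstract can be illustrated with a minimal sketch. It assumes discrete-valued series, a history length of one step, and a simple plug-in (frequency count) estimate; the function name `transfer_entropy` and the toy data are illustrative assumptions, not taken from the paper. With these choices the quantity is the conditional mutual information between the target's next value and the source's current value, given the target's current value.

```python
import math
import random
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in estimate of T(source -> target) in bits, history length 1."""
    # Each sample is (target_next, target_now, source_now).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_next_now = Counter((xn, xp) for xn, xp, _ in triples)
    c_now_src = Counter((xp, yp) for _, xp, yp in triples)
    c_now = Counter(xp for _, xp, _ in triples)
    te = 0.0
    for (xn, xp, yp), c in c_full.items():
        # Ratio p(x_next | x_now, y_now) / p(x_next | x_now).
        ratio = (c / c_now_src[(xp, yp)]) / (c_next_now[(xn, xp)] / c_now[xp])
        te += (c / n) * math.log2(ratio)
    return te

# Toy data (an assumption for illustration): x copies y with a one-step delay.
rng = random.Random(0)
y = [rng.randint(0, 1) for _ in range(5000)]
x = [0] + y[:-1]

te_y_to_x = transfer_entropy(y, x)  # y drives x, so this should be large
te_x_to_y = transfer_entropy(x, y)  # x does not drive y, so this should be near zero
print(f"T(y->x) = {te_y_to_x:.3f} bits, T(x->y) = {te_x_to_y:.3f} bits")
```

The asymmetry between the two directions is what lets the measure separate the driving series from the responding one in this toy case.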

24 Aug 2003

TL;DR: This work presents an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages and demonstrates that the algorithm works well in practice, especially in the presence of sparsity and high-dimensionality.

Abstract: Two-dimensional contingency or co-occurrence tables arise frequently in important applications such as text, web-log and market-basket data analysis. A basic problem in contingency table analysis is co-clustering: simultaneous clustering of the rows and columns. A novel theoretical formulation views the contingency table as an empirical joint probability distribution of two discrete random variables and poses the co-clustering problem as an optimization problem in information theory: the optimal co-clustering maximizes the mutual information between the clustered random variables subject to constraints on the number of row and column clusters. We present an innovative co-clustering algorithm that monotonically increases the preserved mutual information by intertwining both the row and column clusterings at all stages. Using the practical example of simultaneous word-document clustering, we demonstrate that our algorithm works well in practice, especially in the presence of sparsity and high-dimensionality.

1,203 citations
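
The objective named in this abstract, the mutual information preserved between the clustered row and column variables, can be evaluated directly for any given assignment. The sketch below only scores two fixed co-clusterings of a made-up 4x4 joint table; it does not implement the paper's alternating algorithm, and the names `mutual_information` and `coclustered_joint` are hypothetical.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits for a joint probability table given as a list of rows."""
    row_p = [sum(r) for r in joint]
    col_p = [sum(c) for c in zip(*joint)]
    return sum(p * math.log2(p / (row_p[i] * col_p[j]))
               for i, r in enumerate(joint)
               for j, p in enumerate(r) if p > 0)

def coclustered_joint(joint, row_labels, col_labels, k, l):
    """Aggregate the joint table into a k x l table of cluster-level probabilities."""
    out = [[0.0] * l for _ in range(k)]
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            out[row_labels[i]][col_labels[j]] += p
    return out

# Made-up joint distribution with two clear blocks (illustrative only).
joint = [[0.10, 0.10, 0.00, 0.00],
         [0.10, 0.10, 0.00, 0.00],
         [0.00, 0.00, 0.15, 0.15],
         [0.00, 0.00, 0.15, 0.15]]

full_mi = mutual_information(joint)
# Clusters aligned with the blocks versus clusters that mix the blocks.
good_mi = mutual_information(coclustered_joint(joint, [0, 0, 1, 1], [0, 0, 1, 1], 2, 2))
bad_mi = mutual_information(coclustered_joint(joint, [0, 1, 0, 1], [0, 1, 0, 1], 2, 2))
print(f"full={full_mi:.3f} aligned={good_mi:.3f} mixed={bad_mi:.3f}")
```

In this toy table the block-aligned assignment keeps all of the original mutual information, while the mixed assignment keeps none, which is the gap a co-clustering search tries to close.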

TL;DR: It is shown that this feature selection method outperforms other classical algorithms, and that a naive Bayesian classifier built with features selected that way achieves error rates similar to those of state-of-the-art methods such as boosting or SVMs.

Abstract: We propose in this paper a very fast feature selection technique based on conditional mutual information. By picking features which maximize their mutual information with the class to predict conditional on any feature already picked, it ensures the selection of features which are both individually informative and two-by-two weakly dependent. We show that this feature selection method outperforms other classical algorithms, and that a naive Bayesian classifier built with features selected that way achieves error rates similar to those of state-of-the-art methods such as boosting or SVMs. The implementation we propose selects 50 features among 40,000, based on a training set of 500 examples in a tenth of a second on a standard 1 GHz PC.

1,018 citations
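
The selection rule described in this abstract can be sketched as a greedy loop: each candidate is scored by the smallest conditional mutual information it shares with the class given any feature already picked, and the highest-scoring candidate is added. This is a naive version under assumed discrete features and plug-in estimates; it is not the fast implementation the paper reports, and `cond_mutual_info`, `select_features`, and the toy data are hypothetical.

```python
import math
from collections import Counter

def cond_mutual_info(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    return sum((c / n) * math.log2(c * c_z[zv] / (c_xz[(xv, zv)] * c_yz[(yv, zv)]))
               for (xv, yv, zv), c in c_xyz.items())

def select_features(features, labels, k):
    """Greedily pick k feature indices by the max-min conditional MI criterion."""
    const = [0] * len(labels)
    # Conditioning on a constant gives the plain mutual information I(X;Y).
    scores = [cond_mutual_info(f, labels, const) for f in features]
    selected = []
    for _ in range(k):
        remaining = [i for i in range(len(features)) if i not in selected]
        best = max(remaining, key=lambda i: scores[i])
        selected.append(best)
        for i in remaining:
            if i != best:
                # Keep the worst case over every feature picked so far.
                scores[i] = min(scores[i],
                                cond_mutual_info(features[i], labels, features[best]))
    return selected

# Toy data (illustrative): the label encodes two hidden bits a and b.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
labels = [2 * ai + bi for ai, bi in zip(a, b)]
noisy_b = [0, 0, 1, 1, 0, 0, 1, 0]      # b with one flipped entry
features = [a, list(a), noisy_b]         # feature 1 duplicates feature 0

selected = select_features(features, labels, 2)
print(selected)
```

In this toy case the duplicate feature scores zero once its twin is selected, so the second pick goes to the weaker but complementary feature.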

01 Oct 2002

TL;DR: The findings show that the algorithms used so far may be quite substantially improved upon, and that finite sample effects and other sources of potentially misleading results have to be taken into account when dealing with small datasets.

Abstract: Motivation: Clustering co-expressed genes usually requires the definition of ‘distance’ or ‘similarity’ between measured datasets, the most common choices being Pearson correlation or Euclidean distance. With the size of available datasets steadily increasing, it has become feasible to consider other, more general, definitions as well. One alternative, based on information theory, is the mutual information, providing a general measure of dependencies between variables. While the use of mutual information in cluster analysis and visualization of large-scale gene expression data has been suggested previously, the earlier studies did not focus on comparing different algorithms to estimate the mutual information from finite data. Results: Here we describe and review several approaches to estimate the mutual information from finite datasets. Our findings show that the algorithms used so far may be quite substantially improved upon. In particular, when dealing with small datasets, finite sample effects and other sources of potentially misleading results have to be taken into account.

764 citations
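
The finite sample effects mentioned in this abstract can be seen in a small experiment. The sketch below draws two independent discrete variables, so their true mutual information is zero, and applies a plain plug-in estimate to already-binned values. The bin count, sample sizes, and function names are assumptions chosen for illustration, and the estimator shown is not one of the specific methods the paper compares.

```python
import math
import random
from collections import Counter

def plugin_mi(xs, ys):
    """Plug-in (frequency count) estimate of I(X;Y) in bits."""
    n = len(xs)
    c_xy = Counter(zip(xs, ys))
    c_x = Counter(xs)
    c_y = Counter(ys)
    return sum((c / n) * math.log2(c * n / (c_x[x] * c_y[y]))
               for (x, y), c in c_xy.items())

def mean_spurious_mi(n_samples, n_bins=4, trials=20, seed=0):
    """Average plug-in MI over several draws of two independent variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = [rng.randrange(n_bins) for _ in range(n_samples)]
        ys = [rng.randrange(n_bins) for _ in range(n_samples)]
        total += plugin_mi(xs, ys)
    return total / trials

mi_small = mean_spurious_mi(50)     # small dataset
mi_large = mean_spurious_mi(5000)   # large dataset
print(f"N=50: {mi_small:.4f} bits, N=5000: {mi_large:.4f} bits")
```

Although the variables are independent, the small-sample estimate comes out clearly positive and shrinks as the sample grows, which is one reason apparent dependencies in small datasets need a bias correction or a significance check.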