Topic

Multivariate mutual information

About: Multivariate mutual information is a research topic. Over its lifetime, 362 publications have been published on this topic, receiving 16,681 citations.


Papers
Journal ArticleDOI
TL;DR: An information theoretic measure is derived that quantifies the statistical coherence between systems evolving in time and is able to distinguish effectively driving and responding elements and to detect asymmetry in the interaction of subsystems.
Abstract: An information theoretic measure is derived that quantifies the statistical coherence between systems evolving in time. The standard time delayed mutual information fails to distinguish information that is actually exchanged from shared information due to common history and input signals. In our new approach, these influences are excluded by appropriate conditioning of transition probabilities. The resulting transfer entropy is able to distinguish effectively driving and responding elements and to detect asymmetry in the interaction of subsystems.

3,653 citations
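
To make the definition concrete: for discrete time series with history length 1, transfer entropy from Y to X is the expected log-ratio between the transition probability of X conditioned on both histories and the one conditioned on X's history alone. A minimal Python sketch, assuming history length 1 (the function name and the noisy-copy demo are illustrative, not from the paper):

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y):
    """T_{Y->X} in bits with history length 1:
    sum over p(x_{t+1}, x_t, y_t) of log2[p(x_{t+1}|x_t, y_t) / p(x_{t+1}|x_t)]."""
    triples = Counter(zip(x[1:], x[:-1], y[:-1]))   # (x_{t+1}, x_t, y_t) counts
    pairs_xy = Counter(zip(x[:-1], y[:-1]))         # (x_t, y_t) counts
    pairs_xx = Counter(zip(x[1:], x[:-1]))          # (x_{t+1}, x_t) counts
    singles = Counter(x[:-1])                       # x_t counts
    n = len(x) - 1
    te = 0.0
    for (x1, x0, y0), c in triples.items():
        p_joint = c / n                              # p(x_{t+1}, x_t, y_t)
        p_cond_both = c / pairs_xy[(x0, y0)]         # p(x_{t+1} | x_t, y_t)
        p_cond_x = pairs_xx[(x1, x0)] / singles[x0]  # p(x_{t+1} | x_t)
        te += p_joint * np.log2(p_cond_both / p_cond_x)
    return te

# y drives x with a one-step lag plus noise, so T_{Y->X} should exceed T_{X->Y},
# illustrating the asymmetry the measure is designed to detect.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 10_000)
x = np.roll(y, 1)
x[rng.random(10_000) < 0.1] ^= 1
print(transfer_entropy(x, y), transfer_entropy(y, x))
```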

Journal ArticleDOI
TL;DR: This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier.
Abstract: This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods based on linear relations (like the correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless, the use of the mutual information for tasks characterized by high input dimensionality requires suitable approximations because of the prohibitive demands on computation and samples. An algorithm is proposed that is based on a "greedy" selection of the features and that takes both the mutual information with respect to the output class and with respect to the already-selected features into account. Finally, the results of a series of experiments are discussed.

2,423 citations
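
The greedy criterion described above (Battiti's MIFS) selects, at each step, the candidate feature that maximizes its mutual information with the class minus a penalty for mutual information with already-selected features. A sketch for discrete features, assuming a β-weighted penalty; the β value and the plug-in MI estimator are illustrative choices, not the paper's exact setup:

```python
import numpy as np

def mutual_information(a, b):
    """I(A;B) in bits for two discrete 1-D sequences, from the empirical joint."""
    n = len(a)
    joint = {}
    for pair in zip(a, b):
        joint[pair] = joint.get(pair, 0) + 1
    pa, pb = {}, {}
    for (va, vb), c in joint.items():
        pa[va] = pa.get(va, 0) + c
        pb[vb] = pb.get(vb, 0) + c
    return sum((c / n) * np.log2(c * n / (pa[va] * pb[vb]))
               for (va, vb), c in joint.items())

def mifs(features, labels, k, beta=0.5):
    """Greedy selection in the spirit of MIFS: at each step pick the candidate
    maximizing I(f; C) - beta * sum of I(f; s) over already-selected s."""
    candidates = list(range(features.shape[1]))
    relevance = {j: mutual_information(features[:, j], labels) for j in candidates}
    selected = []
    while candidates and len(selected) < k:
        best = max(candidates,
                   key=lambda j: relevance[j]
                   - beta * sum(mutual_information(features[:, j], features[:, s])
                                for s in selected))
        selected.append(best)
        candidates.remove(best)
    return selected
```

The redundancy penalty is what distinguishes this from ranking features by relevance alone: a feature highly informative about the class but nearly duplicated by one already selected scores poorly.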

Journal ArticleDOI
01 Oct 2002
TL;DR: The findings show that the algorithms used so far may be substantially improved upon; in particular, when dealing with small datasets, finite-sample effects and other sources of potentially misleading results have to be taken into account.
Abstract: Motivation: Clustering co-expressed genes usually requires the definition of ‘distance’ or ‘similarity’ between measured datasets, the most common choices being Pearson correlation or Euclidean distance. With the size of available datasets steadily increasing, it has become feasible to consider other, more general, definitions as well. One alternative, based on information theory, is the mutual information, providing a general measure of dependencies between variables. While the use of mutual information in cluster analysis and visualization of large-scale gene expression data has been suggested previously, the earlier studies did not focus on comparing different algorithms to estimate the mutual information from finite data. Results: Here we describe and review several approaches to estimate the mutual information from finite datasets. Our findings show that the algorithms used so far may be quite substantially improved upon. In particular, when dealing with small datasets, finite-sample effects and other sources of potentially misleading results have to be taken into account.

764 citations
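
The simplest of the estimators reviewed in such comparisons is the naive equal-width binning ("plug-in") estimate, which also exhibits the finite-sample bias the paper warns about: even independent data yield a positive MI estimate at small sample sizes. A minimal sketch; the bin count and sample sizes are arbitrary choices here:

```python
import numpy as np

def binned_mutual_information(x, y, bins=10):
    """Naive MI estimate (bits) via equal-width binning of the joint sample."""
    counts, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = counts / counts.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# Independent samples have true MI = 0, yet the naive estimate stays
# positive at small n: the finite-sample effect the paper studies.
rng = np.random.default_rng(1)
for n in (50, 500, 5000):
    x, y = rng.normal(size=n), rng.normal(size=n)
    print(n, round(binned_mutual_information(x, y), 3))
```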

Journal ArticleDOI
TL;DR: It is shown that sample transmitted information provides a simple method for measuring and testing association in multi-dimensional contingency tables and relations with analysis of variance are pointed out.
Abstract: A multivariate analysis based on transmitted information is presented. It is shown that sample transmitted information provides a simple method for measuring and testing association in multi-dimensional contingency tables. Relations with analysis of variance are pointed out, and statistical tests are described.

632 citations
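
In a two-way contingency table, the sample transmitted information is simply the mutual information of the empirical joint distribution, and 2N ln 2 times its value in bits is the likelihood-ratio (G) statistic, asymptotically chi-square distributed under independence. A sketch of this measure-and-test workflow, with a fabricated example table:

```python
import numpy as np
from scipy.stats import chi2

def transmitted_information(table):
    """Sample transmitted information (bits) of a two-way contingency table:
    the mutual information of the empirical joint distribution."""
    p = table / table.sum()
    pr = p.sum(axis=1, keepdims=True)     # row marginals
    pc = p.sum(axis=0, keepdims=True)     # column marginals
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (pr @ pc)[nz])).sum())

# 2 * N * ln(2) * T equals the likelihood-ratio (G) statistic, which is
# asymptotically chi-square with (rows - 1) * (cols - 1) degrees of freedom.
table = np.array([[30.0, 10.0], [12.0, 28.0]])   # fabricated example counts
T = transmitted_information(table)
g = 2.0 * table.sum() * np.log(2.0) * T
df = (table.shape[0] - 1) * (table.shape[1] - 1)
print(f"T = {T:.3f} bits, G = {g:.2f}, p = {chi2.sf(g, df):.4f}")
```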

Journal ArticleDOI
TL;DR: It is argued that equitability is properly formalized by a self-consistency condition closely related to the Data Processing Inequality, and shown that estimating mutual information provides a natural and practical method for equitably quantifying associations in large datasets.
Abstract: How should one quantify the strength of association between two random variables without bias for relationships of a specific form? Despite its conceptual simplicity, this notion of statistical "equitability" has yet to receive a definitive mathematical formalization. Here we argue that equitability is properly formalized by a self-consistency condition closely related to the Data Processing Inequality. Mutual information, a fundamental quantity in information theory, is shown to satisfy this equitability criterion. These findings are at odds with the recent work of Reshef et al. [Reshef DN, et al. (2011) Science 334(6062):1518-1524], which proposed an alternative definition of equitability and introduced a new statistic, the "maximal information coefficient" (MIC), said to satisfy equitability in contradistinction to mutual information. These conclusions, however, were supported only with limited simulation evidence, not with mathematical arguments. Upon revisiting these claims, we prove that the mathematical definition of equitability proposed by Reshef et al. cannot be satisfied by any (nontrivial) dependence measure. We also identify artifacts in the reported simulation evidence. When these artifacts are removed, estimates of mutual information are found to be more equitable than estimates of MIC. Mutual information is also observed to have consistently higher statistical power than MIC. We conclude that estimating mutual information provides a natural (and often practical) way to equitably quantify statistical associations in large datasets.

524 citations
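
As a practical illustration of the paper's conclusion, one can estimate mutual information to quantify associations of quite different shapes; a k-nearest-neighbor (KSG-style) estimator such as scikit-learn's mutual_info_regression is one common choice (an assumption here, not the estimator used in the paper), and the functions and noise level below are made up for the demo:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Each relationship should receive a clearly nonzero MI score, even for
# shapes (like the quadratic) whose linear correlation with x is zero.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 2000)
noise = 0.1 * rng.normal(size=2000)
for name, signal in [("linear", x),
                     ("quadratic", x**2),
                     ("sine", np.sin(3 * np.pi * x))]:
    y = signal + noise
    mi = mutual_info_regression(x.reshape(-1, 1), y, n_neighbors=3)[0]
    print(f"{name:10s} MI ~ {mi:.2f} nats")
```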


Network Information
Related Topics (5)
Robustness (computer science): 94.7K papers, 1.6M citations, 75% related
Cluster analysis: 146.5K papers, 2.9M citations, 73% related
Support vector machine: 73.6K papers, 1.7M citations, 73% related
Artificial neural network: 207K papers, 4.5M citations, 72% related
Optimization problem: 96.4K papers, 2.1M citations, 72% related
Performance
Metrics
No. of papers in the topic in previous years:
Year    Papers
2021    13
2020    9
2019    8
2018    12
2017    34
2016    23