
Showing papers on "Multivariate mutual information published in 2008"


Journal ArticleDOI
TL;DR: This work investigates an alternative measure of dependence: the lautum information defined as the divergence between the product-of-marginal and joint distributions, i.e., swapping the arguments in the definition of mutual information.
Abstract: A popular way to measure the degree of dependence between two random objects is by their mutual information, defined as the divergence between the joint and product-of-marginal distributions. We investigate an alternative measure of dependence: the lautum information defined as the divergence between the product-of-marginal and joint distributions, i.e., swapping the arguments in the definition of mutual information. Some operational characterizations and properties are provided for this alternative measure of information.
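As a reading aid (not a restatement of the paper's own notation), the two divergences can be written for discrete X and Y with joint distribution P_XY and marginals P_X, P_Y as follows; the symbol L(X;Y) for lautum information is an assumed notation:

```latex
% Mutual information: divergence from the joint to the product of marginals
I(X;Y) = D\!\left(P_{XY} \,\|\, P_X P_Y\right)
       = \sum_{x,y} P_{XY}(x,y)\,\log\frac{P_{XY}(x,y)}{P_X(x)\,P_Y(y)}

% Lautum information: the same divergence with its arguments swapped
L(X;Y) = D\!\left(P_X P_Y \,\|\, P_{XY}\right)
       = \sum_{x,y} P_X(x)\,P_Y(y)\,\log\frac{P_X(x)\,P_Y(y)}{P_{XY}(x,y)}
```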

63 citations


Journal ArticleDOI
TL;DR: This paper addresses problems of fuzzy rule-based algorithms for preprocessing databases with incomplete and imprecise data. It proposes an extended definition of the mutual information between two fuzzified continuous variables and introduces a numerical algorithm for estimating this mutual information from a sample of vague data.
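The paper's fuzzy extension and its estimation algorithm are not reproduced here; as context, below is a minimal sketch of the crisp histogram (plug-in) estimator of mutual information between two continuous variables, which such an extension generalizes by replacing hard bin counts with graded memberships. Bin counts, sample sizes, and function names are illustrative assumptions.

```python
import numpy as np

def plugin_mutual_information(x, y, bins=10):
    """Crisp plug-in (histogram) estimate of I(X; Y) in nats from
    paired samples. A fuzzy extension would replace the hard bin
    counts below with graded memberships; this is the crisp baseline."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    py = pxy.sum(axis=0, keepdims=True)   # marginal of Y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Example with two correlated Gaussian samples (illustrative only)
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x + rng.normal(scale=0.5, size=5000)
print(plugin_mutual_information(x, y, bins=20))
```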

51 citations


Proceedings ArticleDOI
01 Dec 2008
TL;DR: This work provides for the first time a formulation of non-linear, nonstationary causality in terms of mutual information and develops asymptotic relations that emerge under strict stationarity and generalise earlier work of Geweke.
Abstract: We provide for the first time a formulation of non-linear, nonstationary causality in terms of mutual information. We provide two fundamental mutual information identities: the first relates Granger-type causality to Sims-type causality; the second provides a decomposition of mutual information into a sum of Granger- and Sims-type terms. We also develop asymptotic relations that emerge under strict stationarity and generalise earlier work of Geweke. More generally, we relate our work to earlier developments.
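The paper's two identities are not restated here; for orientation, the standard way of expressing Granger-type (directed) dependence information-theoretically, on which such decompositions build, uses a conditional mutual information between the present of one process and the past of the other (the notation below is an assumption, not the authors'):

```latex
% Granger-type (directed) dependence of Y's present on X's past,
% given Y's own past, as a conditional mutual information
% (X^{t-1} denotes the past X_1, ..., X_{t-1}):
I\!\left(Y_t ;\, X^{t-1} \mid Y^{t-1}\right)
  = H\!\left(Y_t \mid Y^{t-1}\right) - H\!\left(Y_t \mid Y^{t-1}, X^{t-1}\right)

% X does not Granger-cause Y at time t iff this quantity is zero.
```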

49 citations


Book ChapterDOI
15 Dec 2008
TL;DR: A general criterion function for feature selection using mutual information is introduced, which brings current mutual-information-based selectors together under a unifying scheme, and an experimental comparison of eight typical filter-style mutual-information-based feature selection algorithms on thirty-three datasets is presented.
Abstract: In real-world applications, data are often represented by hundreds or thousands of features. Most of these features, however, are redundant or irrelevant, and their presence can directly degrade the performance of learning algorithms. Hence, choosing the most salient features is a compelling requirement in practice. A large number of feature selection methods using various strategies have been proposed; among them, methods based on mutual information have recently gained popularity. In this paper, a general criterion function for feature selection using mutual information is first introduced. This function brings current mutual-information-based selectors together under a unifying scheme. An experimental comparison of eight typical filter-style mutual-information-based feature selection algorithms on thirty-three datasets is then presented. We evaluate them from four essential aspects, and the results show that none of these methods significantly outperforms the others. Even so, the conditional mutual information feature selection algorithm dominates the other methods overall when training time is not a concern.
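The paper's criterion function is not reproduced here. As an illustration of the kind of unifying scheme described, the sketch below implements a generic greedy filter criterion of the form J(f) = I(f;C) − β·Σ_s I(f;s) + γ·Σ_s I(f;s|C) over already-selected features s; particular choices of (β, γ) recover several well-known mutual-information selectors. All function names, the discrete-data assumption, and the parameter mapping in the comments are assumptions of this sketch.

```python
import numpy as np

def mi(a, b):
    """Mutual information (nats) between two discrete label arrays."""
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1)
    pxy = joint / len(a)
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def cmi(a, b, c):
    """Conditional mutual information I(a; b | c) for discrete arrays."""
    return sum((c == v).mean() * mi(a[c == v], b[c == v]) for v in np.unique(c))

def greedy_select(X, y, k, beta=1.0, gamma=0.0):
    """Generic greedy filter criterion (illustrative parameter mapping only):
    J(f) = I(f;y) - beta * sum_s I(f;s) + gamma * sum_s I(f;s|y).
    beta = gamma = 0 ranks by relevance alone; beta > 0, gamma = 0
    penalizes redundancy; beta = gamma > 0 also rewards class-conditional
    dependence among features."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(f):
            relevance = mi(X[:, f], y)
            redundancy = sum(mi(X[:, f], X[:, s]) for s in selected)
            synergy = sum(cmi(X[:, f], X[:, s], y) for s in selected)
            return relevance - beta * redundancy + gamma * synergy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Tiny illustrative run on random discrete data
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 6))
y = (X[:, 0] + X[:, 2]) % 3
print(greedy_select(X, y, k=3, beta=0.5, gamma=0.5))
```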

40 citations


Journal ArticleDOI
TL;DR: This paper studies generic classification problems that include a rejected, or unknown, class, and presents the basic formulas and a schematic diagram of classification learning based on information theory.
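The paper's formulas are not reproduced here; as a hedged, generic version of the information-theoretic classification criterion alluded to, with T the true class, Y the classifier's decision, and a reject/unknown label included among the decision values (notation assumed, not the authors'):

```latex
% Generic mutual-information learning criterion (assumed notation):
% maximize the information the decision Y (which may be "reject/unknown")
% retains about the true class T.
\max \; I(T;Y) = H(T) - H(T \mid Y)
```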

27 citations


Journal ArticleDOI
TL;DR: This paper considers the problems of redundancy, complementarity, and consistency of information sources, under the assumption that the information elements of each source are related in a lattice structure; it proposes various methods for integrating information sources and establishes relationships between these methods.

22 citations


Posted Content
TL;DR: A class of measures to quantify the contextual nature of the information in sets of objects, based on Kolmogorov's intrinsic complexity, is proposed; these measures discount both random and redundant information and are inherent in that they do not require a defined state space to quantify the information.
Abstract: It is not obvious what fraction of all the potential information residing in the molecules and structures of living systems is significant or meaningful to the system. Sets of random sequences or identically repeated sequences, for example, would be expected to contribute little or no useful information to a cell. This issue of quantitation of information is important since the ebb and flow of biologically significant information is essential to our quantitative understanding of biological function and evolution. Motivated specifically by these problems of biological information, we propose here a class of measures to quantify the contextual nature of the information in sets of objects, based on Kolmogorov's intrinsic complexity. Such measures discount both random and redundant information and are inherent in that they do not require a defined state space to quantify the information. The maximization of this new measure, which can be formulated in terms of the universal information distance, appears to have several useful and interesting properties, some of which we illustrate with examples.
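The proposed class of measures is not reproduced here; for reference, the universal information distance mentioned in the abstract and its standard normalized, compression-based approximation are (with K(·) the Kolmogorov complexity and C(·) the compressed length under a real compressor):

```latex
% Universal (max) information distance between objects x and y:
E(x,y) = \max\{\, K(x \mid y),\; K(y \mid x) \,\}

% Normalized information distance:
\mathrm{NID}(x,y) = \frac{\max\{K(x \mid y),\, K(y \mid x)\}}{\max\{K(x),\, K(y)\}}

% Practical approximation with a real compressor C
% (normalized compression distance):
\mathrm{NCD}(x,y) = \frac{C(xy) - \min\{C(x),\, C(y)\}}{\max\{C(x),\, C(y)\}}
```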

22 citations


Journal ArticleDOI
TL;DR: In this article, the authors use a unique data set of observed email content from 1382 executive recruiting teams and detailed accounting data on their productivity to examine both the antecedents and performance effects of shared versus diverse information and find clear evidence of an inverted-U shaped relationship between mutual information and team productivity.
Abstract: A tension exists between two well-established streams of literature on the performance of teams. One stream contends that teams with diverse backgrounds, social structures, knowledge, and experience function more effectively because they bring novel information to bear on problems that cannot be solved by groups of homogeneous individuals. In contrast, the literature on mutual knowledge contends that shared information and experience are essential to effective communication, trust, understanding, and coordination among team members. Furthermore, several distinct antecedents of mutual information and knowledge have been hypothesized, making it difficult to manage information overlap in teams. In this paper, we use a unique data set of observed email content from 1382 executive recruiting teams and detailed accounting data on their productivity to examine both the antecedents and performance effects of shared versus diverse information. We find clear evidence of an inverted-U-shaped relationship between mutual information and team productivity. A significant amount of information overlap among team members is associated with higher performance, while extremes of too little or too much mutual information hamper performance. We also find that geographic dispersion and social network distance are strong predictors of mutual knowledge failures, while demographic dissimilarity and organizational distance do not predict the degree of mutual information in our data. Our work helps bring together the divergent streams of literature on mutual knowledge, information diversity, and the management of team performance.

10 citations


Proceedings Article
13 Jul 2008
TL;DR: An approximation of mutual information based on a soft extension of d-separation (a graphical test of independence in Bayesian networks) is proposed. The approach focuses primarily on polytree networks, which are sufficient for the application the authors consider, although potential extensions of the approximation to general networks are also discussed.
Abstract: We consider the problem of computing mutual information between many pairs of variables in a Bayesian network. This task is relevant to a new class of Generalized Belief Propagation (GBP) algorithms that characterizes Iterative Belief Propagation (IBP) as a polytree approximation found by deleting edges in a Bayesian network. By computing, in the simplified network, the mutual information between variables across a deleted edge, we can estimate the impact that recovering the edge might have on the approximation. Unfortunately, it is computationally impractical to compute such scores for networks over many variables having large state spaces. So that edge recovery can scale to such networks, we propose in this paper an approximation of mutual information which is based on a soft extension of d-separation (a graphical test of independence in Bayesian networks). We focus primarily on polytree networks, which are sufficient for the application we consider, although we discuss potential extensions of the approximation to general networks as well. Empirically, we show that our proposal is often as effective as mutual information for the task of edge recovery, with orders of magnitude savings in computation time in larger networks. Our results lead to a concrete realization of GBP, admitting improvements to IBP approximations with only a modest amount of computational effort.
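The soft d-separation score itself is not sketched here. The snippet below only illustrates the baseline quantity being approximated: the exact mutual information between a pair of variables in a small polytree, computed by brute force from the full joint table, whose exponential cost is what motivates the paper's approximation. The network, CPTs, and function names are illustrative assumptions.

```python
import numpy as np
from itertools import product

# Tiny polytree A -> B -> C with binary variables; the CPTs are
# illustrative assumptions, not taken from the paper.
p_a = np.array([0.6, 0.4])                      # P(A)
p_b_given_a = np.array([[0.9, 0.1],             # P(B | A = a), row a
                        [0.3, 0.7]])
p_c_given_b = np.array([[0.8, 0.2],             # P(C | B = b), row b
                        [0.25, 0.75]])

# Brute-force joint table P(A, B, C); exponential in the number of
# variables, which is the cost the soft d-separation score avoids.
joint = np.zeros((2, 2, 2))
for a, b, c in product(range(2), repeat=3):
    joint[a, b, c] = p_a[a] * p_b_given_a[a, b] * p_c_given_b[b, c]

def pairwise_mi(joint, i, j):
    """Exact I(X_i; X_j) in nats from a full joint probability table."""
    other_axes = tuple(k for k in range(joint.ndim) if k not in (i, j))
    pij = joint.sum(axis=other_axes)
    if i > j:                       # keep the axis order as (i, j)
        pij = pij.T
    pi = pij.sum(axis=1, keepdims=True)
    pj = pij.sum(axis=0, keepdims=True)
    nz = pij > 0
    return float(np.sum(pij[nz] * np.log(pij[nz] / (pi @ pj)[nz])))

print(pairwise_mi(joint, 0, 2))     # I(A; C) > 0: A and C are dependent
```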

7 citations


Journal ArticleDOI
TL;DR: A method based on the mutual information approach is developed that evaluates the information content of communication between interacting individuals through correlations of their behavior patterns. It predicts that correlated interactions of the indirect-reciprocity type, together with affective behavior and selection rules that change over time, are necessary conditions for the emergence of significant information exchange.

5 citations



Proceedings Article
03 Jul 2008
TL;DR: An important improvement related to the computation and use of the Mutual Information index in Pseudobagging, a technique that adapts “bagging” to the unsupervised context, is presented, and the use of such an index is proposed to improve the Pseudobagging voting scheme for determining the final partition of the data.
Abstract: We present an important improvement related to the computation and use of the Mutual Information index in Pseudobagging, a technique that adapts “bagging” to the unsupervised context. The Mutual Information index plays a key role in this technique, assessing the quality of a partition. We propose the use of such an index to improve the Pseudobagging voting scheme for determining the final partition of the data. Issues related to the estimation of the Mutual Information index in the multivariate continuous case become crucial for the application of Pseudobagging to real data: we discuss some practical approaches to its computation in this situation. Finally, experimental results are presented, related to the application of the new “pooled voting” scheme and to the evaluation of the impact of different computing methods for Mutual Information.
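The Pseudobagging scheme itself is not reproduced here; the sketch below only shows a generic mutual information index between two partitions (cluster label vectors), computed from their contingency table, of the kind such a voting scheme can use to compare candidate partitions. It does not address the multivariate continuous estimation issue discussed in the abstract, and all names are illustrative assumptions.

```python
import numpy as np

def partition_mutual_information(labels_a, labels_b):
    """Mutual information (nats) between two partitions of the same items,
    computed from their contingency table. A generic partition-agreement
    index, not necessarily the exact index used in the paper."""
    a_vals, ai = np.unique(labels_a, return_inverse=True)
    b_vals, bi = np.unique(labels_b, return_inverse=True)
    contingency = np.zeros((len(a_vals), len(b_vals)))
    np.add.at(contingency, (ai, bi), 1)
    pab = contingency / len(labels_a)
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    nz = pab > 0
    return float(np.sum(pab[nz] * np.log(pab[nz] / (pa @ pb)[nz])))

# Example: two candidate clusterings of ten items
part1 = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
part2 = np.array([1, 1, 1, 0, 0, 0, 0, 2, 2, 2])
print(partition_mutual_information(part1, part2))
```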