
Showing papers on "Multivariate mutual information published in 2004"


Journal ArticleDOI
TL;DR: Using mutual information as a similarity measure enables the detection of non-linear correlations in gene expression datasets, thereby extending the frequently applied linear correlation measures that are often used on an ad-hoc basis without further justification.
Abstract: The information theoretic concept of mutual information provides a general framework to evaluate dependencies between variables. In the context of the clustering of genes with similar patterns of expression, it has been suggested as a general quantity of similarity to extend commonly used linear measures. Since mutual information is defined in terms of discrete variables, its application to continuous data requires the use of binning procedures, which can lead to significant numerical errors for datasets of small or moderate size. In this work, we propose a method for the numerical estimation of mutual information from continuous data. We investigate the characteristic properties arising from the application of our algorithm and show that our approach outperforms commonly used algorithms: the significance, as a measure of the power of distinction from random correlation, is significantly increased. This concept is subsequently illustrated on two large-scale gene expression datasets, and the results are compared to those obtained using other similarity measures. A C++ source code of our algorithm is available for non-commercial use from kloska@scienion.de upon request. The utilisation of mutual information as a similarity measure enables the detection of non-linear correlations in gene expression datasets. Frequently applied linear correlation measures, which are often used on an ad-hoc basis without further justification, are thereby extended.
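As a point of reference for the binning issue the abstract raises, below is a minimal sketch of naive fixed-width-binning estimation of mutual information from continuous samples. This is only the baseline whose numerical errors the paper's estimator is designed to reduce, not the authors' algorithm; the bin count and variable names are illustrative.

```python
import numpy as np

def binned_mutual_information(x, y, bins=10):
    """Naive estimate of I(X;Y) in bits from continuous samples via fixed-width binning.

    This is the baseline estimator; it is biased upward for small samples,
    which is the problem more careful estimators address.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)     # marginal of X
    py = pxy.sum(axis=0, keepdims=True)     # marginal of Y
    nonzero = pxy > 0
    return float(np.sum(pxy[nonzero] * np.log2(pxy[nonzero] / (px @ py)[nonzero])))

# Example: a nonlinear (quadratic) dependence that a linear correlation measure would miss.
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = x**2 + 0.1 * rng.normal(size=2000)
print(binned_mutual_information(x, y, bins=12))
```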

297 citations


Journal ArticleDOI
TL;DR: An introduction to the connection between predictability and information theory is given, and new connections between these concepts are derived.
Abstract: This paper gives an introduction to the connection between predictability and information theory, and derives new connections between these concepts. A system is said to be unpredictable if the forecast distribution, which gives the most complete description of the future state based on all available knowledge, is identical to the climatological distribution, which describes the state in the absence of time lag information. It follows that a necessary condition for predictability is for the forecast and climatological distributions to differ. Information theory provides a powerful framework for quantifying the difference between two distributions that agrees with intuition about predictability. Three information theoretic measures have been proposed in the literature: predictive information, relative entropy, and mutual information. These metrics are discussed with the aim of clarifying their similarities and differences. All three metrics have attractive properties for defining predictability ...
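For orientation, the three measures named in the abstract are commonly written as follows. The notation here is mine, not the paper's: p(x|i) is the forecast distribution given initial condition i, q(x) the climatological distribution, and the stated equalities assume q is the marginal of the forecast distribution over initial conditions.

```latex
% Predictive information of initial condition i (entropy deficit of the forecast)
P_i = H[q] - H[\,p(\cdot \mid i)\,]

% Relative entropy of the forecast with respect to climatology
R_i = \int p(x \mid i)\,\log\frac{p(x \mid i)}{q(x)}\,dx

% Mutual information between the future state X and the initial condition I;
% it equals the average of R_i (and of P_i) over initial conditions when
% q(x) = \int p(x \mid i)\, p(i)\, di
I(X;\,\mathcal{I}) = \iint p(x, i)\,\log\frac{p(x, i)}{q(x)\,p(i)}\,dx\,di
```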

163 citations


01 Jan 2004
TL;DR: This paper presents a number of data analyses making use of the concept of mutual information, taken from the fields of sports, neuroscience, and forest science.
Abstract: This paper presents a number of data analyses making use of the concept of mutual information. Statistical uses of mutual information are seen to include: comparative studies, variable selection, estimation of parameters and assessment of model fit. The examples are taken from the fields of sports, neuroscience, and forest science. There is an Appendix providing proofs.
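One of the uses listed in the abstract, variable selection, can be illustrated with a small self-contained sketch: candidate predictors are ranked by their estimated mutual information with the outcome. The synthetic data and function names are my own and are not the paper's analyses.

```python
import numpy as np

def mutual_information_discrete(x, y):
    """I(X;Y) in bits for two discrete (categorical) sample vectors."""
    values_x, xi = np.unique(x, return_inverse=True)
    values_y, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(values_x), len(values_y)))
    np.add.at(joint, (xi, yi), 1)           # contingency table of co-occurrences
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Toy variable selection: rank candidate predictors by I(feature; outcome).
rng = np.random.default_rng(1)
outcome = rng.integers(0, 2, size=500)
features = {
    "informative": (outcome + rng.integers(0, 2, size=500)) % 3,  # depends on outcome
    "noise": rng.integers(0, 3, size=500),                         # independent of outcome
}
ranking = sorted(features,
                 key=lambda name: mutual_information_discrete(features[name], outcome),
                 reverse=True)
print(ranking)  # the informative feature should rank first
```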

81 citations


01 Jan 2004
TL;DR: The theoretical foundations for a definition of distance on ontologies are laid out in this paper, and the distance measure is defined using the well-known concepts of entropy and mutual information from information theory.
Abstract: The theoretical foundations for a definition of distance on ontologies are laid out in this paper. The distance measure is defined using the well-known concepts of entropy and mutual information from information theory. These formal methods are adapted for practical use in ontologies through the definition of useful discrete random variables. These include centrality measures such as degree, closeness, or betweenness.
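The abstract does not spell out the distance formula. One standard way to turn entropy and mutual information into a metric, on which a construction of this kind could rest, is the variation of information, applied to discrete random variables such as the degree of a randomly drawn concept node. This is shown here purely as an illustration, not necessarily the paper's exact definition.

```latex
% Variation of information between discrete random variables X and Y
% (a metric built from entropy and mutual information)
D(X, Y) = H(X) + H(Y) - 2\,I(X;Y) = H(X \mid Y) + H(Y \mid X)
```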

26 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that most of the empirical evidence about market behaviors documented in the literature can be explained by a new information theory generalized from Shannon's entropy theory of information.
Abstract: Some recent empirical work indicates that investor performance and market patterns are primarily information driven rather than a behavioral phenomenon. However, the Grossman and Stiglitz information theory and its variations offer little guidance in identifying informed investors and in distinguishing between securities with scarce information and those with widely available information. We show that most of the empirical evidence about market behaviors documented in the literature can be explained by a new information theory generalized from Shannon's entropy theory of information. Investor performance and market patterns are the results of information processing by investors of different sizes with different background knowledge.

17 citations


Proceedings ArticleDOI
12 May 2004
TL;DR: The proposed measure strictly follows information theory, in contrast to a number of heuristic methods that have been proposed to include spatial information in mutual information, and it solves the problem of efficiently estimating multi-feature mutual information from sparse high-dimensional histograms.
Abstract: In the last decade, information-theoretic similarity measures, especially mutual information and its derivatives, have proven to be accurate measures for rigid and non-rigid, mono- and multi-modal image registration. However, these measures are sometimes not robust enough, especially in cases of poor image quality. This is most likely due to the lack of spatial information in the measure, as usually only intensities are employed to measure similarity between images. Spatial information in the form of intensity gradients or second derivatives may be included in information-theoretic similarity measures. This paper presents a novel method for efficiently combining multiple features into the estimation of mutual information. The proposed measure, under certain assumptions on the feature probability distribution, strictly follows information theory, in contrast to a number of heuristic methods that have been proposed to include spatial information in mutual information. The novel approach solves the problem of efficient estimation of multi-feature mutual information from sparse high-dimensional histograms. The proposed measure was tested on the widely used Vanderbilt image database. Results indicate that multi-feature mutual information outperforms the single-feature mutual information measure. The contribution of additional image features to registration is especially significant in cases where the standard mutual information measure fails. Moreover, it is expected that non-rigid registration may also benefit from the proposed multi-feature mutual information measure.
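To make the term concrete, the sketch below computes a naive two-feature mutual information between two images by quantising each feature (for example intensity and gradient magnitude) and building the full joint histogram. This brute-force construction is exactly what becomes sparse and high-dimensional as features are added, which is the problem the paper's estimator addresses; the function below is therefore only the baseline, not the proposed measure, and all names are illustrative.

```python
import numpy as np

def multifeature_mi(features_a, features_b, bins=8):
    """Naive multi-feature mutual information between two images.

    features_a, features_b: arrays of shape (n_pixels, n_features), e.g. columns
    [intensity, gradient_magnitude] sampled at corresponding pixel locations.
    Each feature is quantised into `bins` levels and the full joint histogram
    over all features of both images is built directly.
    """
    def quantise(f):
        # map each feature column to integer bin indices 0..bins-1
        out = np.empty(f.shape, dtype=int)
        for j in range(f.shape[1]):
            edges = np.linspace(f[:, j].min(), f[:, j].max() + 1e-12, bins + 1)
            out[:, j] = np.clip(np.digitize(f[:, j], edges) - 1, 0, bins - 1)
        return out

    def encode(q):
        # collapse the multi-feature bin of each pixel into a single integer label
        code = np.zeros(q.shape[0], dtype=int)
        for j in range(q.shape[1]):
            code = code * bins + q[:, j]
        return code

    qa, qb = quantise(features_a), quantise(features_b)
    a, b = encode(qa), encode(qb)
    joint = np.zeros((bins ** qa.shape[1], bins ** qb.shape[1]))
    np.add.at(joint, (a, b), 1)
    pab = joint / joint.sum()
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    nz = pab > 0
    return float(np.sum(pab[nz] * np.log2(pab[nz] / (pa @ pb)[nz])))

# Example usage on synthetic "images": intensity plus gradient magnitude per pixel.
rng = np.random.default_rng(0)
img_a = rng.normal(size=(64, 64))
img_b = img_a + 0.2 * rng.normal(size=(64, 64))   # roughly registered counterpart
grad = lambda im: np.hypot(*np.gradient(im))
feats = lambda im: np.column_stack([im.ravel(), grad(im).ravel()])
print(multifeature_mi(feats(img_a), feats(img_b)))
```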

11 citations


Proceedings ArticleDOI
02 Nov 2004
TL;DR: In this article, a multivariate extension of the mutual information measure was proposed to improve the success rate of automated registration by making use of the information available in multiple images rather than a single pair.
Abstract: In multispectral imaging, automated cross-spectral (band-to-band) image registration is difficult to achieve with a reliability approaching 100%. This is particularly true when registering infrared to visible imagery, where contrast reversals are common and similarity is often lacking. Algorithms that use mutual information as a similarity measure have been shown to work well in the presence of contrast reversal. However, weak similarity between the long-wave infrared (LWIR) bands and shorter wavelengths remains a problem. A method is presented in this paper for registering multiple images simultaneously rather than one pair at a time using a multivariate extension of the mutual information measure. This approach improves the success rate of automated registration by making use of the information available in multiple images rather than a single pair. This approach is further enhanced by including a cyclic consistency check, for example registering band A to B, B to C, and C to A. The cyclic consistency check provides an automated measure of success allowing a different combination of bands to be used in the event of a failure. Experiments were conducted using imagery from the Department of Energy’s Multispectral Thermal Imager satellite. The results show a significantly improved success rate.
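The abstract does not give the exact multivariate measure; one common multivariate generalisation of mutual information, total correlation (multi-information), is sketched below for a set of co-registered bands, purely as an illustration of what such an extension looks like. The cyclic consistency check described above would then compose the estimated transforms around a loop (A to B, B to C, C to A) and flag a failure when the composition deviates too far from the identity.

```python
import numpy as np

def total_correlation(images, bins=16):
    """Total correlation (multi-information) of co-registered images.

    images: list of equally shaped arrays (one per spectral band), compared at
    corresponding pixels. Total correlation, sum_i H(X_i) - H(X_1,...,X_n), is
    one common multivariate generalisation of mutual information; the exact
    measure used in the paper may differ.
    """
    # quantise each band to `bins` grey levels
    q = []
    for im in images:
        flat = im.ravel().astype(float)
        edges = np.linspace(flat.min(), flat.max() + 1e-12, bins + 1)
        q.append(np.clip(np.digitize(flat, edges) - 1, 0, bins - 1))

    joint = np.zeros((bins,) * len(images))
    np.add.at(joint, tuple(q), 1)          # n-dimensional joint histogram
    p = joint / joint.sum()

    def entropy(prob):
        nz = prob > 0
        return float(-np.sum(prob[nz] * np.log2(prob[nz])))

    joint_entropy = entropy(p)
    marginal_entropies = [
        entropy(p.sum(axis=tuple(j for j in range(len(images)) if j != i)))
        for i in range(len(images))
    ]
    return sum(marginal_entropies) - joint_entropy
```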

4 citations


01 Jan 2004
TL;DR: This work describes a principled bound maximisation procedure for Mutual Information learning of population codes in a simple point neural model, and compares it with other approaches.
Abstract: The goal of neural processing assemblies is varied, and in many cases still rather unclear. However, a possibly reasonable subgoal is that sensory information may be encoded efficiently in a population of neurons. In this context, Mutual Information is a long-studied measure of coding efficiency, and many attempts to apply this to population coding have been made. However, this is a numerically intractable task, and most previous studies redefine the criterion in the form of an approximation to Mutual Information, the Fisher Information being one such well-known approach. Here we describe a principled bound maximisation procedure for Mutual Information learning of population codes in a simple point neural model, and compare it with other approaches.
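A widely used principled lower bound of this kind is the variational bound obtained by replacing the intractable posterior with a decoder distribution q(x|r); whether this is the exact bound maximised in this work cannot be confirmed from the abstract alone.

```latex
% For any decoder q(x \mid r), with X the stimulus and R the population response:
I(X;R) = H(X) - H(X \mid R) \;\ge\; H(X) + \big\langle \ln q(x \mid r) \big\rangle_{p(x,r)},
% with equality when q(x \mid r) = p(x \mid r); maximising the right-hand side
% over the neural encoding and the decoder q maximises a lower bound on I(X;R).
```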

1 citation


01 Jan 2004
TL;DR: A hierarchy of information-like functions, here named the information correlation functions, where each function of the hierarchy may be thought of as the information between the variables it depends upon, is described.
Abstract: The topic of this paper is a hierarchy of information-like functions, here named the information correlation functions, where each function of the hierarchy may be thought of as the information shared among the variables it depends upon. The information correlation functions are particularly suited to describing the emergence of complex behaviors due to many-body or many-agent processes. They are also well suited to quantifying how the information carried by a set of variables or agents decomposes over its subsets. In more graphical language, they provide the information theoretic basis for understanding the synergistic and non-synergistic components of a system, and as such should serve as a powerful toolkit for the analysis of the complexity structure of complex many-agent systems. The information correlation functions are the natural generalization, to an arbitrary number of sets of variables, of the sequence starting with the entropy function (one set of variables) and the mutual information function (two sets). We start by describing the traditional measures of information (entropy) and mutual information.
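Up to sign and naming conventions, a hierarchy of this kind extends the sequence entropy (one set) and mutual information (two sets) by inclusion-exclusion over joint entropies; the notation below is mine and not necessarily the paper's exact definition.

```latex
I_1(X)     = H(X)
I_2(X;Y)   = H(X) + H(Y) - H(X,Y)
I_3(X;Y;Z) = H(X) + H(Y) + H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z)
% general term over the variable set S = \{X_1, \dots, X_n\}
I_n(X_1;\dots;X_n) = -\sum_{\emptyset \neq T \subseteq S} (-1)^{|T|}\, H(T)
```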