TL;DR: This paper presents an approach to localise this embedded common information through an indirect method of estimating mutual information between all signal sources, which leads to faster convergence with much reduced spurious associations.

Abstract: Relating information originating from disparate sensors observing a given scene is a challenging task, particularly when an appropriate model of the environment or the behaviour of any particular object within it is not available. One possible strategy to address this task is to examine whether the sensor outputs contain information which can be attributed to a common cause. In this paper, we present an approach to localise this embedded common information through an indirect method of estimating mutual information between all signal sources. Ability of L 1 regularization to enforce sparseness of the solution is exploited to identify a subset of signals that are related to each other, from among a large number of sensor outputs. As opposed to the conventional L 2 regularization, the proposed method leads to faster convergence with much reduced spurious associations. Simulation and experimental results are presented to validate the findings.

TL;DR: A novel high-resolution talking face generation model for arbitrary person is proposed by discovering the cross-modality coherence via Mutual Information Approximation (MIA) by assuming the modality difference between audio and video is larger that of real video and generated video.

Abstract: Given an arbitrary speech clip and a facial image, talking face generation aims to synthesize a talking face video with precise lip synchronization as well as a smooth transition of facial motion over the entire video speech. Most existing methods mainly focus on either disentangling the information in a single image or learning temporal information between frames. However, speech audio and video often have cross-modality coherence that has not been well addressed during synthesis. Therefore, this paper proposes a novel high-resolution talking face generation model for arbitrary person by discovering the cross-modality coherence via Mutual Information Approximation (MIA). By assuming the modality difference between audio and video is larger that of real video and generated video, we estimate mutual information between real audio and video, and then use a discriminator to enforce generated video distribution approach real video distribution. Furthermore, we introduce a dynamic attention technique on the mouth to enhance the robustness during the training stage. Experimental results on benchmark dataset LRW transcend the state-of-the-art methods on prevalent metrics with robustness on gender, pose variations and high-resolution synthesizing.

24 citations

Cites background from "Cross-modal localization through mu..."

...As a quantity for capturing non-linear statistical dependencies between variables, it has found applications in a wide range of domains and tasks, including clustering gene expression data [28], feature selection [30] and cross-modality localization [1]....

TL;DR: A novel arbitrary talking face generation framework is proposed by discovering the audio-visual coherence via the proposed Asymmetric Mutual Information Estimator (AMIE) and a Dynamic Attention (DA) block by selectively focusing the lip area of the input image during the training stage, to further enhance lip synchronization.

Abstract: Talking face generation aims to synthesize a face video with precise lip synchronization as well as a smooth transition of facial motion over the entire video via the given speech clip and facial image. Most existing methods mainly focus on either disentangling the information in a single image or learning temporal information between frames. However, cross-modality coherence between audio and video information has not been well addressed during synthesis. In this paper, we propose a novel arbitrary talking face generation framework by discovering the audio-visual coherence via the proposed Asymmetric Mutual Information Estimator (AMIE). In addition, we propose a Dynamic Attention (DA) block by selectively focusing the lip area of the input image during the training stage, to further enhance lip synchronization. Experimental results on benchmark LRW dataset and GRID dataset transcend the state-of-the-art methods on prevalent metrics with robust high-resolution synthesizing on gender and pose variations.

TL;DR: This paper presents a novel approach to 2D street map-based localization for mobile robots that navigate mainly in urban sidewalk environments and employs a computationally efficient estimator of Squared-loss Mutual Information through which it achieved near real-time performance.

Abstract: In this paper, we present a novel approach to 2D street map-based localization for mobile robots that navigate mainly in urban sidewalk environments. Recently, localization based on the map built by Simultaneous Localization and Mapping (SLAM) has been widely used with great success. However, such methods limit robot navigation to environments whose maps are prebuilt. In other words, robots cannot navigate in environments that they have not previously visited. We aim to relax the restriction by employing existing 2D street maps for localization. Finding an exact match between sensor data and a street map is challenging because, unlike maps built by robots, street maps lack detailed information about the environment (such as height and color). Our approach to coping with this difficulty is to maximize statistical dependence between sensor data and the map, and localization is achieved through maximization of a Mutual Information-based criterion. Our method employs a computationally efficient estimator of Squared-loss Mutual Information through which we achieved near real-time performance. The effectiveness of our method is evaluated through localization experiments using real-world data sets.

5 citations

Cites methods from "Cross-modal localization through mu..."

...Although MI has already been used to register multimodal sensor data [14] [15] [16], to our knowledge, it has never been applied to street map-based localization....

TL;DR: This paper proposes a novel localization approach that can be applied to sidewalks based on existing 2D street maps and employs a computationally efficient estimator of squared-loss mutual information, through which it achieves near real-time performance.

Abstract: Recently, localization methods based on detailed maps constructed using simultaneous localization and mapping have been widely used for mobile robot navigation. However, the cost of building such maps increases rapidly with expansion of the target environment. Here, we consider the problem of localization of a mobile robot based on existing 2D street maps. Although a large amount of research on this topic has been reported, the majority of the previous studies have focused on car-like vehicles that navigate on roadways; thus, the efficacy of such methods for sidewalks is not yet known. In this paper, we propose a novel localization approach that can be applied to sidewalks. Whereas roadways are typically marked, e.g. by white lines, sidewalks are not and, therefore, road boundary detection is not straightforward. Thus, obtaining exact correspondence between sensor data and a street map is complex. Our approach to overcoming this difficulty is to maximize the statistical dependence between the sensor data ...

TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.

Abstract: Preface to the Second Edition. Preface to the First Edition. Acknowledgments for the Second Edition. Acknowledgments for the First Edition. 1. Introduction and Preview. 1.1 Preview of the Book. 2. Entropy, Relative Entropy, and Mutual Information. 2.1 Entropy. 2.2 Joint Entropy and Conditional Entropy. 2.3 Relative Entropy and Mutual Information. 2.4 Relationship Between Entropy and Mutual Information. 2.5 Chain Rules for Entropy, Relative Entropy, and Mutual Information. 2.6 Jensen's Inequality and Its Consequences. 2.7 Log Sum Inequality and Its Applications. 2.8 Data-Processing Inequality. 2.9 Sufficient Statistics. 2.10 Fano's Inequality. Summary. Problems. Historical Notes. 3. Asymptotic Equipartition Property. 3.1 Asymptotic Equipartition Property Theorem. 3.2 Consequences of the AEP: Data Compression. 3.3 High-Probability Sets and the Typical Set. Summary. Problems. Historical Notes. 4. Entropy Rates of a Stochastic Process. 4.1 Markov Chains. 4.2 Entropy Rate. 4.3 Example: Entropy Rate of a Random Walk on a Weighted Graph. 4.4 Second Law of Thermodynamics. 4.5 Functions of Markov Chains. Summary. Problems. Historical Notes. 5. Data Compression. 5.1 Examples of Codes. 5.2 Kraft Inequality. 5.3 Optimal Codes. 5.4 Bounds on the Optimal Code Length. 5.5 Kraft Inequality for Uniquely Decodable Codes. 5.6 Huffman Codes. 5.7 Some Comments on Huffman Codes. 5.8 Optimality of Huffman Codes. 5.9 Shannon-Fano-Elias Coding. 5.10 Competitive Optimality of the Shannon Code. 5.11 Generation of Discrete Distributions from Fair Coins. Summary. Problems. Historical Notes. 6. Gambling and Data Compression. 6.1 The Horse Race. 6.2 Gambling and Side Information. 6.3 Dependent Horse Races and Entropy Rate. 6.4 The Entropy of English. 6.5 Data Compression and Gambling. 6.6 Gambling Estimate of the Entropy of English. Summary. Problems. Historical Notes. 7. Channel Capacity. 7.1 Examples of Channel Capacity. 7.2 Symmetric Channels. 7.3 Properties of Channel Capacity. 7.4 Preview of the Channel Coding Theorem. 7.5 Definitions. 7.6 Jointly Typical Sequences. 7.7 Channel Coding Theorem. 7.8 Zero-Error Codes. 7.9 Fano's Inequality and the Converse to the Coding Theorem. 7.10 Equality in the Converse to the Channel Coding Theorem. 7.11 Hamming Codes. 7.12 Feedback Capacity. 7.13 Source-Channel Separation Theorem. Summary. Problems. Historical Notes. 8. Differential Entropy. 8.1 Definitions. 8.2 AEP for Continuous Random Variables. 8.3 Relation of Differential Entropy to Discrete Entropy. 8.4 Joint and Conditional Differential Entropy. 8.5 Relative Entropy and Mutual Information. 8.6 Properties of Differential Entropy, Relative Entropy, and Mutual Information. Summary. Problems. Historical Notes. 9. Gaussian Channel. 9.1 Gaussian Channel: Definitions. 9.2 Converse to the Coding Theorem for Gaussian Channels. 9.3 Bandlimited Channels. 9.4 Parallel Gaussian Channels. 9.5 Channels with Colored Gaussian Noise. 9.6 Gaussian Channels with Feedback. Summary. Problems. Historical Notes. 10. Rate Distortion Theory. 10.1 Quantization. 10.2 Definitions. 10.3 Calculation of the Rate Distortion Function. 10.4 Converse to the Rate Distortion Theorem. 10.5 Achievability of the Rate Distortion Function. 10.6 Strongly Typical Sequences and Rate Distortion. 10.7 Characterization of the Rate Distortion Function. 10.8 Computation of Channel Capacity and the Rate Distortion Function. Summary. Problems. Historical Notes. 11. Information Theory and Statistics. 11.1 Method of Types. 11.2 Law of Large Numbers. 11.3 Universal Source Coding. 11.4 Large Deviation Theory. 11.5 Examples of Sanov's Theorem. 11.6 Conditional Limit Theorem. 11.7 Hypothesis Testing. 11.8 Chernoff-Stein Lemma. 11.9 Chernoff Information. 11.10 Fisher Information and the Cram-er-Rao Inequality. Summary. Problems. Historical Notes. 12. Maximum Entropy. 12.1 Maximum Entropy Distributions. 12.2 Examples. 12.3 Anomalous Maximum Entropy Problem. 12.4 Spectrum Estimation. 12.5 Entropy Rates of a Gaussian Process. 12.6 Burg's Maximum Entropy Theorem. Summary. Problems. Historical Notes. 13. Universal Source Coding. 13.1 Universal Codes and Channel Capacity. 13.2 Universal Coding for Binary Sequences. 13.3 Arithmetic Coding. 13.4 Lempel-Ziv Coding. 13.5 Optimality of Lempel-Ziv Algorithms. Compression. Summary. Problems. Historical Notes. 14. Kolmogorov Complexity. 14.1 Models of Computation. 14.2 Kolmogorov Complexity: Definitions and Examples. 14.3 Kolmogorov Complexity and Entropy. 14.4 Kolmogorov Complexity of Integers. 14.5 Algorithmically Random and Incompressible Sequences. 14.6 Universal Probability. 14.7 Kolmogorov complexity. 14.9 Universal Gambling. 14.10 Occam's Razor. 14.11 Kolmogorov Complexity and Universal Probability. 14.12 Kolmogorov Sufficient Statistic. 14.13 Minimum Description Length Principle. Summary. Problems. Historical Notes. 15. Network Information Theory. 15.1 Gaussian Multiple-User Channels. 15.2 Jointly Typical Sequences. 15.3 Multiple-Access Channel. 15.4 Encoding of Correlated Sources. 15.5 Duality Between Slepian-Wolf Encoding and Multiple-Access Channels. 15.6 Broadcast Channel. 15.7 Relay Channel. 15.8 Source Coding with Side Information. 15.9 Rate Distortion with Side Information. 15.10 General Multiterminal Networks. Summary. Problems. Historical Notes. 16. Information Theory and Portfolio Theory. 16.1 The Stock Market: Some Definitions. 16.2 Kuhn-Tucker Characterization of the Log-Optimal Portfolio. 16.3 Asymptotic Optimality of the Log-Optimal Portfolio. 16.4 Side Information and the Growth Rate. 16.5 Investment in Stationary Markets. 16.6 Competitive Optimality of the Log-Optimal Portfolio. 16.7 Universal Portfolios. 16.8 Shannon-McMillan-Breiman Theorem (General AEP). Summary. Problems. Historical Notes. 17. Inequalities in Information Theory. 17.1 Basic Inequalities of Information Theory. 17.2 Differential Entropy. 17.3 Bounds on Entropy and Relative Entropy. 17.4 Inequalities for Types. 17.5 Combinatorial Bounds on Entropy. 17.6 Entropy Rates of Subsets. 17.7 Entropy and Fisher Information. 17.8 Entropy Power Inequality and Brunn-Minkowski Inequality. 17.9 Inequalities for Determinants. 17.10 Inequalities for Ratios of Determinants. Summary. Problems. Historical Notes. Bibliography. List of Symbols. Index.

42,928 citations

"Cross-modal localization through mu..." refers background in this paper

...Mutual information between two high dimensional signals X1 and X2 can be indirectly estimated by mapping the signals into a lower dimensional space, by exploiting the data processing inequality [6] that defines lower bounds on mutual information....

TL;DR: A novel method for sparse signal recovery that in many situations outperforms ℓ1 minimization in the sense that substantially fewer measurements are needed for exact recovery.

Abstract: It is now well understood that (1) it is possible to reconstruct sparse signals exactly from what appear to be highly incomplete sets of linear measurements and (2) that this can be done by constrained l1 minimization. In this paper, we study a novel method for sparse signal recovery that in many situations outperforms l1 minimization in the sense that substantially fewer measurements are needed for exact recovery. The algorithm consists of solving a sequence of weighted l1-minimization problems where the weights used for the next iteration are computed from the value of the current solution. We present a series of experiments demonstrating the remarkable performance and broad applicability of this algorithm in the areas of sparse signal recovery, statistical estimation, error correction and image processing. Interestingly, superior gains are also achieved when our method is applied to recover signals with assumed near-sparsity in overcomplete representations—not by reweighting the l1 norm of the coefficient sequence as is common, but by reweighting the l1 norm of the transformed object. An immediate consequence is the possibility of highly efficient data acquisition protocols by improving on a technique known as Compressive Sensing.

4,511 citations

"Cross-modal localization through mu..." refers background in this paper

...L1 norm has found extensive use recently in solving convex optimisation problems from arbitrary signals estimated from incomplete set of measurements corrupted by noise [5] and also exhibits a very useful property, which is the preservation of the sparsity of the relationship between the multidimensional random variables....

TL;DR: An overview is presented of the medical image processing literature on mutual-information-based registration, an introduction for those new to the field, an overview for those working in the field and a reference for those searching for literature on a specific application.

Abstract: An overview is presented of the medical image processing literature on mutual-information-based registration. The aim of the survey is threefold: an introduction for those new to the field, an overview for those working in the field, and a reference for those searching for literature on a specific application. Methods are classified according to the different aspects of mutual-information-based registration. The main division is in aspects of the methodology and of the application. The part on methodology describes choices made on facets such as preprocessing of images, gray value interpolation, optimization, adaptations to the mutual information measure, and different types of geometrical transformations. The part on applications is a reference of the literature available on different modalities, on interpatient registration and on different anatomical objects. Comparison studies including mutual information are also considered. The paper starts with a description of entropy and mutual information and it closes with a discussion on past achievements and some future challenges.

3,010 citations

Additional excerpts

...Combining the nonparametric density estimator with an approximation of theoretical entropy has been widely described in the literature to overcome this problem [16]....

TL;DR: 1. Density estimation for exploring data 2. D density estimation for inference 3. Nonparametric regression for explore data 4. Inference with nonparametric regressors 5. Checking parametric regression models 6. Comparing regression curves and surfaces

Abstract: 1. Density estimation for exploring data 2. Density estimation for inference 3. Nonparametric regression for exploring data 4. Inference with nonparametric regression 5. Checking parametric regression models 6. Comparing regression curves and surfaces 7. Time series data 8. An introduction to semiparametric and additive models References

1,361 citations

"Cross-modal localization through mu..." refers methods in this paper

...It is generally not feasible to obtain a closed form solution to this problem without numerical methods such as Powell’s direction set method [3]....

TL;DR: A probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of cross-modal correspondence and nonparametric statistical density modeling techniques can characterize the mutual information between signals from different domains.

Abstract: Audio and visual signals arriving from a common source are detected using a signal-level fusion technique. A probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of cross-modal correspondence. Nonparametric statistical density modeling techniques can characterize the mutual information between signals from different domains. By comparing the mutual information between different pairs of signals, it is possible to identify which person is speaking a given utterance and discount errant motion or audio from other utterances or nonspeech events.

124 citations

"Cross-modal localization through mu..." refers methods in this paper

...The methods for mutual information (MI) estimation can be classified into two broad categories, based on whether mutual information is computed directly or the condition for maximum MI is obtained indirectly through an optimization process that does not involve computing MI [2], [7]....