A probabilistic multimodal generation model is introduced and used to derive an information-theoretic measure of cross-modal correspondence. Nonparametric statistical density modeling techniques can characterize the mutual information between signals from different domains.
Abstract:
Audio and visual signals arriving from a common source are detected using a signal-level fusion technique. A probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of cross-modal correspondence. Nonparametric statistical density modeling techniques can characterize the mutual information between signals from different domains. By comparing the mutual information between different pairs of signals, it is possible to identify which person is speaking a given utterance and discount errant motion or audio from other utterances or nonspeech events.
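The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a simple histogram plug-in estimate of mutual information in place of the paper's nonparametric density models, and it assumes each signal has already been reduced to a one-dimensional feature per frame. The names `histogram_mi` and `associate_speaker`, the bin count, and the synthetic data are all hypothetical.

```python
import numpy as np

def histogram_mi(x, y, bins=8):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probability table
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def associate_speaker(audio_feat, video_feats, bins=8):
    """Return the index of the video feature sharing the most MI with the audio."""
    scores = [histogram_mi(audio_feat, v, bins) for v in video_feats]
    return int(np.argmax(scores)), scores

# Synthetic example (assumed data): face_a moves with the audio, face_b does not.
rng = np.random.default_rng(0)
n = 2000
audio = rng.normal(size=n)
face_a = audio + 0.5 * rng.normal(size=n)
face_b = rng.normal(size=n)
best, scores = associate_speaker(audio, [face_a, face_b])
```

Here `best` indexes the candidate whose feature shares the most information with the audio. The unrelated candidate receives a score near zero, which is the sense in which errant motion is discounted.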
TL;DR: This paper gives an overview of the most common audio and visual speech front-end processing, the transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech.
TL;DR: In this article, a semi-coupled approach is proposed to propagate the attitude and orbital motion of objects with high area-to-mass ratios in near geostationary orbits.
TL;DR: It is shown that the localization problem can be recast as the task of clustering the audio-visual observations into coherent groups and a probabilistic generative model is proposed that captures the relations between audio and visual observations.
TL;DR: This paper proposes a simple yet effective algorithm that detects and localizes synchronous audio-video sources in real time using single-camera, single-microphone data, and obtains the best speaker localization performance reported to date on the popular CUAVE database.
TL;DR: Canonical correlation analysis is applied to characterize yeast stress by finding what several different stress conditions have in common; the restriction of normality is removed by non-parametric estimation, and the problem of finding dependent components is formulated with a connection to Bayes factors.
TL;DR: The author examines the role of entropy, inequality, and randomness in the design of codes and the construction of codes in the rapidly changing environment.
TL;DR: In this paper, the problem of the estimation of a probability density function and of determining the mode of the probability function is discussed. Only estimates which are consistent and asymptotically normal are constructed.
TL;DR: The synthesis of a new category of spatial filters that produces sharp output correlation peaks with controlled peak values is considered, and these filters are referred to as minimum average correlation energy filters.
Q1. What have the authors contributed in "Speaker association with signal-level audiovisual fusion"?
In this paper, a probabilistic multimodal generation model is introduced and used to derive an information theoretic measure of cross-modal correspondence.
Q2. What is the criterion for a prewhitening filter?
The computation can be decomposed into three stages: 1) prewhiten the images once (using the average spectrum of the images), followed by iterations of 2) updating the feature values using (14), and 3) solving for the projection coefficients using least squares and the penalty.
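Stage 1 can be sketched as follows. The excerpt does not give the exact filter, so this is an assumption: each frame's 2-D spectrum is divided by the square root of the power spectrum averaged over all frames, which flattens the average spectrum. The function name `prewhiten_frames` and the small constant `eps` are hypothetical, and stages 2 and 3 are not shown because equation (14) is not available here.

```python
import numpy as np

def prewhiten_frames(frames, eps=1e-8):
    """Whiten a stack of images (n_frames, height, width) with one shared filter.

    Assumption: the filter is 1 / sqrt(average power spectrum), computed once
    over the whole sequence and applied to every frame.
    """
    frames = np.asarray(frames, dtype=float)
    spectra = np.fft.fft2(frames, axes=(-2, -1))        # per-frame 2-D FFT
    avg_power = np.mean(np.abs(spectra) ** 2, axis=0)   # average spectrum of the images
    whitened = spectra / np.sqrt(avg_power + eps)       # eps guards against division by zero
    # Real input and a symmetric filter give a real result up to rounding error.
    return np.real(np.fft.ifft2(whitened, axes=(-2, -1)))

# Synthetic frames with strong low-frequency content along one axis (assumed data).
rng = np.random.default_rng(0)
frames = np.cumsum(rng.normal(size=(20, 16, 16)), axis=2)
white = prewhiten_frames(frames)

# After whitening, the average power spectrum should be approximately flat.
flat = np.mean(np.abs(np.fft.fft2(white, axes=(-2, -1))) ** 2, axis=0)
```

Because the filter is computed once from the whole sequence, it stays fixed during the later iterations, which matches the "once" in the description above.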
Q3. How can nonparametric statistical density models be used to represent complex joint densities of projected signals?
Nonparametric statistical density models can be used to represent complex joint densities of projected signals, and to successfully estimate mutual information.
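One way to make this concrete is a Parzen-window (Gaussian kernel) density estimate, from which entropies and then mutual information can be approximated. This is a minimal sketch under stated assumptions, not the authors' estimator: the fixed bandwidth, the leave-one-out averaging, and the function names `parzen_entropy` and `parzen_mi` are all illustrative choices.

```python
import numpy as np

def parzen_entropy(samples, bandwidth=0.3):
    """Leave-one-out Parzen estimate of differential entropy (in nats)."""
    x = np.asarray(samples, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                       # treat a 1-D input as (n, 1)
    n, d = x.shape
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dist / (2 * bandwidth ** 2))
    kernel /= (2 * np.pi * bandwidth ** 2) ** (d / 2)   # Gaussian normalization
    np.fill_diagonal(kernel, 0.0)            # leave each sample out of its own estimate
    density = kernel.sum(axis=1) / (n - 1)
    return float(-np.mean(np.log(density)))

def parzen_mi(x, y, bandwidth=0.3):
    """MI estimate as h(x) + h(y) - h(x, y), each term from a Parzen density."""
    joint = np.column_stack([x, y])
    return (parzen_entropy(x, bandwidth) + parzen_entropy(y, bandwidth)
            - parzen_entropy(joint, bandwidth))

# Synthetic projected signals (assumed data): one dependent pair, one independent pair.
rng = np.random.default_rng(1)
n = 400
u = rng.normal(size=n)
mi_dependent = parzen_mi(u, u + 0.5 * rng.normal(size=n))
mi_independent = parzen_mi(u, rng.normal(size=n))
```

The dependent pair should score clearly higher than the independent pair. The estimate is sensitive to the bandwidth, so in practice that parameter would need to be chosen with care.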
Q4. How can the authors learn the relationship between audio and video?
Using principles from information theory and nonparametric statistics, the authors show how an approach for learning maximally informative joint subspaces can find cross-modal correspondences.
Q5. What is the adaptation criterion for the projections?
The adaptation criterion, which the authors maximize in practice, is then a combination of the approximation to MI (11) and the regularization terms (17), where the last term derives from the output energy constraint and involves the average autocorrelation function (taken over all images in the sequence).
Q6. What is the way to estimate the mutual information of continuous random variables?
Mutual information for continuous random variables can be expressed in several ways as a combination of differential entropy terms [14] (10). Mutual information indicates the amount of information that one random variable conveys on average about another.
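For reference, the standard textbook identities relating mutual information to differential entropy are given below. The equation numbered (10) is not reproduced in this excerpt, so the notation here ($X$, $Y$, $h$, $p$) is an assumption and may differ from the paper's.

```latex
\begin{align}
I(X;Y) &= h(X) + h(Y) - h(X,Y) \\
       &= h(X) - h(X \mid Y) \\
       &= h(Y) - h(Y \mid X),
\qquad \text{where } h(X) = -\int p(x)\,\log p(x)\,dx .
\end{align}
```

The first form is the most convenient when densities are estimated from samples, since it needs only the two marginal entropies and the joint entropy.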