
Showing papers on "Gaussian process published in 2010"


Journal ArticleDOI
TL;DR: A probabilistic method, called the Coherent Point Drift (CPD) algorithm, is introduced for both rigid and nonrigid point set registration, together with a fast algorithm that reduces the method's computational complexity to linear.
Abstract: Point set registration is a key component in many computer vision tasks. The goal of point set registration is to assign correspondences between two sets of points and to recover the transformation that maps one point set to the other. Multiple factors, including an unknown nonrigid spatial transformation, the large dimensionality of the point sets, noise, and outliers, make point set registration a challenging problem. We introduce a probabilistic method, called the Coherent Point Drift (CPD) algorithm, for both rigid and nonrigid point set registration. We consider the alignment of two point sets as a probability density estimation problem. We fit the Gaussian mixture model (GMM) centroids (representing the first point set) to the data (the second point set) by maximizing the likelihood. We force the GMM centroids to move coherently as a group to preserve the topological structure of the point sets. In the rigid case, we impose the coherence constraint by reparameterization of GMM centroid locations with rigid parameters and derive a closed form solution of the maximization step of the EM algorithm in arbitrary dimensions. In the nonrigid case, we impose the coherence constraint by regularizing the displacement field and using the variational calculus to derive the optimal transformation. We also introduce a fast algorithm that reduces the method's computational complexity to linear. We test the CPD algorithm for both rigid and nonrigid transformations in the presence of noise, outliers, and missing points, where CPD shows accurate results and outperforms current state-of-the-art methods.
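
To make the rigid case concrete, here is a minimal NumPy sketch of the EM loop described above: the E-step computes GMM posteriors with a uniform outlier component, and the M-step solves a weighted Procrustes problem in closed form via an SVD. This is an illustrative reading of the method, not the authors' reference implementation; the outlier weight w, iteration count, and initialization are assumptions, and scaling, the nonrigid case, and the fast linear-time variant are omitted.

```python
import numpy as np

def cpd_rigid(X, Y, w=0.1, n_iter=50):
    """Align point set Y (GMM centroids) to X (data). Returns R, t."""
    N, D = X.shape
    M, _ = Y.shape
    R, t = np.eye(D), np.zeros(D)
    # Initialize sigma^2 from the spread of the two sets.
    sigma2 = np.sum((X[None, :, :] - Y[:, None, :]) ** 2) / (D * M * N)
    for _ in range(n_iter):
        # E-step: posterior probability that centroid m generated point n,
        # with a uniform outlier component weighted by w.
        TY = Y @ R.T + t
        d2 = np.sum((X[None, :, :] - TY[:, None, :]) ** 2, axis=2)   # (M, N)
        num = np.exp(-d2 / (2 * sigma2))
        c = (2 * np.pi * sigma2) ** (D / 2) * w / (1 - w) * M / N
        P = num / (num.sum(axis=0, keepdims=True) + c)
        # M-step: closed-form weighted Procrustes (rotation + translation).
        Np = P.sum()
        mu_x = X.T @ P.sum(axis=0) / Np
        mu_y = Y.T @ P.sum(axis=1) / Np
        Xh, Yh = X - mu_x, Y - mu_y
        A = Xh.T @ P.T @ Yh
        U, _, Vt = np.linalg.svd(A)
        C = np.eye(D); C[-1, -1] = np.linalg.det(U @ Vt)   # avoid reflections
        R = U @ C @ Vt
        t = mu_x - R @ mu_y
        TY = Y @ R.T + t
        sigma2 = (P * np.sum((X[None] - TY[:, None]) ** 2, axis=2)).sum() / (Np * D)
        sigma2 = max(sigma2, 1e-10)
    return R, t

# Toy usage: recover a known rotation and translation.
rng = np.random.default_rng(0)
Y = rng.normal(size=(60, 2))
ang = 0.4
R_true = np.array([[np.cos(ang), -np.sin(ang)], [np.sin(ang), np.cos(ang)]])
X = Y @ R_true.T + np.array([1.0, -0.5]) + 0.01 * rng.normal(size=(60, 2))
R_est, t_est = cpd_rigid(X, Y)
```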

2,429 citations


Proceedings Article
21 Jun 2010
TL;DR: This work analyzes GP-UCB, an intuitive upper-confidence based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
Abstract: Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristic GP optimization approaches.
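
As an illustration of the selection rule, a minimal sketch of GP-UCB on a 1-D grid, assuming an RBF kernel and a simplified beta_t schedule (the paper derives the principled beta_t choices behind the regret bounds; the objective and constants here are invented for the demo):

```python
import numpy as np

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(Xtr, ytr, Xte, noise=1e-2):
    K = rbf(Xtr, Xtr) + noise * np.eye(len(Xtr))
    Ks = rbf(Xtr, Xte)
    mu = Ks.T @ np.linalg.solve(K, ytr)
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)        # prior k(x, x) = 1
    return mu, np.maximum(var, 1e-12)

f = lambda x: -np.sin(3 * x) - x ** 2 + 0.7 * x   # unknown objective
grid = np.linspace(-1, 2, 300)
X, y = [0.5], [f(0.5) + 0.01 * np.random.randn()]
for t in range(1, 31):
    # Illustrative beta_t of the form 2 log(|D| t^2 pi^2 / (6 delta)).
    beta = 2.0 * np.log(len(grid) * t ** 2 * np.pi ** 2 / 0.6)
    mu, var = gp_posterior(np.array(X), np.array(y), grid)
    x_next = grid[np.argmax(mu + np.sqrt(beta * var))]   # UCB rule
    X.append(x_next); y.append(f(x_next) + 0.01 * np.random.randn())
mu, var = gp_posterior(np.array(X), np.array(y), grid)
print("arg max of posterior mean:", grid[np.argmax(mu)])
```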

1,876 citations


Journal ArticleDOI
TL;DR: The GPML toolbox provides a wide range of functionality for Gaussian process (GP) inference and prediction, including exact and variational inference, Expectation Propagation, and Laplace's method for dealing with non-Gaussian likelihoods, as well as FITC for large regression tasks.
Abstract: The GPML toolbox provides a wide range of functionality for Gaussian process (GP) inference and prediction. GPs are specified by mean and covariance functions; we offer a library of simple mean and covariance functions and mechanisms to compose more complex ones. Several likelihood functions are supported, including Gaussian and heavy-tailed for regression as well as others suitable for classification. Finally, a range of inference methods is provided, including exact and variational inference, Expectation Propagation, and Laplace's method for dealing with non-Gaussian likelihoods, as well as FITC for dealing with large regression tasks.
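
The toolbox itself targets MATLAB/Octave, so its API is not reproduced here; instead, a NumPy sketch of the standard Cholesky-based computation that exact GP regression performs (predictive mean and variance plus the log marginal likelihood used for hyperparameter learning):

```python
import numpy as np

def gp_exact(X, y, Xs, k, sn2):
    """Exact GP regression with covariance function k and noise variance sn2."""
    K = k(X, X) + sn2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = k(X, Xs)
    mu = Ks.T @ alpha                                    # predictive mean
    v = np.linalg.solve(L, Ks)
    var = np.diag(k(Xs, Xs)) - np.sum(v * v, axis=0)     # predictive variance
    lml = (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
           - 0.5 * len(X) * np.log(2 * np.pi))           # log marginal likelihood
    return mu, var, lml

# Usage with a squared-exponential covariance (illustrative data):
se = lambda A, B: np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2)
X = np.linspace(0, 5, 20); y = np.sin(X) + 0.1 * np.random.randn(20)
mu, var, lml = gp_exact(X, y, np.linspace(0, 5, 100), se, 0.01)
```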

924 citations


Journal ArticleDOI
TL;DR: A new kernel-based approach for linear system identification of stable systems that models the impulse response as the realization of a Gaussian process whose statistics include information not only on smoothness but also on BIBO stability.

469 citations


Journal ArticleDOI
TL;DR: In this paper, a two-state mixture Gaussian model is used to perform asymptotically optimal Bayesian inference using belief propagation decoding, which represents the CS encoding matrix as a graphical model.
Abstract: Compressive sensing (CS) is an emerging field based on the revelation that a small collection of linear projections of a sparse signal contains enough information for stable, sub-Nyquist signal acquisition. When a statistical characterization of the signal is available, Bayesian inference can complement conventional CS methods based on linear programming or greedy algorithms. We perform asymptotically optimal Bayesian inference using belief propagation (BP) decoding, which represents the CS encoding matrix as a graphical model. Fast computation is obtained by reducing the size of the graphical model with sparse encoding matrices. To decode a length-N signal containing K large coefficients, our CS-BP decoding algorithm uses O(K log(N)) measurements and O(N log^2(N)) computation. Finally, although we focus on a two-state mixture Gaussian model, CS-BP is easily adapted to other signal models.

468 citations


Journal ArticleDOI
TL;DR: The achievable trade-offs between predictive accuracy and computational requirements are compared, and it is shown that these are typically superior to existing state-of-the-art sparse approximations.
Abstract: We present a new sparse Gaussian Process (GP) model for regression. The key novel idea is to sparsify the spectral representation of the GP. This leads to a simple, practical algorithm for regression tasks. We compare the achievable trade-offs between predictive accuracy and computational requirements, and show that these are typically superior to existing state-of-the-art sparse approximations. We discuss both the weight space and function space representations, and note that the new construction implies priors over functions which are always stationary, and can approximate any covariance function in this class.
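
A minimal sketch of the idea for an RBF-type kernel, assuming fixed rather than optimized spectral points (the paper also learns them as hyperparameters): sample frequencies from the kernel's spectral density, build trigonometric features, and do Bayesian linear regression in that feature space at O(n m^2) cost for m spectral points.

```python
import numpy as np

def ssgp_fit(X, y, m=50, ls=1.0, sf2=1.0, sn2=0.01, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(0, 1.0 / ls, size=m)            # spectral frequencies (RBF density)
    def phi(x):                                    # (len(x), 2m) trigonometric features
        a = np.outer(x, W)
        return np.hstack([np.cos(a), np.sin(a)]) * np.sqrt(sf2 / m)
    P = phi(X)
    A = P.T @ P + sn2 * np.eye(2 * m)              # posterior precision (up to 1/sn2)
    w_mean = np.linalg.solve(A, P.T @ y)
    def predict(xs):
        Ps = phi(xs)
        mu = Ps @ w_mean
        # Predictive variance, including observation noise.
        var = sn2 * np.sum(Ps * np.linalg.solve(A, Ps.T).T, axis=1) + sn2
        return mu, var
    return predict

# Illustrative usage on synthetic data:
X = np.linspace(0, 10, 200); y = np.sin(X) + 0.1 * np.random.randn(200)
predict = ssgp_fit(X, y)
mu, var = predict(np.linspace(0, 10, 50))
```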

463 citations


Journal ArticleDOI
TL;DR: NDLP can robustly estimate an inverse system for late reverberation in the presence of noise without greatly distorting a direct speech signal and can be implemented in a computationally efficient manner in the time-frequency domain.
Abstract: This paper proposes a statistical model-based speech dereverberation approach that can cancel the late reverberation of a reverberant speech signal captured by distant microphones without prior knowledge of the room impulse responses. With this approach, the generative model of the captured signal is composed of a source process, which is assumed to be a Gaussian process with a time-varying variance, and an observation process modeled by a delayed linear prediction (DLP). The optimization objective for the dereverberation problem is derived to be the sum of the squared prediction errors normalized by the source variances; hence, this approach is referred to as variance-normalized delayed linear prediction (NDLP). Inheriting the characteristic of DLP, NDLP can robustly estimate an inverse system for late reverberation in the presence of noise without greatly distorting a direct speech signal. In addition, owing to the use of variance normalization, NDLP allows us to improve the dereverberation result especially with relatively short (of the order of a few seconds) observations. Furthermore, NDLP can be implemented in a computationally efficient manner in the time-frequency domain. Experimental results demonstrate the effectiveness and efficiency of the proposed approach in comparison with two existing approaches.
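
To make the iteration concrete, a minimal sketch for one frequency bin and one microphone: alternate between estimating the time-varying source variances from the current dereverberated signal and solving the variance-normalized normal equations for the delayed prediction filter. The delay D, filter order K, and iteration count are illustrative assumptions, and the diagonal loading is only a numerical safeguard.

```python
import numpy as np

def ndlp_bin(x, D=3, K=10, n_iter=5, eps=1e-8):
    """x: complex STFT sequence of one frequency bin. Returns dereverberated d."""
    T = len(x)
    # Delayed observation matrix: row t holds x[t-D], ..., x[t-D-K+1].
    Xbar = np.zeros((T, K), dtype=complex)
    for k in range(K):
        Xbar[D + k:, k] = x[:T - D - k]
    d = x.copy()                                  # current dereverberated estimate
    for _ in range(n_iter):
        lam = np.maximum(np.abs(d) ** 2, eps)     # source variance estimates
        R = (Xbar.conj().T / lam) @ Xbar          # variance-normalized correlations
        r = (Xbar.conj().T / lam) @ x
        g = np.linalg.solve(R + eps * np.eye(K), r)   # prediction coefficients
        d = x - Xbar @ g                          # subtract predicted late reverberation
    return d
```

In practice the same update would be run independently in every frequency bin of the STFT, which is what makes the time-frequency implementation efficient.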

371 citations


Book ChapterDOI
01 Jan 2010
TL;DR: This work investigates a multi-points optimization criterion, the multi-points expected improvement (q-EI), aimed at choosing several points at the same time, proposes two classes of heuristic strategies meant to approximately optimize the q-EI, and applies them to the classical Branin-Hoo test-case function.
Abstract: The optimization of expensive-to-evaluate functions generally relies on metamodel-based exploration strategies. Many deterministic global optimization algorithms used in the field of computer experiments are based on Kriging (Gaussian process regression). Starting with a spatial predictor including a measure of uncertainty, they proceed by iteratively choosing the point maximizing a criterion which is a compromise between predicted performance and uncertainty. Distributing the evaluation of such numerically expensive objective functions on many processors is an appealing idea. Here we investigate a multi-points optimization criterion, the multi-points expected improvement (q-EI), aimed at choosing several points at the same time. An analytical expression of the q-EI is given when q = 2, and a consistent statistical estimate is given for the general case. We then propose two classes of heuristic strategies meant to approximately optimize the q-EI, and apply them to the classical Branin-Hoo test-case function. Finally, we demonstrate on this example that the latter strategies perform as well as the best Latin Hypercube and uniform designs ever found by simulation (2000 designs drawn at random for every q ∈ [1,10]).
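
Beyond the analytical q = 2 case, a consistent estimate of the criterion is easy to sketch by Monte Carlo: draw joint posterior samples of the GP at the q candidate points and average the improvement over the current best value (minimization convention; the candidate posterior below is invented for the demo):

```python
import numpy as np

def q_ei(mu, Sigma, f_min, n_samples=10000, seed=0):
    """mu, Sigma: GP posterior mean/covariance at the q candidate points."""
    rng = np.random.default_rng(seed)
    samples = rng.multivariate_normal(mu, Sigma, size=n_samples)   # (n, q)
    improvement = np.maximum(f_min - samples.min(axis=1), 0.0)
    return improvement.mean()

# Two candidate points with correlated predictions (illustrative values):
mu = np.array([0.8, 1.0])
Sigma = np.array([[0.25, 0.15], [0.15, 0.25]])
print(q_ei(mu, Sigma, f_min=1.2))
```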

364 citations


Proceedings Article
31 Mar 2010
TL;DR: In this article, a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction is introduced, which can automatically select the dimensionality of the nonlinear latent space.
Abstract: We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.

338 citations


Journal ArticleDOI
TL;DR: Twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GPs modeled as normal distributions over finite index sets of training and testing examples, is described.
Abstract: We describe twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by minimizing the Kullback-Leibler divergence between two GPs modeled as normal distributions over finite index sets of training and testing examples, emphasizing the goal that similar inputs should produce similar percepts and that this should hold, on average, between their marginal distributions. TGP captures not only the interdependencies between covariates, as in a typical GP, but also those between responses, so correlations among both inputs and outputs are accounted for. TGP is exemplified, with promising results, for the reconstruction of 3d human poses from monocular and multicamera video sequences in the recently introduced HumanEva benchmark, where we achieve 5 cm error on average per 3d marker for models trained jointly, using data from multiple people and multiple activities. The method is fast and automatic: it requires no hand-crafting of the initial pose, camera calibration parameters, or the availability of a 3d body model associated with human subjects used for training or testing.

303 citations


Journal ArticleDOI
TL;DR: In this article, a probabilistic approach for statistical modeling of the loads in distribution networks is presented, where the probability density functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution.
Abstract: This paper presents a probabilistic approach for statistical modeling of the loads in distribution networks. In a distribution network, the probability density functions (pdfs) of loads at different buses show a number of variations and cannot be represented by any specific distribution. The approach presented in this paper represents all the load pdfs through Gaussian mixture model (GMM). The expectation maximization (EM) algorithm is used to obtain the parameters of the mixture components. The performance of the method is demonstrated on a 95-bus generic distribution network model.
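
A minimal sketch of the modeling step using scikit-learn's EM-based GaussianMixture; the synthetic bimodal load history and the component count stand in for real bus data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for one bus's load history: a bimodal (day/night) pattern.
load = np.concatenate([rng.normal(40, 5, 600), rng.normal(90, 12, 400)])

# EM fits the mixture weights, means, and variances of the load pdf.
gmm = GaussianMixture(n_components=3, random_state=0).fit(load.reshape(-1, 1))
print("weights:", gmm.weights_)
print("means:", gmm.means_.ravel())
print("std devs:", np.sqrt(gmm.covariances_).ravel())
```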

Journal ArticleDOI
TL;DR: Both the subject-to-subject heterogeneity and covariate information can be incorporated into the model in a natural way and the bootstrap is used to assess the variability of the maximum likelihood estimators.
Abstract: This paper studies the maximum likelihood estimation of a class of inverse Gaussian process models for degradation data. Both the subject-to-subject heterogeneity and covariate information can be incorporated into the model in a natural way. The EM algorithm is used to obtain the maximum likelihood estimators of the unknown parameters, and the bootstrap is used to assess the variability of the maximum likelihood estimators. Simulations are used to validate the method. The model is fitted to laser data and corresponding goodness-of-fit tests are carried out. Failure time distributions in terms of degradation level passages are calculated and illustrated. The supplemental materials for this article are available online.

Journal ArticleDOI
TL;DR: This work describes an extension of BP to continuous variable models, generalizing particle filtering and Gaussian mixture filtering techniques for time series to more complex models, and illustrates the power of the resulting nonparametric BP algorithm via two applications: kinematic tracking of visual motion and distributed localization in sensor networks.
Abstract: Continuous quantities are ubiquitous in models of real-world phenomena, but are surprisingly difficult to reason about automatically. Probabilistic graphical models such as Bayesian networks and Markov random fields, and algorithms for approximate inference such as belief propagation (BP), have proven to be powerful tools in a wide range of applications in statistics and artificial intelligence. However, applying these methods to models with continuous variables remains a challenging task. In this work we describe an extension of BP to continuous variable models, generalizing particle filtering and Gaussian mixture filtering techniques for time series to more complex models. We illustrate the power of the resulting nonparametric BP algorithm via two applications: kinematic tracking of visual motion and distributed localization in sensor networks.

Proceedings ArticleDOI
07 Oct 2010
TL;DR: This paper shows how temporal Gaussian process regression models in machine learning can be reformulated as linear-Gaussian state space models, which can be solved exactly with classical Kalman filtering theory, and produces an efficient non-parametric learning algorithm.
Abstract: In this paper, we show how temporal (i.e., time-series) Gaussian process regression models in machine learning can be reformulated as linear-Gaussian state space models, which can be solved exactly with classical Kalman filtering theory. The result is an efficient non-parametric learning algorithm, whose computational complexity grows linearly with respect to the number of observations. We show how the reformulation can be done analytically for the Matérn family of covariance functions, and for the squared exponential covariance function by applying a spectral Taylor series approximation. Advantages of the proposed approach are illustrated with two numerical experiments.
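
For the Matérn-3/2 member of that family, the reformulation can be written down directly: the GP with k(t,t') = s2 (1 + lam|t-t'|) exp(-lam|t-t'|), lam = sqrt(3)/lengthscale, is the first component of a two-dimensional linear-Gaussian state space model, so regression reduces to an O(T) Kalman filter over the sorted inputs. A sketch with illustrative hyperparameters:

```python
import numpy as np
from scipy.linalg import expm

def matern32_kf(t, y, lengthscale=1.0, s2=1.0, noise=0.1):
    lam = np.sqrt(3.0) / lengthscale
    F = np.array([[0.0, 1.0], [-lam ** 2, -2 * lam]])   # SDE drift matrix
    Pinf = np.diag([s2, s2 * lam ** 2])                 # stationary state covariance
    H = np.array([[1.0, 0.0]])                          # observe the first component
    m, P = np.zeros(2), Pinf.copy()
    ms = []
    for k in range(len(t)):
        if k > 0:                                       # predict across the time gap
            A = expm(F * (t[k] - t[k - 1]))
            m = A @ m
            P = A @ P @ A.T + Pinf - A @ Pinf @ A.T     # exact discrete process noise
        S = H @ P @ H.T + noise                         # update with observation y[k]
        K = P @ H.T / S
        m = m + (K * (y[k] - H @ m)).ravel()
        P = P - K @ H @ P
        ms.append(m[0])
    return np.array(ms)                                 # filtered mean of f at t

# Illustrative usage:
t = np.sort(np.random.rand(200) * 10)
y = np.sin(t) + 0.3 * np.random.randn(200)
fhat = matern32_kf(t, y)
```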

Journal ArticleDOI
TL;DR: A generalization of the mercury/waterfilling algorithm, previously proposed for parallel noninterfering channels, is put forth, in which the mercury level accounts not only for the non-Gaussian input distributions, but also for the interference among inputs.
Abstract: In this paper, we investigate the linear precoding and power allocation policies that maximize the mutual information for general multiple-input-multiple-output (MIMO) Gaussian channels with arbitrary input distributions, by capitalizing on the relationship between mutual information and minimum mean-square error (MMSE). The optimal linear precoder satisfies a fixed-point equation as a function of the channel and the input constellation. For non-Gaussian inputs, a nondiagonal precoding matrix in general increases the information transmission rate, even for parallel noninteracting channels. Whenever precoding is precluded, the optimal power allocation policy also satisfies a fixed-point equation; we put forth a generalization of the mercury/waterfilling algorithm, previously proposed for parallel noninterfering channels, in which the mercury level accounts not only for the non-Gaussian input distributions, but also for the interference among inputs.
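
For orientation, a sketch of classical waterfilling, the Gaussian-input special case that mercury/waterfilling generalizes; the constellation-dependent "mercury" levels are not implemented here:

```python
import numpy as np

def waterfill(noise, P, tol=1e-10):
    """Allocate total power P over parallel channels with noise levels `noise`."""
    lo, hi = 0.0, max(noise) + P
    while hi - lo > tol:                  # bisect on the common water level
        level = 0.5 * (lo + hi)
        used = np.sum(np.maximum(level - noise, 0.0))
        lo, hi = (level, hi) if used < P else (lo, level)
    return np.maximum(level - noise, 0.0)

noise = np.array([0.5, 1.0, 2.0, 4.0])
p = waterfill(noise, P=3.0)
print(p, p.sum())    # per-channel power allocation, sums to ~3.0
```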

Journal ArticleDOI
TL;DR: In this article, the authors studied the number of measurements required to recover a sparse signal in C^M with L nonzero coefficients from compressed samples in the presence of noise, and proved that O(L) measurements are necessary and sufficient for signal recovery whenever L grows linearly as a function of M. In contrast, the implementation of their proof method would have a higher complexity.
Abstract: In this paper, we study the number of measurements required to recover a sparse signal in C^M with L nonzero coefficients from compressed samples in the presence of noise. We consider a number of different recovery criteria, including the exact recovery of the support of the signal, which was previously considered in the literature, as well as new criteria for the recovery of a large fraction of the support of the signal, and the recovery of a large fraction of the energy of the signal. For these recovery criteria, we prove that O(L) (an asymptotically linear multiple of L) measurements are necessary and sufficient for signal recovery, whenever L grows linearly as a function of M. This improves on the existing literature that is mostly focused on variants of a specific recovery algorithm based on convex programming, for which O(L log(M - L)) measurements are required. In contrast, the implementation of our proof method would have a higher complexity. We also show that O(L log(M - L)) measurements are required in the sublinear regime (L = o(M)). For our sufficiency proofs, we introduce a Shannon-theoretic decoder based on joint typicality, which allows error events to be defined in terms of a single random variable, in contrast to previous information-theoretic work, where comparisons of random variables are required. We also prove concentration results for our error bounds, implying that a randomly selected Gaussian matrix will suffice with high probability. For our necessity proofs, we rely on results from channel coding and rate-distortion theory.

Proceedings ArticleDOI
13 Jun 2010
TL;DR: This paper proposes a novel approach to age estimation by formulating the problem as multi-task learning, using a multi-task warped Gaussian process (MTWGP), and shows that MTWGP compares favorably with state-of-the-art age estimation methods.
Abstract: Automatic age estimation from facial images has attracted growing research interest in recent years due to its promising potential for computer vision applications. Among the methods proposed to date, personalized age estimation methods generally outperform global age estimation methods by learning a separate age estimator for each person in the training data set. However, since typical age databases only contain very limited training data for each person, training a separate age estimator using only training data for that person runs a high risk of overfitting the data, and hence the prediction performance is limited. In this paper, we propose a novel approach to age estimation by formulating the problem as a multi-task learning problem. Based on a variant of the Gaussian process (GP) called warped Gaussian process (WGP), we propose a multi-task extension called multi-task warped Gaussian process (MTWGP). Age estimation is formulated as a multi-task regression problem in which each learning task refers to estimation of the age function for each person. While MTWGP models common features shared by different tasks (persons), it also allows task-specific (person-specific) features to be learned automatically. Moreover, unlike previous age estimation methods which need to specify the form of the regression functions or determine many parameters in the functions using inefficient methods such as cross validation, the form of the regression functions in MTWGP is implicitly defined by the kernel function and all its model parameters can be learned from data automatically. We have conducted experiments on two publicly available age databases, FG-NET and MORPH. The experimental results are very promising in showing that MTWGP compares favorably with state-of-the-art age estimation methods.

01 Oct 2010
TL;DR: In this article, the authors consider the problem of fitting a parametric model to time-series data that are afflicted by correlated noise, represented by a sum of two stationary Gaussian processes: one that is uncorrelated in time and another that has a power spectral density varying as 1/f^γ.
Abstract: We consider the problem of fitting a parametric model to time-series data that are afflicted by correlated noise. The noise is represented by a sum of two stationary Gaussian processes: one that is uncorrelated in time, and another that has a power spectral density varying as 1/f^γ. We present an accurate and fast [O(N)] algorithm for parameter estimation based on computing the likelihood in a wavelet basis. The method is illustrated and tested using simulated time-series photometry of exoplanetary transits, with particular attention to estimating the mid-transit time. We compare our method to two other methods that have been used in the literature, the time-averaging method and the residual-permutation method. For noise processes that obey our assumptions, the algorithm presented here gives more accurate results for mid-transit times and truer estimates of their uncertainties.

Journal ArticleDOI
Emmanuel Vazquez, Julien Bect
TL;DR: The first result is that under some mild hypotheses on the covariance function k of the Gaussian process, the expected improvement algorithm produces a dense sequence of evaluation points in the search domain, when the function to be optimized is in the reproducing kernel Hilbert space generated by k.

Proceedings Article
06 Dec 2010
TL;DR: A slice sampling approach is presented for Gaussian process models whose covariance structure is specified using unknown hyperparameters; it requires little tuning while mixing well in both strong- and weak-data regimes.
Abstract: The Gaussian process (GP) is a popular way to specify dependencies between random variables in a probabilistic model. In the Bayesian framework the covariance structure can be specified using unknown hyperparameters. Integrating over these hyperparameters considers different possible explanations for the data when making predictions. This integration is often performed using Markov chain Monte Carlo (MCMC) sampling. However, with non-Gaussian observations standard hyperparameter sampling approaches require careful tuning and may converge slowly. In this paper we present a slice sampling approach that requires little tuning while mixing well in both strong- and weak-data regimes.
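
For reference, a minimal sketch of univariate slice sampling with stepping-out (Neal, 2003), the kind of move applied to a GP hyperparameter such as a log-lengthscale; the paper's contribution is a specific variant that stays efficient when hyperparameters and latent function values are strongly coupled. The toy target below is a standard normal.

```python
import numpy as np

def slice_sample(log_post, x0, n=1000, w=1.0, seed=0):
    rng = np.random.default_rng(seed)
    xs, x = [], x0
    for _ in range(n):
        logy = log_post(x) + np.log(rng.uniform())   # slice height under the curve
        lo = x - w * rng.uniform()                   # randomly positioned window
        hi = lo + w
        while log_post(lo) > logy: lo -= w           # step out left
        while log_post(hi) > logy: hi += w           # step out right
        while True:                                  # shrink until a point is accepted
            xp = rng.uniform(lo, hi)
            if log_post(xp) > logy:
                x = xp; break
            lo, hi = (xp, hi) if xp < x else (lo, xp)
        xs.append(x)
    return np.array(xs)

samples = slice_sample(lambda v: -0.5 * v ** 2, x0=0.0)   # toy: standard normal
print(samples.mean(), samples.std())
```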

Journal ArticleDOI
TL;DR: This work shows that with an appropriate combination of kernels a significant boost in classification performance is possible, and indicates the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
Abstract: Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.

Book
22 Nov 2010
TL;DR: First, PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available, is introduced; second, principled algorithms for robust filtering and smoothing in GP dynamic systems are proposed.
Abstract: This book examines Gaussian processes in both model-based reinforcement learning (RL) and inference in nonlinear dynamic systems. First, we introduce PILCO, a fully Bayesian approach for efficient RL in continuous-valued state and action spaces when no expert knowledge is available. PILCO takes model uncertainties consistently into account during long-term planning to reduce model bias. Second, we propose principled algorithms for robust filtering and smoothing in GP dynamic systems.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: An acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure while the means and mixture weights vary in a subspace of the total parameter space; this style of acoustic model allows for a much more compact representation.
Abstract: We describe an acoustic modeling approach in which all phonetic states share a common Gaussian Mixture Model structure, and the means and mixture weights vary in a subspace of the total parameter space. We call this a Subspace Gaussian Mixture Model (SGMM). Globally shared parameters define the subspace. This style of acoustic model allows for a much more compact representation and gives better results than a conventional modeling approach, particularly with smaller amounts of training data.

Proceedings ArticleDOI
14 Mar 2010
TL;DR: This work reports experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages.
Abstract: Although research has previously been done on multilingual speech recognition, it has been found to be very difficult to improve over separately trained systems. The usual approach has been to use some kind of “universal phone set” that covers multiple languages. We report experiments on a different approach to multilingual speech recognition, in which the phone sets are entirely distinct but the model has parameters not tied to specific states that are shared across languages. We use a model called a “Subspace Gaussian Mixture Model” where states' distributions are Gaussian Mixture Models with a common structure, constrained to lie in a subspace of the total parameter space. The parameters that define this subspace can be shared across languages. We obtain substantial WER improvements with this approach, especially with very small amounts of in-language training data.

Journal ArticleDOI
29 Apr 2010
TL;DR: In this article, the authors compare and contrast from a geometric perspective a number of low-dimensional signal models that support stable information-preserving dimensionality reduction, including sparse and compressible signal models for deterministic and random signals.
Abstract: We compare and contrast from a geometric perspective a number of low-dimensional signal models that support stable information-preserving dimensionality reduction. We consider sparse and compressible signal models for deterministic and random signals, structured sparse and compressible signal models, point clouds, and manifold signal models. Each model has a particular geometrical structure that enables signal information to be stably preserved via a simple linear and nonadaptive projection to a much lower dimensional space; in each case the projection dimension is independent of the signal's ambient dimension at best or grows logarithmically with it at worst. As a bonus, we point out a common misconception related to probabilistic compressible signal models, namely, by showing that the oft-used generalized Gaussian and Laplacian models do not support stable linear dimensionality reduction.
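
The stable-embedding property is easy to probe numerically: a random Gaussian projection to m << N dimensions approximately preserves distances between K-sparse signals. A small demo with arbitrary dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, m = 1000, 10, 120                       # ambient dim, sparsity, projections
Phi = rng.normal(0, 1 / np.sqrt(m), (m, N))   # random Gaussian projection matrix

def sparse_signal():
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.normal(size=K)
    return x

ratios = []
for _ in range(500):
    d = sparse_signal() - sparse_signal()     # difference of two sparse signals
    ratios.append(np.linalg.norm(Phi @ d) / np.linalg.norm(d))
print(min(ratios), max(ratios))               # ratios concentrate near 1
```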

Proceedings ArticleDOI
26 Jul 2010
TL;DR: The main contribution of this paper is the implementation of a Probability Hypothesis Density filter for tracking of multiple extended targets, together with a method to easily partition the measurements into a number of subsets, each of which contains measurements that all stem from the same source.
Abstract: In extended target tracking, targets potentially produce more than one measurement per time step. Multiple extended targets are therefore usually hard to track, due to the resulting complex data association. The main contribution of this paper is the implementation of a Probability Hypothesis Density (PHD) filter for tracking of multiple extended targets. A general modification of the PHD filter to handle extended targets has been presented recently by Mahler, and the novelty in this work lies in the realisation of a Gaussian mixture PHD filter for extended targets. Furthermore, we propose a method to easily partition the measurements into a number of subsets, each of which is supposed to contain measurements that all stem from the same source. The method is illustrated in simulation examples, and the advantage of the implemented extended target PHD filter is shown in a comparison with a standard PHD filter.
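
As an illustrative reading of the partitioning step, assuming a distance-based criterion: group the measurements into connected components of the graph that links any two measurements closer than a threshold, and sweep the threshold to generate the alternative partitions the filter weighs. The function below and its threshold are a sketch, not necessarily the paper's exact procedure.

```python
import numpy as np

def distance_partition(Z, d_max):
    """Z: (n, dim) measurements. Returns a list of index arrays (cells)."""
    n = len(Z)
    labels = -np.ones(n, dtype=int)
    current = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack, labels[i] = [i], current
        while stack:                      # flood-fill one connected component
            j = stack.pop()
            d = np.linalg.norm(Z - Z[j], axis=1)
            for k in np.where((d < d_max) & (labels < 0))[0]:
                labels[k] = current
                stack.append(k)
        current += 1
    return [np.where(labels == c)[0] for c in range(current)]

Z = np.array([[0, 0], [0.4, 0.1], [5, 5], [5.3, 4.8], [9, 0]])
print(distance_partition(Z, d_max=1.0))   # three cells: {0,1}, {2,3}, {4}
```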

Journal ArticleDOI
TL;DR: The load probability density functions (pdfs) in a distribution network show a number of variations at different nodes and cannot be represented by any specific distribution, so this article models them with a Gaussian mixture and utilises them as pseudo-measurements for distribution system state estimation (DSSE).
Abstract: This study presents an approach to utilise the loads as pseudo-measurements for the purpose of distribution system state estimation (DSSE). The load probability density function (pdf) in the distribution network shows a number of variations at different nodes and cannot be represented by any specific distribution. The approach presented in this study represents all the load pdfs through the Gaussian mixture model (GMM). The expectation maximisation (EM) algorithm is used to obtain the parameters of the mixture components. The standard weighted least squares (WLS) algorithm utilises these load models as pseudo-measurements. The effectiveness of WLS is assessed through some statistical measures such as bias, consistency and quality of the estimates in a 95-bus generic distribution network model.

Journal ArticleDOI
TL;DR: This paper considers recursive tracking of one mobile emitter using a sequence of time difference of arrival (TDOA) and frequency difference of arrival (FDOA) measurement pairs obtained by one pair of sensors, which results in a better approximation of the track state probability density function by a Gaussian mixture, and tracking results near the Cramer-Rao lower bound.
Abstract: This paper considers recursive tracking of one mobile emitter using a sequence of time difference of arrival (TDOA) and frequency difference of arrival (FDOA) measurement pairs obtained by one pair of sensors. We consider only a single emitter without data association issues (no missed detections or false measurements). Each TDOA measurement defines a region of possible emitter locations around a unique hyperbola. This likelihood function is approximated by a Gaussian mixture, which leads to a dynamic bank of Kalman filters tracking algorithm. The FDOA measurements update the relative probabilities and estimates of the individual Kalman filters. This approach results in a better approximation of the track state probability density function by a Gaussian mixture, and in tracking results near the Cramer-Rao lower bound. The proposed algorithm is also applicable in other cases of nonlinear information fusion. The performance of the proposed Gaussian mixture approach is evaluated in a simulation study and compared with a bank of extended Kalman filters (EKFs) and the Cramer-Rao lower bound.

Journal ArticleDOI
TL;DR: This result shows that the celebrated Schalkwijk-Kailath coding achieves the feedback capacity for the first-order autoregressive moving-average Gaussian channel, positively answering a long-standing open problem studied by Butman, Tiernan-Schalkwijk, Wolfowitz, Ozarow, Ordentlich, Yang-Kavčić-Tatikonda, and others.
Abstract: The feedback capacity of additive stationary Gaussian noise channels is characterized as the solution to a variational problem in the noise power spectral density. When specialized to the first-order autoregressive moving-average noise spectrum, this variational characterization yields a closed-form expression for the feedback capacity. In particular, this result shows that the celebrated Schalkwijk-Kailath coding achieves the feedback capacity for the first-order autoregressive moving-average Gaussian channel, positively answering a long-standing open problem studied by Butman, Tiernan-Schalkwijk, Wolfowitz, Ozarow, Ordentlich, Yang-Kavčić-Tatikonda, and others. More generally, it is shown that a k-dimensional generalization of the Schalkwijk-Kailath coding achieves the feedback capacity for any autoregressive moving-average noise spectrum of order k. Simply put, the optimal transmitter iteratively refines the receiver's knowledge of the intended message. This development reveals intriguing connections between estimation, control, and feedback communication.

Proceedings ArticleDOI
12 Apr 2010
TL;DR: A natural metric is introduced between sets of sensors that can be used to construct covariance functions over sets, and thereby perform Gaussian process inference over a function whose domain is a power set.
Abstract: We consider the problem of selecting an optimal set of sensors, as determined, for example, by the predictive accuracy of the resulting sensor network. Given an underlying metric between pairs of set elements, we introduce a natural metric between sets of sensors for this task. Using this metric, we can construct covariance functions over sets, and thereby perform Gaussian process inference over a function whose domain is a power set. If the function has additional inputs, our covariances can be readily extended to incorporate them---allowing us to consider, for example, functions over both sets and time. These functions can then be optimized using Gaussian process global optimization (GPGO). We use the root mean squared error (RMSE) of the predictions made using a set of sensors at a particular time as an example of such a function to be optimized; the optimal point specifies the best choice of sensor locations. We demonstrate the resulting method by dynamically selecting the best subset of a given set of weather sensors for the prediction of the air temperature across the United Kingdom.