Showing papers on "Gaussian process published in 2011"

PDF

Open Access

Journal Article•DOI•

Robust Point Set Registration Using Gaussian Mixture Models

[...]

Bing Jian¹, Baba C. Vemuri²•Institutions (2)

01 Aug 2011-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a unified framework for the rigid and nonrigid point set registration problem in the presence of significant amounts of noise and outliers, and shows that the popular iterative closest point (ICP) method and several existing point setRegistration methods in the field are closely related and can be reinterpreted meaningfully in this general framework.

...read moreread less

Abstract: In this paper, we present a unified framework for the rigid and nonrigid point set registration problem in the presence of significant amounts of noise and outliers. The key idea of this registration framework is to represent the input point sets using Gaussian mixture models. Then, the problem of point set registration is reformulated as the problem of aligning two Gaussian mixtures such that a statistical discrepancy measure between the two corresponding mixtures is minimized. We show that the popular iterative closest point (ICP) method and several existing point set registration methods in the field are closely related and can be reinterpreted meaningfully in our general framework. Our instantiation of this general framework is based on the the L2 distance between two Gaussian mixtures, which has the closed-form expression and in turn leads to a computationally efficient registration algorithm. The resulting registration algorithm exhibits inherent statistical robustness, has an intuitive interpretation, and is simple to implement. We also provide theoretical and experimental comparisons with other robust methods for point set registration.

...read moreread less

909 citations

Posted Content•

Bayesian Active Learning for Classification and Preference Learning

[...]

Neil Houlsby, Ferenc Huszar, Zoubin Ghahramani, Máté Lengyel

24 Dec 2011-arXiv: Machine Learning

TL;DR: This work proposes an approach that expresses information gain in terms of predictive entropies, and applies this method to the Gaussian Process Classier (GPC), and makes minimal approximations to the full information theoretic objective.

...read moreread less

Abstract: Information theoretic active learning has been widely studied for probabilistic models. For simple regression an optimal myopic policy is easily tractable. However, for other tasks and with more complex models, such as classication with nonparametric models, the optimal solution is harder to compute. Current approaches make approximations to achieve tractability. We propose an approach that expresses information gain in terms of predictive entropies, and apply this method to the Gaussian Process Classier (GPC). Our approach makes minimal approximations to the full information theoretic objective. Our experimental performance compares favourably to many popular active learning algorithms, and has equal or lower computational complexity. We compare well to decision theoretic approaches also, which are privy to more information and require much more computational time. Secondly, by developing further a reformulation of binary preference learning to a classication problem, we extend our algorithm to Gaussian Process preference learning.

...read moreread less

578 citations

Proceedings Article•

Contextual Gaussian Process Bandit Optimization

[...]

Andreas Krause¹, Cheng Soon Ong¹•Institutions (1)

ETH Zurich¹

12 Dec 2011

TL;DR: This work model the payoff function as a sample from a Gaussian process defined over the joint context-action space, and develops CGP-UCB, an intuitive upper-confidence style algorithm that shows that context-sensitive optimization outperforms no or naive use of context.

...read moreread less

Abstract: How should we design experiments to maximize performance of a complex system, taking into account uncontrollable environmental conditions? How should we select relevant documents (ads) to display, given information about the user? These tasks can be formalized as contextual bandit problems, where at each round, we receive context (about the experimental conditions, the query), and have to choose an action (parameters, documents). The key challenge is to trade off exploration by gathering data for estimating the mean payoff function over the context-action space, and to exploit by choosing an action deemed optimal based on the gathered data. We model the payoff function as a sample from a Gaussian process defined over the joint context-action space, and develop CGP-UCB, an intuitive upper-confidence style algorithm. We show that by mixing and matching kernels for contexts and actions, CGP-UCB can handle a variety of practical applications. We further provide generic tools for deriving regret bounds when using such composite kernel functions. Lastly, we evaluate our algorithm on two case studies, in the context of automated vaccine design and sensor management. We show that context-sensitive optimization outperforms no or naive use of context.

...read moreread less

371 citations

Proceedings Article•

Nonlinear Inverse Reinforcement Learning with Gaussian Processes

[...]

Sergey Levine¹, Zoran Popović², Vladlen Koltun¹•Institutions (2)

Stanford University¹, University of Washington²

12 Dec 2011

TL;DR: A probabilistic algorithm that allows complex behaviors to be captured from suboptimal stochastic demonstrations, while automatically balancing the simplicity of the learned reward structure against its consistency with the observed actions.

...read moreread less

Abstract: We present a probabilistic algorithm for nonlinear inverse reinforcement learning. The goal of inverse reinforcement learning is to learn the reward function in a Markov decision process from expert demonstrations. While most prior inverse reinforcement learning algorithms represent the reward as a linear combination of a set of features, we use Gaussian processes to learn the reward as a nonlinear function, while also determining the relevance of each feature to the expert's policy. Our probabilistic algorithm allows complex behaviors to be captured from suboptimal stochastic demonstrations, while automatically balancing the simplicity of the learned reward structure against its consistency with the observed actions.

...read moreread less

336 citations

Proceedings Article•DOI•

Single image super-resolution using Gaussian process regression

[...]

He He¹, Wan-Chi Siu¹•Institutions (1)

Hong Kong Polytechnic University¹

20 Jun 2011

TL;DR: This paper proposes a framework for both magnification and deblurring using only the original low-resolution image and its blurred version, and shows that when using a proper covariance function, the Gaussian process regression can perform soft clustering of pixels based on their local structures.

...read moreread less

Abstract: In this paper we address the problem of producing a high-resolution image from a single low-resolution image without any external training set. We propose a framework for both magnification and deblurring using only the original low-resolution image and its blurred version. In our method, each pixel is predicted by its neighbors through the Gaussian process regression. We show that when using a proper covariance function, the Gaussian process regression can perform soft clustering of pixels based on their local structures. We further demonstrate that our algorithm can extract adequate information contained in a single low-resolution image to generate a high-resolution image with sharp edges, which is comparable to or even superior in quality to the performance of other edge-directed and example-based super-resolution algorithms. Experimental results also show that our approach maintains high-quality performance at large magnifications.

...read moreread less

292 citations

Journal Article•DOI•

Computationally Efficient Convolved Multiple Output Gaussian Processes

[...]

Mauricio A. Álvarez¹, Neil D. Lawrence²•Institutions (2)

Technological University of Pereira¹, University of Sheffield²

01 Feb 2011-Journal of Machine Learning Research

TL;DR: This paper presents different efficient approximations for dependent output Gaussian processes constructed through the convolution formalism, exploit the conditional independencies present naturally in the model and shows experimental results with synthetic and real data.

...read moreread less

Abstract: Recently there has been an increasing interest in regression methods that deal with multiple outputs. This has been motivated partly by frameworks like multitask learning, multisensor networks or structured output data. From a Gaussian processes perspective, the problem reduces to specifying an appropriate covariance function that, whilst being positive semi-definite, captures the dependencies between all the data points and across all the outputs. One approach to account for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform we establish dependencies between output variables. The main drawbacks of this approach are the associated computational and storage demands. In this paper we address these issues. We present different efficient approximations for dependent output Gaussian processes constructed through the convolution formalism. We exploit the conditional independencies present naturally in the model. This leads to a form of the covariance similar in spirit to the so called PITC and FITC approximations for a single output. We show experimental results with synthetic and real data, in particular, we show results in school exams score prediction, pollution prediction and gene expression data.

...read moreread less

274 citations

Journal Article•DOI•

Estimation in Gaussian Noise: Properties of the Minimum Mean-Square Error

[...]

Dongning Guo¹, Yihong Wu², Shlomo Shamai³, Sergio Verdu²•Institutions (3)

Northwestern University¹, Princeton University², Technion – Israel Institute of Technology³

01 Apr 2011-IEEE Transactions on Information Theory

TL;DR: It is shown that the minimum mean-square error (MMSE) of estimating an arbitrary random variable from its observation contaminated by Gaussian noise is found to be infinitely differentiable at all positive SNR, and in fact a real analytic function in SNR under mild conditions.

...read moreread less

Abstract: Consider the minimum mean-square error (MMSE) of estimating an arbitrary random variable from its observation contaminated by Gaussian noise. The MMSE can be regarded as a function of the signal-to-noise ratio (SNR) as well as a functional of the input distribution (of the random variable to be estimated). It is shown that the MMSE is concave in the input distribution at any given SNR. For a given input distribution, the MMSE is found to be infinitely differentiable at all positive SNR, and in fact a real analytic function in SNR under mild conditions. The key to these regularity results is that the posterior distribution conditioned on the observation through Gaussian channels always decays at least as quickly as some Gaussian density. Furthermore, simple expressions for the first three derivatives of the MMSE with respect to the SNR are obtained. It is also shown that, as functions of the SNR, the curves for the MMSE of a Gaussian input and that of a non-Gaussian input cross at most once over all SNRs. These properties lead to simple proofs of the facts that Gaussian inputs achieve both the secrecy capacity of scalar Gaussian wiretap channels and the capacity of scalar Gaussian broadcast channels, as well as a simple proof of the entropy power inequality in the special case where one of the variables is Gaussian.

...read moreread less

273 citations

Proceedings Article•

Additive Gaussian Processes

[...]

David Duvenaud¹, Hannes Nickisch², Carl Edward Rasmussen¹•Institutions (2)

University of Cambridge¹, Max Planck Society²

12 Dec 2011

TL;DR: In this article, a Gaussian process model of functions which are additive is introduced, where an additive function decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables.

...read moreread less

Abstract: We introduce a Gaussian process model of functions which are additive. An additive function is one which decomposes into a sum of low-dimensional functions, each depending on only a subset of the input variables. Additive GPs generalize both Generalized Additive Models, and the standard GP models which use squared-exponential kernels. Hyperparameter learning in this model can be seen as Bayesian Hierarchical Kernel Learning (HKL). We introduce an expressive but tractable parameterization of the kernel function, which allows efficient evaluation of all input interaction terms, whose number is exponential in the input dimension. The additional structure discoverable by this model results in increased interpretability, as well as state-of-the-art predictive power in regression tasks.

...read moreread less

268 citations

Book•

Gaussian Process Regression Analysis for Functional Data

[...]

Jian Qing Shi, Taeryon Choi

01 Jul 2011

TL;DR: This chapter discusses Gaussian Process Regression models, and some examples of Covariance Function and Model Selection, as well as some of the models used in this research.

...read moreread less

Abstract: Introduction Functional Regression Models Gaussian Process Regression Some Data Sets and Associated Statistical Problems Bayesian Nonlinear Regression with Gaussian Process Priors Gaussian Process Prior and Posterior Posterior Consistency Asymptotic Properties of the Gaussian Process Regression Models Inference and Computation for Gaussian Process Regression Model Empirical Bayes Estimates Bayesian Inference and MCMC Numerical Computation Covariance Function and Model Selection Examples of Covariance Functions Selection of Covariance Functions Variable Selection Functional Regression Analysis Linear Functional Regression Model Gaussian Process Functional Regression Model GPFR Model with a Linear Functional Mean Model Mixed-Effects GPFR Models GPFR ANOVA Model Mixture Models and Curve Clustering Mixture GPR Models Mixtures of GPFR Models Curve Clustering Generalized Gaussian Process Regression for Non-Gaussian Functional Data Gaussian Process Binary Regression Model Generalized Gaussian Process Regression Generalized GPFR Model for Batch Data Mixture Models for Multinomial Batch Data Some Other Related Models Multivariate Gaussian Process Regression Model Gaussian Process Latent Variable Models Optimal Dynamic Control Using GPR Model RKHS and Gaussian Process Regression Appendices Bibliography Index Further Reading and Notes appear at the end of each chapter.

...read moreread less

267 citations

Journal Article•DOI•

Strictly and non-strictly positive definite functions on spheres

[...]

Tilmann Gneiting

30 Nov 2011-arXiv: Probability

TL;DR: In this paper, characterizations of positive definite functions on spheres in terms of Gegenbauer expansions are reviewed and applied to dimension walks, where monotonicity properties of the Gegenstein coefficients guarantee positive definiteness in higher dimensions.

...read moreread less

Abstract: Isotropic positive definite functions on spheres play important roles in spatial statistics, where they occur as the correlation functions of homogeneous random fields and star-shaped random particles. In approximation theory, strictly positive definite functions serve as radial basis functions for interpolating scattered data on spherical domains. We review characterizations of positive definite functions on spheres in terms of Gegenbauer expansions and apply them to dimension walks, where monotonicity properties of the Gegenbauer coefficients guarantee positive definiteness in higher dimensions. Subject to a natural support condition, isotropic positive definite functions on the Euclidean space $\mathbb{R}^3$, such as Askey's and Wendland's functions, allow for the direct substitution of the Euclidean distance by the great circle distance on a one-, two- or three-dimensional sphere, as opposed to the traditional approach, where the distances are transformed into each other. Completely monotone functions are positive definite on spheres of any dimension and provide rich parametric classes of such functions, including members of the powered exponential, Matern, generalized Cauchy and Dagum families. The sine power family permits a continuous parameterization of the roughness of the sample paths of a Gaussian process. A collection of research problems provides challenges for future work in mathematical analysis, probability theory and spatial statistics.

...read moreread less

257 citations

Journal Article•DOI•

Dequantizing Compressed Sensing: When Oversampling and Non-Gaussian Constraints Combine

[...]

Laurent Jacques¹, David K. Hammond², Jalal M. Fadili³•Institutions (3)

Université catholique de Louvain¹, University of Oregon², Centre national de la recherche scientifique³

01 Jan 2011-IEEE Transactions on Information Theory

TL;DR: Basis Pursuit DeQuantizer of moment p (BPDQp) as discussed by the authors is a new convex optimization program for recovering sparse or compressible signals from uniformly quantized measurements, which minimizes the sparsity of the signal to be reconstructed subject to a data fidelity constraint expressed in the lp-norm of the residual error for 2 ≤ p ≤ ∞.

...read moreread less

Abstract: In this paper, we study the problem of recovering sparse or compressible signals from uniformly quantized measurements. We present a new class of convex optimization programs, or decoders, coined Basis Pursuit DeQuantizer of moment p (BPDQp), that model the quantization distortion more faithfully than the commonly used Basis Pursuit DeNoise (BPDN) program. Our decoders proceed by minimizing the sparsity of the signal to be reconstructed subject to a data-fidelity constraint expressed in the lp-norm of the residual error for 2 ≤ p ≤ ∞. We show theoretically that, (i) the reconstruction error of these new decoders is bounded if the sensing matrix satisfies an extended Restricted Isometry Property involving the Iρ norm, and (ii), for Gaussian random matrices and uniformly quantized measurements, BPDQp performance exceeds that of BPDN by dividing the reconstruction error due to quantization by √(p + 1). This last effect happens with high probability when the number of measurements exceeds a value growing with p, i.e., in an oversampled situation compared to what is commonly required by BPDN = BPDQ2. To demonstrate the theoretical power of BPDQp, we report numerical simulations on signal and image reconstruction problems.

...read moreread less

Proceedings Article•

Variational Heteroscedastic Gaussian Process Regression

[...]

Michalis K. Titsias¹, Miguel L zaro-gredilla²•Institutions (2)

University of Manchester¹, University of Cantabria²

28 Jun 2011

TL;DR: This work presents a non-standard variational approximation that allows accurate inference in heteroscedastic GPs (i.e., under input-dependent noise conditions) and its effectiveness is illustrated on several synthetic and real datasets of diverse characteristics.

...read moreread less

Abstract: Standard Gaussian processes (GPs) model observations' noise as constant throughout input space. This is often a too restrictive assumption, but one that is needed for GP inference to be tractable. In this work we present a non-standard variational approximation that allows accurate inference in heteroscedastic GPs (i.e., under input-dependent noise conditions). Computational cost is roughly twice that of the standard GP, and also scales as O(n3). Accuracy is verified by comparing with the golden standard MCMC and its effectiveness is illustrated on several synthetic and real datasets of diverse characteristics. An application to volatility forecasting is also considered.

...read moreread less

Proceedings Article•

Portfolio allocation for Bayesian optimization

[...]

Matthew W. Hoffman¹, Eric Brochu¹, Nando de Freitas¹•Institutions (1)

University of British Columbia¹

14 Jul 2011

TL;DR: In this article, the authors adopt a portfolio of acquisition functions governed by an online multi-armed bandit strategy, and propose several portfolio strategies, the best of which is called GP-Hedge, and show that this method outperforms the best individual acquisition function.

...read moreread less

Abstract: Bayesian optimization with Gaussian processes has become an increasingly popular tool in the machine learning community. It is efficient and can be used when very little is known about the objective function, making it popular in expensive black-box optimization scenarios. It uses Bayesian methods to sample the objective efficiently using an acquisition function which incorporates the posterior estimate of the objective. However, there are several different parameterized acquisition functions in the literature, and it is often unclear which one to use. Instead of using a single acquisition function, we adopt a portfolio of acquisition functions governed by an online multi-armed bandit strategy. We propose several portfolio strategies, the best of which we call GP-Hedge, and show that this method outperforms the best individual acquisition function. We also provide a theoretical bound on the algorithm's performance.

...read moreread less

Proceedings Article•DOI•

Efficient, generalized indoor WiFi GraphSLAM

[...]

Joseph Huang, David Millman, Morgan Quigley¹, David Stavens¹, Sebastian Thrun¹, Alok Aggarwal - Show less +2 more•Institutions (1)

Stanford University¹

09 May 2011

TL;DR: This work presents a GraphSLAM-like algorithm for signal strength SLAM, which shares many of the benefits of Gaussian processes, yet is viable for a broader range of environments since it makes no signature uniqueness assumptions.

...read moreread less

Abstract: The widespread deployment of wireless networks presents an opportunity for localization and mapping using only signal-strength measurements. The current state of the art is to use Gaussian process latent variable models (GP-LVM). This method works well, but relies on a signature uniqueness assumption which limits its applicability to only signal-rich environments. Moreover, it does not scale computationally to large sets of data, requiring O (N3) operations per iteration. We present a GraphSLAM-like algorithm for signal strength SLAM. Our algorithm shares many of the benefits of Gaussian processes, yet is viable for a broader range of environments since it makes no signature uniqueness assumptions. It is also more tractable to larger map sizes, requiring O (N2) operations per iteration. We compare our algorithm to a laser-SLAM ground truth, showing it produces excellent results in practice.

...read moreread less

Journal Article•DOI•

Prediction error identification of linear systems: A nonparametric Gaussian regression approach

[...]

Gianluigi Pillonetto¹, Alessandro Chiuso¹, Giuseppe De Nicolao²•Institutions (2)

University of Padua¹, University of Pavia²

01 Feb 2011-Automatica

TL;DR: A novel Bayesian paradigm for the identification of output error models is applied to the design of optimal predictors and discrete-time models based on prediction error minimization by interpreting the predictor impulse responses as realizations of Gaussian processes.

...read moreread less

Journal Article•DOI•

Robust Shrinkage Estimation of High-Dimensional Covariance Matrices

[...]

Yilun Chen¹, Ami Wiesel², Alfred O. Hero¹•Institutions (2)

University of Michigan¹, Hebrew University of Jerusalem²

01 Sep 2011-IEEE Transactions on Signal Processing

TL;DR: This work addresses high dimensional covariance estimation for elliptical distributed samples, which are also known as spherically invariant random vectors (SIRV) or compound-Gaussian processes and proposes a simple, closed-form and data dependent choice for the shrinkage coefficient, based on a minimum mean squared error framework.

...read moreread less

Abstract: We address high dimensional covariance estimation for elliptical distributed samples, which are also known as spherically invariant random vectors (SIRV) or compound-Gaussian processes Specifically we consider shrinkage methods that are suitable for high dimensional problems with a small number of samples (large p small n) We start from a classical robust covariance estimator [Tyler (1987)], which is distribution-free within the family of elliptical distribution but inapplicable when n <; p Using a shrinkage coefficient, we regularize Tyler's fixed-point iterations We prove that, for all n and p , the proposed fixed-point iterations converge to a unique limit regardless of the initial condition Next, we propose a simple, closed-form and data dependent choice for the shrinkage coefficient, which is based on a minimum mean squared error framework Simulations demonstrate that the proposed method achieves low estimation error and is robust to heavy-tailed samples Finally, as a real-world application we demonstrate the performance of the proposed technique in the context of activity/intrusion detection using a wireless sensor network

...read moreread less

Proceedings Article•

Gaussian Process Training with Input Noise

[...]

Andrew Mchutchon¹, Carl Edward Rasmussen¹•Institutions (1)

University of Cambridge¹

12 Dec 2011

TL;DR: This work presents a simple yet effective GP model for training on input points corrupted by i.i.d. Gaussian noise, and compares it to others over a range of different regression problems and shows that it improves over current methods.

...read moreread less

Abstract: In standard Gaussian Process regression input locations are assumed to be noise free. We present a simple yet effective GP model for training on input points corrupted by i.i.d. Gaussian noise. To make computations tractable we use a local linear expansion about each input point. This allows the input noise to be recast as output noise proportional to the squared gradient of the GP posterior mean. The input noise variances are inferred from the data as extra hyperparameters. They are trained alongside other hyperparameters by the usual method of maximisation of the marginal likelihood. Training uses an iterative scheme, which alternates between optimising the hyperparameters and calculating the posterior gradient. Analytic predictive moments can then be found for Gaussian distributed test points. We compare our model to others over a range of different regression problems and show that it improves over current methods.

...read moreread less

Book Chapter•DOI•

On Some Global Measures of the Deviations of Density Function Estimates

[...]

Richard A. Davis¹, Keh-Shin Lii², Dimitris N. Politis³•Institutions (3)

Columbia University¹, University of California², University of California, San Diego³

01 Jan 2011

TL;DR: In this paper, the authors consider density estimates of the usual type generated by a weight function and obtain limit theorems for the maximum of the normalized deviation of the estimate from its expected value, and for quadratic norms of the same quantity.

...read moreread less

Abstract: We consider density estimates of the usual type generated by a weight function. Limt theorems are obtained for the maximum of the normalized deviation of the estimate from its expected value, and for quadratic norms of the same quantity. Using these results we study the behavior of tests of goodness-of-fit and confidence regions based on these statistics. In particular, we obtain a procedure which uniformly improves the chi-square goodness-of-fit test when the number of observations and cells is large and yet remains insensitive to the estimation of nuisance parameters. A new limit theorem for the maximum absolute value of a type of nonstationary Gaussian process is also proved.

...read moreread less

Journal Article•DOI•

iLogDemons: A Demons-Based Registration Algorithm for Tracking Incompressible Elastic Biological Tissues

[...]

Tommaso Mansi¹, Xavier Pennec¹, Maxime Sermesant¹, Hervé Delingette¹, Nicholas Ayache¹ - Show less +1 more•Institutions (1)

French Institute for Research in Computer Science and Automation¹

01 Mar 2011-International Journal of Computer Vision

TL;DR: This work improves the logDemons by integrating elasticity and incompressibility for soft-tissue tracking, and replaces the Gaussian smoothing by an efficient elastic-like regulariser based on isotropic differential quadratic forms of vector fields.

...read moreread less

Abstract: Tracking soft tissues in medical images using non-linear image registration algorithms requires methods that are fast and provide spatial transformations consistent with the biological characteristics of the tissues. LogDemons algorithm is a fast non-linear registration method that computes diffeomorphic transformations parameterised by stationary velocity fields. Although computationally efficient, its use for tissue tracking has been limited because of its ad-hoc Gaussian regularisation, which hampers the implementation of more biologically motivated regularisations. In this work, we improve the logDemons by integrating elasticity and incompressibility for soft-tissue tracking. To that end, a mathematical justification of demons Gaussian regularisation is proposed. Building on this result, we replace the Gaussian smoothing by an efficient elastic-like regulariser based on isotropic differential quadratic forms of vector fields. The registration energy functional is finally minimised under the divergence-free constraint to get incompressible deformations. As the elastic regulariser and the constraint are linear, the method remains computationally tractable and easy to implement. Tests on synthetic incompressible deformations showed that our approach outperforms the original logDemons in terms of elastic incompressible deformation recovery without reducing the image matching accuracy. As an application, we applied the proposed algorithm to estimate 3D myocardium strain on clinical cine MRI of two adult patients. Results showed that incompressibility constraint improves the cardiac motion recovery when compared to the ground truth provided by 3D tagged MRI.

...read moreread less

Proceedings Article•

Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning

[...]

Miguel Lázaro-Gredilla¹, Michalis K. Titsias²•Institutions (2)

Complutense University of Madrid¹, University of Manchester²

12 Dec 2011

TL;DR: A variational Bayesian inference algorithm which can be widely applied to sparse linear models and is based on the spike and slab prior, which is the golden standard for sparse inference is introduced.

...read moreread less

Abstract: We introduce a variational Bayesian inference algorithm which can be widely applied to sparse linear models. The algorithm is based on the spike and slab prior which, from a Bayesian perspective, is the golden standard for sparse inference. We apply the method to a general multi-task and multiple kernel learning model in which a common set of Gaussian process functions is linearly combined with task-specific sparse weights, thus inducing relation between tasks. This model unifies several sparse linear models, such as generalized linear models, sparse factor analysis and matrix factorization with missing values, so that the variational algorithm can be applied to all these cases. We demonstrate our approach in multi-output Gaussian process regression, multi-class classification, image processing applications and collaborative filtering.

...read moreread less

Proceedings Article•DOI•

Online domain adaptation of a pre-trained cascade of classifiers

[...]

Vidit Jain¹, Erik Learned-Miller²•Institutions (2)

Yahoo!¹, University of Massachusetts Amherst²

20 Jun 2011

TL;DR: This work presents an on-line approach for rapidly adapting a “black box” classifier to a new test data set without retraining the classifier or examining the original optimization criterion.

...read moreread less

Abstract: Many classifiers are trained with massive training sets only to be applied at test time on data from a different distribution. How can we rapidly and simply adapt a classifier to a new test distribution, even when we do not have access to the original training data? We present an on-line approach for rapidly adapting a “black box” classifier to a new test data set without retraining the classifier or examining the original optimization criterion. Assuming the original classifier outputs a continuous number for which a threshold gives the class, we reclassify points near the original boundary using a Gaussian process regression scheme. We show how this general procedure can be used in the context of a classifier cascade, demonstrating performance that far exceeds state-of-the-art results in face detection on a standard data set. We also draw connections to work in semi-supervised learning, domain adaptation, and information regularization.

...read moreread less

Journal Article•DOI•

Robust Detection of Abandoned and Removed Objects in Complex Surveillance Videos

[...]

Yingli Tian¹, Rogerio Feris¹, Haowei Liu², Arun Hampapur¹, Ming-Ting Sun² - Show less +1 more•Institutions (2)

IBM¹, University of Washington²

01 Sep 2011

TL;DR: A new framework to robustly and efficiently detect abandoned and removed objects based on background subtraction (BGS) and foreground analysis with complement of tracking to reduce false positives is presented.

...read moreread less

Abstract: Tracking-based approaches for abandoned object detection often become unreliable in complex surveillance videos due to occlusions, lighting changes, and other factors. We present a new framework to robustly and efficiently detect abandoned and removed objects based on background subtraction (BGS) and foreground analysis with complement of tracking to reduce false positives. In our system, the background is modeled by three Gaussian mixtures. In order to handle complex situations, several improvements are implemented for shadow removal, quick-lighting change adaptation, fragment reduction, and keeping a stable update rate for video streams with different frame rates. Then, the same Gaussian mixture models used for BGS are employed to detect static foreground regions without extra computation cost. Furthermore, the types of the static regions (abandoned or removed) are determined by using a method that exploits context information about the foreground masks, which significantly outperforms previous edge-based techniques. Based on the type of the static regions and user-defined parameters (e.g., object size and abandoned time), a matching method is proposed to detect abandoned and removed objects. A person-detection process is also integrated to distinguish static objects from stationary people. The robustness and efficiency of the proposed method is tested on IBM Smart Surveillance Solutions for public safety applications in big cities and evaluated by several public databases, such as The Image library for intelligent detection systems (i-LIDS) and IEEE Performance Evaluation of Tracking and Surveillance Workshop (PETS) 2006 datasets. The test and evaluation demonstrate our method is efficient to run in real-time, while being robust to quick-lighting changes and occlusions in complex environments.

...read moreread less

Proceedings Article•DOI•

Gaussian process regression flow for analysis of motion trajectories

[...]

Kihwan Kim¹, Dongryeol Lee¹, Irfan Essa¹•Institutions (1)

Georgia Institute of Technology¹

06 Nov 2011

TL;DR: This paper model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian Process Regression and introduces a random sampling strategy for learning stable classes of motions from limited data.

...read moreread less

Abstract: Recognition of motions and activities of objects in videos requires effective representations for analysis and matching of motion trajectories. In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian Process Regression. Furthermore, we introduce a random sampling strategy for learning stable classes of motions from limited data. Our representation allows for incrementally predicting possible paths and detecting anomalous events from online trajectories. This representation also supports matching of complex motions with acceleration changes and pauses or stops within a trajectory. We use the proposed approach for classifying and predicting motion trajectories in traffic monitoring domains and test on several data sets. We show that our approach works well on various types of complete and incomplete trajectories from a variety of video data sets with different frame rates.

...read moreread less

Journal Article•DOI•

Universality of Wigner random matrices: a survey of recent results

[...]

Laszlo Erdős

30 Jun 2011-Russian Mathematical Surveys

TL;DR: In this article, the authors studied the universality of spectral statistics for Wigner matrices with independent identically distributed entries, where the probability distribution of each matrix element is given by a measure with zero expectation and with subexponential decay.

...read moreread less

Abstract: This is a study of the universality of spectral statistics for large random matrices. Considered are symmetric, Hermitian, or quaternion self-dual random matrices with independent identically distributed entries (Wigner matrices), where the probability distribution of each matrix element is given by a measure with zero expectation and with subexponential decay. The main result is that the correlation functions of the local eigenvalue statistics in the bulk of the spectrum coincide with those of the Gaussian Orthogonal Ensemble (GOE), the Gaussian Unitary Ensemble (GUE), and the Gaussian Symplectic Ensemble (GSE), respectively, in the limit as . This approach is based on a study of the Dyson Brownian motion via a related new dynamics, the local relaxation flow. As a main input, it is established that the density of the eigenvalues converges to the Wigner semicircle law, and this holds even down to the smallest possible scale. Moreover, it is shown that the eigenvectors are completely delocalized. These results hold even without the condition that the matrix elements are identically distributed: only independence is used. In fact, for the matrix elements of the Green function strong estimates are given that imply that the local statistics of any two ensembles in the bulk are identical if the first four moments of the matrix elements match. Universality at the spectral edges requires matching only two moments. A Wigner-type estimate is also proved, and it is shown that the eigenvalues repel each other on arbitrarily small scales. Bibliography: 108 titles.

...read moreread less

Journal Article•DOI•

Information Rates of Nonparametric Gaussian Process Methods

[...]

Aad van der Vaart, Harry van Zanten

01 Feb 2011-Journal of Machine Learning Research

TL;DR: The results show that for good performance, the regularity of the GP prior should match the regularities of the unknown response function, and is expressible in a certain concentration function.

...read moreread less

Abstract: We consider the quality of learning a response function by a nonparametric Bayesian approach using a Gaussian process (GP) prior on the response function. We upper bound the quadratic risk of the learning procedure, which in turn is an upper bound on the Kullback-Leibler information between the predictive and true data distribution. The upper bound is expressed in small ball probabilities and concentration measures of the GP prior. We illustrate the computation of the upper bound for the Matern and squared exponential kernels. For these priors the risk, and hence the information criterion, tends to zero for all continuous response functions. However, the rate at which this happens depends on the combination of true response function and Gaussian prior, and is expressible in a certain concentration function. In particular, the results show that for good performance, the regularity of the GP prior should match the regularity of the unknown response function.

...read moreread less

Journal Article•DOI•

Multivariate online kernel density estimation with Gaussian kernels

[...]

Matej Kristan¹, Ales Leonardis¹, Danijel Skočaj¹•Institutions (1)

University of Ljubljana¹

01 Oct 2011-Pattern Recognition

TL;DR: In this article, the authors proposed an online kernel density estimation (KDE) method, which maintains and updates a non-parametric model of the observed data, from which the KDE can be calculated.

...read moreread less

Proceedings Article•DOI•

Gaussian process implicit surfaces for shape estimation and grasping

[...]

Stanimir Dragiev¹, Marc Toussaint¹, Michael Gienger²•Institutions (2)

Technical University of Berlin¹, Honda²

09 May 2011

TL;DR: This work considers Gaussian process implicit surface potentials as object shape representations and validates the shape estimation using Gaussian processes in a simulation on randomly sampled shapes and the grasp controller on a real robot with 7 doF arm and 7DoF hand.

...read moreread less

Abstract: The choice of an adequate object shape representation is critical for efficient grasping and robot manipulation. A good representation has to account for two requirements: it should allow uncertain sensory fusion in a probabilistic way and it should serve as a basis for efficient grasp and motion generation. We consider Gaussian process implicit surface potentials as object shape representations. Sensory observations condition the Gaussian process such that its posterior mean defines an implicit surface which becomes an estimate of the object shape. Uncertain visual, haptic and laser data can equally be fused in the same Gaussian process shape estimate. The resulting implicit surface potential can then be used directly as a basis for a reach and grasp controller, serving as an attractor for the grasp end-effectors and steering the orientation of contact points. Our proposed controller results in a smooth reach and grasp trajectory without strict separation of phases. We validate the shape estimation using Gaussian processes in a simulation on randomly sampled shapes and the grasp controller on a real robot with 7DoF arm and 7DoF hand.

...read moreread less

Journal Article•DOI•

A Bayesian nonparametric approach to modeling motion patterns

[...]

Joshua Joseph¹, Finale Doshi-Velez¹, Albert S. Huang¹, Nicholas Roy¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Nov 2011-Autonomous Robots

TL;DR: This work proposes modeling target motion patterns as a mixture of Gaussian processes with a Dirichlet process prior over mixture weights, which provides an adaptive representation for each individual motion pattern and automatically adjusts the complexity of the motion model based on the available data.

...read moreread less

Abstract: The most difficult--and often most essential--aspect of many interception and tracking tasks is constructing motion models of the targets Experts rarely can provide complete information about a target's expected motion pattern, and fitting parameters for complex motion patterns can require large amounts of training data Specifying how to parameterize complex motion patterns is in itself a difficult task In contrast, Bayesian nonparametric models of target motion are very flexible and generalize well with relatively little training data We propose modeling target motion patterns as a mixture of Gaussian processes (GP) with a Dirichlet process (DP) prior over mixture weights The GP provides an adaptive representation for each individual motion pattern, while the DP prior allows us to represent an unknown number of motion patterns Both automatically adjust the complexity of the motion model based on the available data Our approach outperforms several parametric models on a helicopter-based car-tracking task on data collected from the greater Boston area

...read moreread less

Journal Article•DOI•

Measurement Matrix Design for Compressive Sensing–Based MIMO Radar

[...]

Yao Yu¹, Athina P. Petropulu¹, H.V. Poor²•Institutions (2)

Rutgers University¹, Princeton University²

01 Nov 2011-IEEE Transactions on Signal Processing

TL;DR: Simulations indicate that the proposed measurement matrices can improve detection accuracy as compared to a GRMM and can significantly improve the SIR while maintaining a CSM comparable to that of the Gaussian random measurement matrix (GRMM).

...read moreread less

Abstract: In colocated multiple-input multiple-output (MIMO) radar using compressive sensing (CS), a receive node compresses its received signal via a linear transformation, referred to as a measurement matrix. The samples are subsequently forwarded to a fusion center, where an l1-optimization problem is formulated and solved for target information. CS-based MIMO radar exploits target sparsity in the angle-Doppler-range space and thus achieves the high localization performance of traditional MIMO radar but with significantly fewer measurements. The measurement matrix affects the recovery performance. A random Gaussian measurement matrix, typically used in CS problems, does not necessarily result in the best possible detection performance for the basis matrix corresponding to the MIMO radar scenario. This paper considers optimal measurement matrix design with the optimality criterion depending on the coherence of the sensing matrix (CSM) and/or signal-to-interference ratio (SIR). Two approaches are proposed: the first one minimizes a linear combination of CSM and the inverse SIR, and the second one imposes a structure on the measurement matrix and determines the parameters involved so that the SIR is enhanced. Depending on the transmit waveforms, the second approach can significantly improve the SIR, while maintaining a CSM comparable to that of the Gaussian random measurement matrix (GRMM). Simulations indicate that the proposed measurement matrices can improve detection accuracy as compared to a GRMM.

...read moreread less

Journal Article•DOI•

Geometric and Dynamical Models of Reverberation Mapping Data

[...]

Anna Pancoast¹, Brendon J. Brewer¹, Tommaso Treu•Institutions (1)

University of California, Santa Barbara¹

01 Apr 2011-The Astrophysical Journal

TL;DR: In this article, the authors present a general method to analyze reverberation (or echo) mapping data that simultaneously provides estimates for the black hole mass and for the geometry and dynamics of the broad line region (BLR) in active galactic nuclei (AGNs).

...read moreread less

Abstract: We present a general method to analyze reverberation (or echo) mapping data that simultaneously provides estimates for the black hole mass and for the geometry and dynamics of the broad-line region (BLR) in active galactic nuclei (AGNs). While previous methods yield a typical scale size of the BLR or a reconstruction of the transfer function, our method directly infers the spatial and velocity distribution of the BLR from the data, from which a transfer function can be easily derived. Previous echo mapping analysis requires an independent estimate of a scaling factor known as the virial coefficient to infer the mass of the black hole, but this is not needed in our more direct approach. We use the formalism of Bayesian probability theory and implement a Markov Chain Monte Carlo algorithm to obtain estimates and uncertainties for the parameters of our BLR models. Fitting of models to the data requires knowledge of the continuum flux at all times, not just the measured times. We use Gaussian Processes to interpolate and extrapolate the continuum light curve data in a fully consistent probabilistic manner, taking the associated errors into account. We illustrate our method using simple models of BLR geometry and dynamics and show that we can recover the parameter values of our test systems with realistic uncertainties that depend upon the variability of the AGN and the quality of the reverberation mapping observing campaign. With a geometry model we can recover the mean radius of the BLR to within ~0.1 dex random uncertainty for simulated data with an integrated line flux uncertainty of 1.5%, while with a dynamical model we can recover the black hole mass and the mean radius to within ~0.05 dex random uncertainty, for simulated data with a line profile average signal-to-noise ratio of 4 per spectral pixel. These uncertainties do not include modeling errors, which are likely to be present in the analysis of real data, and should therefore be considered as lower limits to the accuracy of the method.

...read moreread less

Collapse