Author

Sara Martino

Bio: Sara Martino is an academic researcher from the Norwegian University of Science and Technology. Her research focuses on Bayesian inference and Laplace's method. She has an h-index of 13 and has co-authored 31 publications receiving 4,387 citations. Previous affiliations of Sara Martino include Stazione Zoologica Anton Dohrn and SINTEF.

Papers
Journal ArticleDOI
TL;DR: This work considers approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, in which the latent field is Gaussian, controlled by a few hyperparameters, and observed through non-Gaussian response variables; very accurate approximations to the posterior marginals can be computed directly.
Abstract: Structured additive regression models are perhaps the most commonly used class of models in statistical applications. It includes, among others, (generalized) linear models, (generalized) additive models, smoothing spline models, state space models, semiparametric regression, spatial and spatiotemporal models, log-Gaussian Cox processes and geostatistical and geoadditive models. We consider approximate Bayesian inference in a popular subset of structured additive regression models, latent Gaussian models, where the latent field is Gaussian, controlled by a few hyperparameters and with non-Gaussian response variables. The posterior marginals are not available in closed form owing to the non-Gaussian response variables. For such models, Markov chain Monte Carlo methods can be implemented, but they are not without problems, in terms of both convergence and computational time. In some practical applications, the extent of these problems is such that Markov chain Monte Carlo sampling is simply not an appropriate tool for routine analysis. We show that, by using an integrated nested Laplace approximation and its simplified version, we can directly compute very accurate approximations to the posterior marginals. The main benefit of these approximations is computational: where Markov chain Monte Carlo algorithms need hours or days to run, our approximations provide more precise estimates in seconds or minutes. Another advantage with our approach is its generality, which makes it possible to perform Bayesian analysis in an automatic, streamlined way, and to compute model comparison criteria and various predictive measures so that models can be compared and the model under study can be challenged.
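The Laplace idea behind these approximations can be sketched on a toy one-dimensional problem (an invented example, not code from the paper): a single Poisson observation with a Gaussian prior on the log-rate, where the posterior mode and curvature give a Gaussian approximation whose normalizing constant can be checked against brute-force quadrature.

```python
import numpy as np
from scipy import optimize

# Toy latent Gaussian model (hypothetical example):
#   y ~ Poisson(exp(x)),  prior x ~ N(0, 1)
y = 7.0

def neg_log_post(x):
    # negative unnormalised log posterior: -(log p(y|x) + log p(x)),
    # with constants not depending on x dropped
    return -(y * x - np.exp(x) - 0.5 * x**2)

# 1) find the posterior mode x* (the function is convex in x)
x_star = optimize.minimize_scalar(neg_log_post).x

# 2) curvature at the mode gives the Gaussian variance:
#    d^2/dx^2 log post = -exp(x) - 1
var = 1.0 / (np.exp(x_star) + 1.0)

# Laplace approximation to the (unnormalised) evidence:
#   integral ~= exp(l(x*)) * sqrt(2*pi*var)
log_evidence = -neg_log_post(x_star) + 0.5 * np.log(2 * np.pi * var)

# sanity check against a brute-force Riemann sum on a grid
grid = np.linspace(-5.0, 5.0, 20001)
quad = np.sum(np.exp(-neg_log_post(grid))) * (grid[1] - grid[0])
print(x_star, var, abs(np.exp(log_evidence) - quad) / quad)
```

Even in this crude form the Laplace evidence agrees with quadrature to within a few percent; INLA's nested scheme refines exactly this kind of approximation for each marginal.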

4,164 citations

01 Jan 2007
TL;DR: The approximation tool for latent GMRF models is introduced and the approximation for the posterior of the hyperparameters θ in equation (1) is shown to give extremely accurate results in a fraction of the computing time used by MCMC algorithms.
Abstract: This thesis consists of five papers, presented in chronological order. Their content is summarised in this section.

Paper I introduces the approximation tool for latent GMRF models and discusses, in particular, the approximation for the posterior of the hyperparameters θ in equation (1). It is shown that this approximation is indeed very accurate, as even long MCMC runs cannot detect any error in it. A Gaussian approximation to the density of χi|θ, y is also discussed. This appears to give reasonable results and is very fast to compute. However, slight errors are detected when comparing the approximation with long MCMC runs. These are mostly due to the fact that a possibly skewed density is approximated by a symmetric one. Paper I also presents some details about sparse-matrix algorithms.

The core of the thesis is presented in Paper II. Here most of the remaining issues present in Paper I are solved. Three different approximations for χi|θ, y, with different degrees of accuracy and computational cost, are described. Moreover, ways to assess the approximation error and considerations about the asymptotic behaviour of the approximations are also discussed. Through a series of examples covering a wide range of commonly used latent GMRF models, the approximations are shown to give extremely accurate results in a fraction of the computing time used by MCMC algorithms.

Paper III applies the same ideas as Paper II to generalised linear mixed models where χ represents a latent variable at n spatial sites on a two-dimensional domain. Of these n sites, k (with n >> k) are observed through data. The n sites are assumed to be on a regular grid and wrapped on a torus. For the class of models described in Paper III the computations are based on the discrete Fourier transform instead of sparse matrices. Paper III also illustrates how the marginal likelihood π(y) can be approximated, provides approximate strategies for Bayesian outlier detection, and performs approximate evaluation of spatial experimental design.

Paper IV presents yet another application of the ideas in Paper II. Here approximate techniques are used for inference on multivariate stochastic volatility models, a class of models widely used in financial applications. Paper IV also discusses problems deriving from the increased dimension of the parameter vector θ, a condition which makes numerical integration more computationally intensive. Different approximations for the posterior marginals of the parameters, π(θi|y), are also introduced. Approximations to the marginal likelihood π(y) are used to perform model comparison.

Finally, Paper V is a manual for a program, named inla, which implements all the approximations described in Paper II. A large series of worked-out examples, covering many well-known models, illustrates the use and performance of the inla program. This program is a valuable instrument, since it makes most of the Bayesian inference techniques described in this thesis easily available to everyone.
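The sparse-matrix point can be illustrated with a small sketch (not the thesis code): the precision matrix of a first-order random-walk GMRF is tridiagonal, so factorising and solving with it costs roughly O(n) rather than the dense O(n^3). The jitter term below is an assumption added just to make the toy matrix invertible.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# First-order random-walk (RW1) precision matrix: tridiagonal structure
n = 1000
main = 2.0 * np.ones(n)
main[0] = main[-1] = 1.0                 # free boundary conditions
off = -1.0 * np.ones(n - 1)
Q = sp.diags([off, main, off], [-1, 0, 1], format="csc")
Q = Q + 1e-4 * sp.eye(n, format="csc")   # small jitter -> proper (invertible) GMRF

lu = splu(Q)                             # sparse LU exploits the banded pattern
b = np.random.default_rng(0).normal(size=n)
x = lu.solve(b)                          # e.g. a conditional-mean computation
print(np.allclose(Q @ x, b))
```

The factor stays banded here, which is why GMRF computations of this kind remain fast even for large n.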

320 citations

Journal ArticleDOI
TL;DR: It is conjectured that for many hierarchical GMRF models there is really no need for MCMC-based inference to estimate marginal densities; by making use of numerical methods for sparse matrices, the computational cost of these deterministic schemes is negligible compared with the MCMC alternative.

216 citations

Journal ArticleDOI
TL;DR: This article shows how a new inferential tool named integrated nested Laplace approximations can be adapted and applied to many survival models, making Bayesian analysis both fast and accurate without having to rely on MCMC-based inference.
Abstract: Bayesian analysis of time-to-event data, usually called survival analysis, has received increasing attention in recent years. In Cox-type models it allows the use of information from the full likelihood instead of a partial likelihood, so that the baseline hazard function and the model parameters can be jointly estimated. In general, Bayesian methods permit full and exact posterior inference for any parameter or predictive quantity of interest. On the other hand, Bayesian inference often relies on Markov chain Monte Carlo (MCMC) techniques which, from the user's point of view, may appear slow at delivering answers. In this article, we show how a new inferential tool named integrated nested Laplace approximations can be adapted and applied to many survival models, making Bayesian analysis both fast and accurate without having to rely on MCMC-based inference.

101 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This textbook covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, and combining models.
Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Book
24 Aug 2012
TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

8,059 citations

Journal ArticleDOI

6,278 citations

Journal ArticleDOI
Simon N. Wood
TL;DR: In this article, a Laplace approximation is used to obtain an approximate restricted maximum likelihood (REML) or marginal likelihood (ML) for smoothing parameter selection in semiparametric regression.
Abstract: Summary. Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selection in semiparametric regression. However, existing REML or marginal likelihood (ML) based methods for semiparametric generalized linear models (GLMs) use iterative REML or ML estimation of the smoothing parameters of working linear approximations to the GLM. Such indirect schemes need not converge and fail to do so in a non-negligible proportion of practical analyses. By contrast, very reliable prediction error criteria smoothing parameter selection methods are available, based on direct optimization of GCV, or related criteria, for the GLM itself. Since such methods directly optimize properly defined functions of the smoothing parameters, they have much more reliable convergence properties. The paper develops the first such method for REML or ML estimation of smoothing parameters. A Laplace approximation is used to obtain an approximate REML or ML for any GLM, which is suitable for efficient direct optimization. This REML or ML criterion requires that Newton–Raphson iteration, rather than Fisher scoring, be used for GLM fitting, and a computationally stable approach to this is proposed. The REML or ML criterion itself is optimized by a Newton method, with the derivatives required obtained by a mixture of implicit differentiation and direct methods. The method will cope with numerical rank deficiency in the fitted model and in fact provides a slight improvement in numerical robustness on the earlier method of Wood for prediction error criteria based smoothness selection. 
Simulation results suggest that the new REML and ML methods offer some improvement in mean-square error performance relative to GCV or Akaike's information criterion in most cases, without the small number of severe undersmoothing failures to which Akaike's information criterion and GCV are prone. This is achieved at the same computational cost as GCV or Akaike's information criterion. The new approach also eliminates the convergence failures of previous REML- or ML-based approaches for penalized GLMs and usually has lower computational cost than these alternatives. Example applications are presented in adaptive smoothing, scalar on function regression and generalized additive model selection.
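The idea of selecting a smoothing parameter by directly optimizing a properly defined criterion can be sketched on a toy problem. The example below uses GCV for plain ridge regression with invented data; Wood's paper develops the analogous direct REML/ML criterion for penalized GLMs, which this sketch does not reproduce.

```python
import numpy as np

# Toy data: linear model with a few nonzero coefficients (invented example)
rng = np.random.default_rng(1)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]
y = X @ beta + rng.normal(scale=1.0, size=n)

def gcv(lam):
    # Hat matrix A = X (X'X + lam*I)^{-1} X';  GCV = n*RSS / (n - tr A)^2
    XtX = X.T @ X
    A = X @ np.linalg.solve(XtX + lam * np.eye(p), X.T)
    resid = y - A @ y
    return n * (resid @ resid) / (n - np.trace(A)) ** 2

# Direct optimization of the criterion itself (grid search for clarity;
# in practice a Newton method on log(lam) would be used)
lams = np.exp(np.linspace(-6, 6, 61))
scores = [gcv(l) for l in lams]
lam_hat = lams[int(np.argmin(scores))]
print(lam_hat)
```

Because the criterion is a well-defined function of the smoothing parameter, its optimization has the reliable convergence behaviour the abstract contrasts with indirect working-model REML schemes.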

4,846 citations

Journal ArticleDOI
TL;DR: The glmmTMB package fits many types of GLMMs and extensions, including models with continuously distributed responses, but here the authors focus on count responses and its ability to estimate the Conway-Maxwell-Poisson distribution parameterized by the mean is unique.
Abstract: Count data can be analyzed using generalized linear mixed models when observations are correlated in ways that require random effects. However, count data are often zero-inflated, containing more zeros than would be expected from the typical error distributions. We present a new package, glmmTMB, and compare it to other R packages that fit zero-inflated mixed models. The glmmTMB package fits many types of GLMMs and extensions, including models with continuously distributed responses, but here we focus on count responses. glmmTMB is faster than glmmADMB, MCMCglmm, and brms, and more flexible than INLA and mgcv for zero-inflated modeling. One unique feature of glmmTMB (among packages that fit zero-inflated mixed models) is its ability to estimate the Conway-Maxwell-Poisson distribution parameterized by the mean. Overall, its most appealing features for new users may be the combination of speed, flexibility, and its interface’s similarity to lme4.
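The zero-inflated Poisson likelihood the abstract refers to can be written down and maximized in a few lines. This is a hand-rolled sketch on simulated data, not glmmTMB (which is an R package and also handles random effects, omitted in this toy).

```python
import numpy as np
from scipy import optimize, special

# Simulate ZIP data: with prob pi the count is a structural zero,
# otherwise Poisson(mu)  (invented example parameters)
rng = np.random.default_rng(42)
n, pi_true, mu_true = 2000, 0.3, 4.0
structural_zero = rng.random(n) < pi_true
y = np.where(structural_zero, 0, rng.poisson(mu_true, size=n))

def nll(params):
    # logit(pi) and log(mu) keep the optimization unconstrained
    pi = 1.0 / (1.0 + np.exp(-params[0]))
    mu = np.exp(params[1])
    # P(Y=0) = pi + (1-pi)*exp(-mu);  P(Y=k) = (1-pi)*Poisson(k; mu)
    log_p0 = np.log(pi + (1.0 - pi) * np.exp(-mu))
    log_pk = np.log1p(-pi) - mu + y * np.log(mu) - special.gammaln(y + 1)
    return -np.sum(np.where(y == 0, log_p0, log_pk))

res = optimize.minimize(nll, x0=[0.0, 0.0], method="Nelder-Mead")
pi_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
mu_hat = np.exp(res.x[1])
print(pi_hat, mu_hat)
```

With 2000 observations the maximum-likelihood estimates land close to the simulating values; packages like glmmTMB add random effects and alternative count distributions on top of exactly this kind of likelihood.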

4,497 citations