Author

W. R. Gilks

Bio: W. R. Gilks is an academic researcher. The author has contributed to research in the topics Statistical model and Statistical inference, has an h-index of 1, and has co-authored 1 publication receiving 654 citations.

Papers
Journal Article
TL;DR: A robust nonlinear full probability model for population pharmacokinetic data is proposed and it is demonstrated that the method enables Bayesian inference for this model, through an analysis of antibiotic administration in new‐born babies.
Abstract: Gibbs sampling is a powerful technique for statistical inference. It involves little more than sampling from full conditional distributions, which can be both complex and computationally expensive to evaluate. Gilks and Wild have shown that in practice full conditionals are often log‐concave, and they proposed a method of adaptive rejection sampling for efficiently sampling from univariate log‐concave distributions. In this paper, to deal with non‐log‐concave full conditional distributions, we generalize adaptive rejection sampling to include a Hastings‐Metropolis algorithm step. One important field of application in which statistical models may lead to non‐log‐concave full conditionals is population pharmacokinetics. Here, the relationship between drug dose and blood or plasma concentration in a group of patients typically is modelled by using nonlinear mixed effects models. Often, the data used for analysis are routinely collected hospital measurements, which tend to be noisy and irregular. Consequently, a robust (t‐distributed) error structure is appropriate to account for outlying observations and/or patients. We propose a robust nonlinear full probability model for population pharmacokinetic data. We demonstrate that our method enables Bayesian inference for this model, through an analysis of antibiotic administration in new‐born babies.
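As a rough illustration of the idea described above (an envelope-based proposal corrected by a Metropolis-Hastings step, so that non-log-concave full conditionals can still be sampled validly), here is a much-simplified Python sketch. The grid-based proposal, function name, and settings are illustrative assumptions, not the paper's adaptive rejection Metropolis sampling algorithm.

```python
import numpy as np

def envelope_mh_step(log_f, x_current, lo, hi, n_grid=50, rng=None):
    """One Metropolis-Hastings update from a grid-based envelope proposal.

    Much-simplified stand-in for adaptive rejection Metropolis sampling:
    the proposal approximates the target on a fixed grid, and the MH
    accept/reject step corrects for the mismatch, so the update remains
    valid even when log_f is not concave. Assumes x_current lies in [lo, hi].
    """
    rng = np.random.default_rng() if rng is None else rng
    edges = np.linspace(lo, hi, n_grid + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    log_w = np.array([log_f(m) for m in mids])
    w = np.exp(log_w - log_w.max())               # unnormalised bin weights
    probs = w / w.sum()

    k = rng.choice(n_grid, p=probs)               # propose: pick a bin, ...
    x_prop = rng.uniform(edges[k], edges[k + 1])  # ... then a point inside it

    def log_q(x):                                 # proposal log-density up to a constant
        j = int(np.clip(np.searchsorted(edges, x) - 1, 0, n_grid - 1))
        return log_w[j]

    # independence Metropolis-Hastings acceptance ratio
    log_alpha = (log_f(x_prop) - log_f(x_current)) - (log_q(x_prop) - log_q(x_current))
    return x_prop if np.log(rng.uniform()) < min(0.0, log_alpha) else x_current
```

Within a Gibbs sampler, one such step per awkward full conditional plays the role that a plain adaptive rejection sampling update plays for log-concave conditionals.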

687 citations


Cited by
Dissertation
01 Jan 2003
TL;DR: A unified variational Bayesian (VB) framework is presented which approximates computations in models with latent variables using a lower bound on the marginal likelihood; the approach is compared to other methods including sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC.
Abstract: The Bayesian framework for machine learning allows for the incorporation of prior knowledge in a coherent way, avoids overfitting problems, and provides a principled basis for selecting between alternative models. Unfortunately the computations required are usually intractable. This thesis presents a unified variational Bayesian (VB) framework which approximates these computations in models with latent variables using a lower bound on the marginal likelihood. Chapter 1 presents background material on Bayesian inference, graphical models, and propagation algorithms. Chapter 2 forms the theoretical core of the thesis, generalising the expectation-maximisation (EM) algorithm for learning maximum likelihood parameters to the VB EM algorithm which integrates over model parameters. The algorithm is then specialised to the large family of conjugate-exponential (CE) graphical models, and several theorems are presented to pave the road for automated VB derivation procedures in both directed and undirected graphs (Bayesian and Markov networks, respectively). Chapters 3–5 derive and apply the VB EM algorithm to three commonly-used and important models: mixtures of factor analysers, linear dynamical systems, and hidden Markov models. It is shown how model selection tasks such as determining the dimensionality, cardinality, or number of variables are possible using VB approximations. Also explored are methods for combining sampling procedures with variational approximations, to estimate the tightness of VB bounds and to obtain more effective sampling algorithms. Chapter 6 applies VB learning to a long-standing problem of scoring discrete-variable directed acyclic graphs, and compares the performance to annealed importance sampling amongst other methods. Throughout, the VB approximation is compared to other methods including sampling, Cheeseman-Stutz, and asymptotic approximations such as BIC. The thesis concludes with a discussion of evolving directions for model selection including infinite models and alternative approximations to the marginal likelihood.
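To make the variational idea concrete, here is a minimal mean-field VB example for a single Gaussian with unknown mean and precision. This is a standard textbook toy problem, not the thesis's conjugate-exponential VB-EM derivations, and the priors and update order are assumptions chosen for illustration.

```python
import numpy as np

def vb_gaussian(x, mu0=0.0, lambda0=1.0, a0=1.0, b0=1.0, n_iter=50):
    """Mean-field VB for x_i ~ N(mu, 1/tau) with conjugate priors
    mu | tau ~ N(mu0, 1/(lambda0*tau)) and tau ~ Gamma(a0, b0).

    The factorisation q(mu, tau) = q(mu) q(tau) is optimised by coordinate
    ascent, which raises a lower bound on the marginal likelihood each step.
    """
    x = np.asarray(x, dtype=float)
    N, xbar = len(x), x.mean()
    E_tau = a0 / b0                                     # initial guess for E[tau]
    mu_N = (lambda0 * mu0 + N * xbar) / (lambda0 + N)   # fixed across iterations
    for _ in range(n_iter):
        lambda_N = (lambda0 + N) * E_tau                # q(mu) = N(mu_N, 1/lambda_N)
        a_N = a0 + (N + 1) / 2                          # q(tau) = Gamma(a_N, b_N)
        E_sq = np.sum((x - mu_N) ** 2) + N / lambda_N
        E_prior = lambda0 * ((mu_N - mu0) ** 2 + 1 / lambda_N)
        b_N = b0 + 0.5 * (E_sq + E_prior)
        E_tau = a_N / b_N
    return mu_N, lambda_N, a_N, b_N
```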

1,930 citations

Journal Article
TL;DR: In this paper, Markov chain Monte Carlo sampling methods are exploited to provide a unified, practical likelihood-based framework for the analysis of stochastic volatility models, and a highly effective method is developed that samples all the unobserved volatilities at once using an approximate offset mixture model, followed by an importance reweighting procedure.
Abstract: In this paper, Markov chain Monte Carlo sampling methods are exploited to provide a unified, practical likelihood-based framework for the analysis of stochastic volatility models. A highly effective method is developed that samples all the unobserved volatilities at once using an approximating offset mixture model, followed by an importance reweighting procedure. This approach is compared with several alternative methods using real data. The paper also develops simulation-based methods for filtering, likelihood evaluation and model failure diagnostics. The issue of model choice using non-nested likelihood ratios and Bayes factors is also investigated. These methods are used to compare the fit of stochastic volatility and GARCH models. All the procedures are illustrated in detail.
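A small sketch of the stochastic volatility model and of the log-squared-returns linearisation that the offset mixture approximation targets. Only simulation and the transform are shown; the mixture components, Kalman-filter-based joint sampling of the volatilities, and reweighting steps of the paper are not reproduced, and the parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_sv(T=1000, mu=-1.0, phi=0.97, sigma_eta=0.15, rng=None):
    """Simulate a basic stochastic volatility model:
        y_t = exp(h_t / 2) * eps_t,                eps_t ~ N(0, 1)
        h_t = mu + phi * (h_{t-1} - mu) + eta_t,   eta_t ~ N(0, sigma_eta^2)
    """
    rng = np.random.default_rng() if rng is None else rng
    h = np.empty(T)
    h[0] = mu + rng.normal(0, sigma_eta / np.sqrt(1 - phi ** 2))  # stationary start
    for t in range(1, T):
        h[t] = mu + phi * (h[t - 1] - mu) + rng.normal(0, sigma_eta)
    y = np.exp(h / 2) * rng.normal(size=T)
    return y, h

# The linearisation behind the offset mixture idea: taking logs of y_t^2 gives
#     log(y_t^2) = h_t + log(eps_t^2),
# a linear state-space model with a non-Gaussian error; approximating
# log(eps_t^2) by a finite mixture of normals lets all volatilities be drawn
# jointly with standard Gaussian state-space recursions.
y, h = simulate_sv()
y_star = np.log(y ** 2 + 1e-8)   # small offset guards against log(0)
```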

1,892 citations

Journal Article
TL;DR: A general alternating expectation–conditional maximization (AECM) algorithm is formulated that couples flexible data augmentation schemes with model reduction schemes to achieve efficient computations, showing the potential for a dramatic reduction in computational time with little increase in human effort.
Abstract: Celebrating the 20th anniversary of the presentation of the paper by Dempster, Laird and Rubin which popularized the EM algorithm, we investigate, after a brief historical account, strategies that aim to make the EM algorithm converge faster while maintaining its simplicity and stability (e.g. automatic monotone convergence in likelihood). First we introduce the idea of a ‘working parameter’ to facilitate the search for efficient data augmentation schemes and thus fast EM implementations. Second, summarizing various recent extensions of the EM algorithm, we formulate a general alternating expectation–conditional maximization algorithm AECM that couples flexible data augmentation schemes with model reduction schemes to achieve efficient computations. We illustrate these methods using multivariate t-models with known or unknown degrees of freedom and Poisson models for image reconstruction. We show, through both empirical and theoretical evidence, the potential for a dramatic reduction in computational time with little increase in human effort. We also discuss the intrinsic connection between EM-type algorithms and the Gibbs sampler, and the possibility of using the techniques presented here to speed up the latter. The main conclusion of the paper is that, with the help of statistical considerations, it is possible to construct algorithms that are simple, stable and fast.
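For concreteness, here is the plain EM fit of a multivariate t-model with known degrees of freedom, the kind of data-augmentation scheme the paper sets out to accelerate. It is the baseline algorithm rather than AECM itself, and the defaults are illustrative.

```python
import numpy as np

def em_multivariate_t(X, nu=4.0, n_iter=100, tol=1e-8):
    """Plain EM for a multivariate t-distribution with known degrees of freedom nu.

    Data augmentation: x_i | u_i ~ N(mu, Sigma / u_i), u_i ~ Gamma(nu/2, nu/2).
    E-step computes the weights E[u_i | x_i]; M-step is a weighted mean and
    covariance update. X is an (n, p) array with p >= 2.
    """
    n, p = X.shape
    mu = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(Sigma), diff)  # Mahalanobis^2
        w = (nu + p) / (nu + d2)                       # E-step: E[u_i | x_i]
        mu_new = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu_new
        Sigma_new = (w[:, None] * diff).T @ diff / n   # M-step (ML update)
        converged = np.abs(mu_new - mu).max() < tol
        mu, Sigma = mu_new, Sigma_new
        if converged:
            break
    return mu, Sigma
```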

775 citations

01 Jan 2009
TL;DR: Presents parameter estimation methods common with discrete probability distributions, which is of particular interest in text modeling, and central concepts like conjugate distributions and Bayesian networks are reviewed.
Abstract: Presents parameter estimation methods common with discrete probability distributions, which is of particular interest in text modeling. Starting with maximum likelihood, a posteriori and Bayesian estimation, central concepts like conjugate distributions and Bayesian networks are reviewed. As an application, the model of latent Dirichlet allocation (LDA) is explained in detail with a full derivation of an approximate inference algorithm based on Gibbs sampling, including a discussion of Dirichlet hyperparameter estimation. Finally, analysis methods of LDA models are discussed.
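The Gibbs-sampling inference mentioned above is commonly presented through the collapsed update p(z_i = k | ·) ∝ (n_dk + α)(n_kw + β)/(n_k + Vβ). A minimal sketch of such a sampler follows; the hyperparameter values and data layout are chosen for illustration, and hyperparameter estimation is omitted.

```python
import numpy as np

def lda_collapsed_gibbs(docs, K, V, alpha=0.1, beta=0.01, n_iter=200, rng=None):
    """Minimal collapsed Gibbs sampler for LDA.

    docs: list of lists of word ids in [0, V); K: number of topics.
    Returns point estimates of the document-topic (theta) and
    topic-word (phi) distributions from the final counts.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_dk = np.zeros((len(docs), K))          # topic counts per document
    n_kw = np.zeros((K, V))                  # word counts per topic
    n_k = np.zeros(K)                        # total words per topic
    z = []                                   # topic assignment for every token
    for d, doc in enumerate(docs):
        zd = rng.integers(K, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                  # remove current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k                  # add new assignment
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + K * alpha)
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + V * beta)
    return theta, phi
```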

761 citations

Journal Article
TL;DR: In this paper, the authors proposed a statistical method to estimate the ancestral origin of a locus on the basis of the composite genotypes of linked markers, and showed that this approach accurately estimates states of ancestral origin along the genome.
Abstract: Admixture mapping (also known as “mapping by admixture linkage disequilibrium,” or MALD) has been proposed as an efficient approach to localizing disease-causing variants that differ in frequency (because of either drift or selection) between two historically separated populations. Near a disease gene, patient populations descended from the recent mixing of two or more ethnic groups should have an increased probability of inheriting the alleles derived from the ethnic group that carries more disease-susceptibility alleles. The central attraction of admixture mapping is that, since gene flow has occurred recently in modern populations (e.g., in African and Hispanic Americans in the past 20 generations), it is expected that admixture-generated linkage disequilibrium should extend for many centimorgans. High-resolution marker sets are now becoming available to test this approach, but progress will require (a) computational methods to infer ancestral origin at each point in the genome and (b) empirical characterization of the general properties of linkage disequilibrium due to admixture. Here we describe statistical methods to estimate the ancestral origin of a locus on the basis of the composite genotypes of linked markers, and we show that this approach accurately estimates states of ancestral origin along the genome. We apply this approach to show that strong admixture linkage disequilibrium extends, on average, for 17 cM in African Americans. Finally, we present power calculations under varying models of disease risk, sample size, and proportions of ancestry. Studying ∼2,500 markers in ∼2,500 patients should provide power to detect many regions contributing to common disease. A particularly important result is that the power of an admixture mapping study to detect a locus will be nearly the same for a wide range of mixture scenarios: the mixture proportion should be 10%–90% from both ancestral populations.
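As a toy illustration of inferring ancestral origin along a chromosome from linked markers, the sketch below uses a two-state hidden Markov model with forward-backward posteriors. The constant switch probability and the per-marker genotype log-likelihoods are simplifying assumptions and do not reproduce the paper's statistical method.

```python
import numpy as np

def local_ancestry_posterior(obs_loglik, switch_prob=0.01):
    """Forward-backward posteriors for a toy 2-ancestry hidden Markov model.

    obs_loglik: (L, 2) array of log-likelihoods of each marker genotype under
                each ancestral origin (computed elsewhere from allele frequencies).
    switch_prob: per-interval probability of switching ancestry; in reality this
                 depends on genetic distance and generations since admixture.
    Returns an (L, 2) array of posterior ancestry probabilities per marker.
    """
    L = obs_loglik.shape[0]
    lik = np.exp(obs_loglik - obs_loglik.max(axis=1, keepdims=True))  # scaled likelihoods
    A = np.array([[1 - switch_prob, switch_prob],
                  [switch_prob, 1 - switch_prob]])                    # transition matrix
    fwd = np.zeros((L, 2)); bwd = np.zeros((L, 2))
    fwd[0] = 0.5 * lik[0]; fwd[0] /= fwd[0].sum()                      # uniform prior
    for t in range(1, L):                                              # forward pass
        fwd[t] = lik[t] * (fwd[t - 1] @ A)
        fwd[t] /= fwd[t].sum()
    bwd[-1] = 1.0
    for t in range(L - 2, -1, -1):                                     # backward pass
        bwd[t] = A @ (lik[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    post = fwd * bwd
    return post / post.sum(axis=1, keepdims=True)
```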

476 citations