
Showing papers on "Variable-order Bayesian network published in 2006"


Book
08 Aug 2006
TL;DR: This book should help newcomers to the field to understand how finite mixture and Markov switching models are formulated, what structures they imply on the data, what they could be used for, and how they are estimated.
Abstract: WINNER OF THE 2007 DEGROOT PRIZE! The prominence of finite mixture modelling is greater than ever. Many important statistical topics like clustering data, outlier treatment, or dealing with unobserved heterogeneity involve finite mixture models in some way or other. The area of potential applications goes beyond simple data analysis and extends to regression analysis and to non-linear time series analysis using Markov switching models. In the more than one hundred years since Karl Pearson showed in 1894 how to estimate the five parameters of a mixture of two normal distributions using the method of moments, statistical inference for finite mixture models has been a challenge to everybody who deals with them. In the past ten years, very powerful computational tools have emerged for dealing with these models, which combine a Bayesian approach with recent Monte Carlo simulation techniques based on Markov chains. This book reviews these techniques and covers the most recent advances in the field, among them bridge sampling techniques and reversible jump Markov chain Monte Carlo methods. It is the first time that the Bayesian perspective on finite mixture modelling is systematically presented in book form. It is argued that the Bayesian approach provides much insight in this context and is easily implemented in practice. Although the main focus is on Bayesian inference, the author reviews several frequentist techniques, especially for selecting the number of components of a finite mixture model, and discusses some of their shortcomings compared to the Bayesian approach. The aim of this book is to impart the finite mixture and Markov switching approach to statistical modelling to a wide-ranging community. This includes not only statisticians, but also biologists, economists, engineers, financial agents, market researchers, medical researchers and any other frequent users of statistical models. This book should help newcomers to the field to understand how finite mixture and Markov switching models are formulated, what structures they imply on the data, what they could be used for, and how they are estimated. Researchers familiar with the subject will also profit from reading this book. The presentation is rather informal without abandoning mathematical correctness. Prior knowledge of Bayesian inference and Monte Carlo simulation is helpful but not required.
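As a concrete illustration of the Bayesian mixture machinery the book covers, the following is a minimal sketch (not taken from the book) of a Gibbs sampler for a two-component normal mixture with known component variance; the simulated data, priors, and variable names are illustrative assumptions.

```python
# Minimal Gibbs sampler for a two-component normal mixture (known variance);
# an illustrative sketch of Bayesian mixture estimation via MCMC.
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: mixture of N(-2, 1) and N(3, 1)
y = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 100)])
n, K, sigma2 = len(y), 2, 1.0

# Priors: mu_k ~ N(0, 10^2), mixture weights ~ Dirichlet(1, 1)
mu0, tau2 = 0.0, 100.0
mu = rng.normal(0, 1, K)          # initial component means
w = np.full(K, 1.0 / K)           # initial weights

for it in range(2000):
    # 1) Sample component assignments z_i given (mu, w); K = 2 here
    logp = np.log(w) - 0.5 * (y[:, None] - mu[None, :]) ** 2 / sigma2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    z = (rng.random(n) < p[:, 1]).astype(int)

    # 2) Sample means given assignments (conjugate normal update)
    for k in range(K):
        yk = y[z == k]
        prec = 1.0 / tau2 + len(yk) / sigma2
        mean = (mu0 / tau2 + yk.sum() / sigma2) / prec
        mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))

    # 3) Sample weights from the Dirichlet full conditional
    counts = np.bincount(z, minlength=K)
    w = rng.dirichlet(1.0 + counts)

# Last draw of the chain (a full analysis would keep and summarize all draws)
print("posterior draw of means:", np.sort(mu), "weights:", w)
```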

1,642 citations


Journal ArticleDOI
TL;DR: Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance.
Abstract: Comparison of the performance and accuracy of different inference methods, such as maximum likelihood (ML) and Bayesian inference, is difficult because the inference methods are implemented in different programs, often written by different authors. Both methods were implemented in the program MIGRATE, which estimates population genetic parameters, such as population sizes and migration rates, using coalescence theory. Both inference methods use the same Markov chain Monte Carlo algorithm and differ from each other in only two aspects: parameter proposal distribution and maximization of the likelihood function. Using simulated datasets, the Bayesian method generally fares better than the ML approach in accuracy and coverage, although for some values the two approaches are equal in performance. Motivation: The Markov chain Monte Carlo-based ML framework can fail on sparse data and can deliver non-conservative support intervals. A Bayesian framework with an appropriate prior distribution is able to remedy some of these problems. Results: The program MIGRATE was extended to allow not only for maximum likelihood (ML) estimation of population genetics parameters but also for using a Bayesian framework. Comparisons between the Bayesian approach and the ML approach are facilitated because both modes estimate the same parameters under the same population model and assumptions. Availability: The program is available from http://popgen.csit.fsu.edu/ Contact: beerli@csit.fsu.edu

811 citations


Journal ArticleDOI
TL;DR: Introduction to Linear Models and Statistical Inference is not meant to compete with these texts—rather, its audience is primarily those taking a statistics course within a mathematics department.
Abstract: of the simple linear regression model. Multiple linear regression for two variables is discussed in Chapter 8, and that for more than two variables is covered in Chapter 9. Chapter 10, on model building, is perhaps the book’s strongest chapter. The authors provide one of the most intuitive discussions on variable transformations that I have seen. Nice presentations of indicator variables, variable selection, and influence diagnostics are also provided. The final chapter covers a wide variety of topics, including analysis of variance models, logistic regression, and robust regression. The coverage of regression is not matrix-based, but optional linear algebra sections at the end of each chapter are useful for one wishing to use matrices. In general, the writing is clear and conceptual. A good number of exercises (about 20 on average) at the end of each chapter are provided. The exercises emphasize derivations and computations. It is difficult to name some comparison texts. Certainly, the text by Ott and Longnecker (2001) would be more suitable for a statistical methods course for an interdisciplinary audience. The regression texts of Montgomery, Peck, and Vining (2001) and Mendenhall and Sincich (2003) are more comprehensive in the regression treatment than the reviewed text. However, Introduction to Linear Models and Statistical Inference is not meant to compete with these texts—rather, its audience is primarily those taking a statistics course within a mathematics department.

802 citations


Journal ArticleDOI
TL;DR: This tutorial introduces the more general reader to the Bayesian approach to quantifying, analysing and reducing uncertainty in the application of complex process models.

735 citations


01 Jan 2006
TL;DR: This paper shows that most of the proposed discrete time models — including the Boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are special cases of a general class of models called Dynamic Bayesian Networks (DBNs).
Abstract: Recently, there has been much interest in reverse engineering genetic networks from time series data. In this paper, we show that most of the proposed discrete time models — including the boolean network model [Kau93, SS96], the linear model of D’haeseleer et al. [DWFS99], and the nonlinear model of Weaver et al. [WWS99] — are all special cases of a general class of models called Dynamic Bayesian Networks (DBNs). The advantages of DBNs include the ability to model stochasticity, to incorporate prior knowledge, and to handle hidden variables and missing data in a principled way. This paper provides a review of techniques for learning DBNs.
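To make the "special case" claim concrete, here is a small sketch (with illustrative variable names and an assumed update rule) showing how a deterministic Boolean update can be written as a degenerate 0/1 conditional probability table in a two-slice DBN and then softened with a noise parameter.

```python
# Sketch: a Boolean network update rule expressed as a (degenerate) DBN
# transition CPT, then softened with a noise parameter; names are illustrative.
import numpy as np

rng = np.random.default_rng(1)

# Boolean rule for gene X1 at time t+1: X1' = X2 AND (NOT X3)
def boolean_rule(x2, x3):
    return int(x2 and not x3)

# DBN transition CPT: P(X1'=1 | X2, X3). A Boolean network is the special
# case where every entry is exactly 0 or 1.
noise = 0.05  # set to 0.0 to recover the deterministic Boolean network
cpt = np.zeros((2, 2))
for x2 in (0, 1):
    for x3 in (0, 1):
        det = boolean_rule(x2, x3)
        cpt[x2, x3] = (1 - noise) if det == 1 else noise

# Simulate one variable's trajectory given trajectories of its parents
T = 10
x2_path = rng.integers(0, 2, T)
x3_path = rng.integers(0, 2, T)
x1_path = [rng.random() < cpt[x2_path[t], x3_path[t]] for t in range(T)]
print(np.array(x1_path, dtype=int))
```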

435 citations


Proceedings Article
13 Jul 2006
TL;DR: This paper studies the problem of learning the best Bayesian network structure with respect to a decomposable score such as BDe, BIC or AIC; the problem is known to be NP-hard, so solving it quickly becomes infeasible as the number of variables increases.
Abstract: We study the problem of learning the best Bayesian network structure with respect to a decomposable score such as BDe, BIC or AIC. This problem is known to be NP-hard, which means that solving it becomes quickly infeasible as the number of variables increases. Nevertheless, in this paper we show that it is possible to learn the best Bayesian network structure with over 30 variables, which covers many practically interesting cases. Our algorithm is less complicated and more efficient than the techniques presented earlier. It can be easily parallelized, and offers a possibility for efficient exploration of the best networks consistent with different variable orderings. In the experimental part of the paper we compare the performance of the algorithm to the previous state-of-the-art algorithm. Free source-code and an online-demo can be found at http://b-course.hiit.fi/bene.
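The following is a simplified sketch of the dynamic-programming idea behind exact structure learning with a decomposable score (BIC here, with a "best sink" recursion over variable subsets). It is not the authors' implementation, and in this naive form it is only feasible for a handful of binary variables; the toy data are an assumption.

```python
# Simplified sketch of exact Bayesian-network structure learning with a
# decomposable score (BIC), using dynamic programming over variable subsets.
import numpy as np
from itertools import combinations, chain

rng = np.random.default_rng(2)
N, n = 500, 4
# Toy binary data with some dependence: X3 depends on X0 and X1
X = rng.integers(0, 2, size=(N, n))
X[:, 3] = (X[:, 0] ^ X[:, 1]) ^ (rng.random(N) < 0.1)

def subsets(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def bic_local(v, parents):
    """BIC score of variable v given a parent set (binary variables)."""
    cols = X[:, list(parents)] if parents else np.zeros((N, 0), dtype=int)
    configs = {}
    for i in range(N):                       # group rows by parent configuration
        configs.setdefault(tuple(cols[i]), []).append(X[i, v])
    ll = 0.0
    for vals in configs.values():
        m = len(vals)
        for x in (0, 1):
            c = vals.count(x)
            if c > 0:
                ll += c * np.log(c / m)
    n_params = (2 - 1) * (2 ** len(parents))
    return ll - 0.5 * np.log(N) * n_params

variables = range(n)
# Best parent set (and score) for each variable from each candidate set
best_ps = {}
for v in variables:
    others = [u for u in variables if u != v]
    for cand in subsets(others):
        cand = frozenset(cand)
        best = max(((bic_local(v, p), p) for p in subsets(cand)),
                   key=lambda t: t[0])
        best_ps[(v, cand)] = best

# Best network over each subset of variables via the "best sink" recursion
best_net = {frozenset(): (0.0, {})}
for S in sorted((frozenset(s) for s in subsets(variables)), key=len):
    if not S:
        continue
    candidates = []
    for sink in S:
        rest = S - {sink}
        score, parents = best_ps[(sink, rest)]
        prev_score, prev_parents = best_net[rest]
        candidates.append((prev_score + score, {**prev_parents, sink: parents}))
    best_net[S] = max(candidates, key=lambda t: t[0])

score, structure = best_net[frozenset(variables)]
print("best BIC:", round(score, 1))
print("parents:", {v: tuple(p) for v, p in structure.items()})
```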

378 citations


Book
27 Jul 2006
TL;DR: This book presents Bayesian inference and decision theory, covering utility, priors and Bayesian robustness, large-sample methods, hypothesis testing and model selection, Bayesian computation, high-dimensional problems, and applications.
Abstract: Contents: Statistical Preliminaries; Bayesian Inference and Decision Theory; Utility, Prior, and Bayesian Robustness; Large Sample Methods; Choice of Priors for Low-dimensional Parameters; Hypothesis Testing and Model Selection; Bayesian Computations; Some Common Problems in Inference; High-dimensional Problems; Some Applications.

306 citations


Book ChapterDOI
01 Jan 2006
TL;DR: The general Bayesian approach to causal discovery is described and approximation methods for missing data and hidden variables are reviewed, and differences between the Bayesian and constraint-based methods are illustrated using artificial and real examples.
Abstract: We examine the Bayesian approach to the discovery of causal DAG models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov condition, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures—both quantitative and qualitative—can be made. Three, information from several models can be combined to make better inferences and to better account for modeling uncertainty. In addition to describing the general Bayesian approach to causal discovery, we review approximation methods for missing data and hidden variables, and illustrate differences between the Bayesian and constraint-based methods using artificial and real examples.

292 citations


Proceedings Article
04 Dec 2006
TL;DR: This paper provides a computationally efficient method for learning Markov network structure from data based on the use of L1 regularization on the weights of the log-linear model, which achieves considerably higher generalization performance than the more standard L2-based method (a Gaussian parameter prior) or pure maximum-likelihood learning.
Abstract: Markov networks are commonly used in a wide variety of applications, ranging from computer vision, to natural language, to computational biology. In most current applications, even those that rely heavily on learned models, the structure of the Markov network is constructed by hand, due to the lack of effective algorithms for learning Markov network structure from data. In this paper, we provide a computationally efficient method for learning Markov network structure from data. Our method is based on the use of L1 regularization on the weights of the log-linear model, which has the effect of biasing the model towards solutions where many of the parameters are zero. This formulation converts the Markov network learning problem into a convex optimization problem in a continuous space, which can be solved using efficient gradient methods. A key issue in this setting is the (unavoidable) use of approximate inference, which can lead to errors in the gradient computation when the network structure is dense. Thus, we explore the use of different feature introduction schemes and compare their performance. We provide results for our method on synthetic data, and on two real world data sets: pixel values in the MNIST data, and genetic sequence variations in the human HapMap data. We show that our L1-based method achieves considerably higher generalization performance than the more standard L2-based method (a Gaussian parameter prior) or pure maximum-likelihood learning. We also show that we can learn MRF network structure at a computational cost that is not much greater than learning parameters alone, demonstrating the existence of a feasible method for this important problem.
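The sketch below illustrates a related L1-based idea rather than the paper's exact formulation: per-node logistic regression with an L1 penalty (pseudolikelihood-style neighborhood selection), where nonzero coefficients suggest edges of a binary Markov network. The data and the regularization strength are illustrative assumptions.

```python
# Sketch of L1-regularized structure recovery for a binary Markov network via
# per-node logistic regression (neighborhood selection), a relative of the
# paper's L1 log-linear formulation. Toy data with a chain of dependencies.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
N, d = 2000, 5

# Toy binary data with a chain of dependencies 0-1-2-3-4
X = np.zeros((N, d), dtype=int)
X[:, 0] = rng.integers(0, 2, N)
for j in range(1, d):
    flip = rng.random(N) < 0.15
    X[:, j] = np.where(flip, 1 - X[:, j - 1], X[:, j - 1])

# For each node, L1-penalized logistic regression on all other nodes;
# nonzero coefficients indicate candidate neighbors (edges).
edges = set()
for j in range(d):
    others = [k for k in range(d) if k != j]
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X[:, others], X[:, j])
    for k, coef in zip(others, clf.coef_[0]):
        if abs(coef) > 1e-6:
            edges.add(tuple(sorted((j, k))))

print("recovered edges:", sorted(edges))
```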

269 citations


Journal ArticleDOI
TL;DR: The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors.
Abstract: This article presents a simulation-based method designed to establish the computational correctness of software developed to fit a specific Bayesian model, capitalizing on properties of Bayesian posterior distributions. We illustrate the validation technique with two examples. The validation method is shown to find errors in software when they exist and, moreover, the validation output can be informative about the nature and location of such errors. We also compare our method with that of an earlier approach.
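A minimal sketch of the validation idea, assuming a conjugate normal-normal model so that the "software under test" can be an exact posterior sampler: draw the parameter from its prior, simulate data, draw from the posterior, and check that the true parameter's posterior quantiles are uniform across replications.

```python
# Sketch of simulation-based validation of Bayesian software: if the sampler
# is correct, the quantile of the true parameter within its posterior draws
# is uniform across replications. Conjugate normal-normal model assumed.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu0, tau, sigma, n = 0.0, 2.0, 1.0, 20      # prior mean/sd, data sd, sample size

def posterior_sampler(y, size):
    """The 'software under test': exact draws from p(mu | y)."""
    prec = 1 / tau**2 + len(y) / sigma**2
    mean = (mu0 / tau**2 + y.sum() / sigma**2) / prec
    return rng.normal(mean, np.sqrt(1 / prec), size)

quantiles = []
for _ in range(500):                         # replications
    mu_true = rng.normal(mu0, tau)           # draw parameter from the prior
    y = rng.normal(mu_true, sigma, n)        # simulate data given it
    draws = posterior_sampler(y, 1000)       # posterior draws from the software
    quantiles.append(np.mean(draws < mu_true))

# Under a correct implementation these quantiles are ~Uniform(0, 1)
print("KS test vs uniform: p =", stats.kstest(quantiles, "uniform").pvalue)
```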

262 citations


Journal ArticleDOI
TL;DR: This review provides an introduction to the growing literature in Bayesian bioinformatics, with particular emphasis on recent developments relevant to computational systems biology.
Abstract: Bayesian methods are valuable, inter alia, whenever there is a need to extract information from data that are uncertain or subject to any kind of error or noise (including measurement error and experimental error, as well as noise or random variation intrinsic to the process of interest). Bayesian methods offer a number of advantages over more conventional statistical techniques that make them particularly appropriate for complex data. It is therefore no surprise that Bayesian methods are becoming more widely used in the fields of genetics, genomics, bioinformatics and computational systems biology, where making sense of complex noisy data is the norm. This review provides an introduction to the growing literature in this area, with particular emphasis on recent developments in Bayesian bioinformatics relevant to computational systems biology.

Journal ArticleDOI
TL;DR: The results suggest that the Bayesian network model can predict maintainability more accurately than the regression-based models for one system, and almost as accurately as the best regression- based model for the other system.
Abstract: As the number of object-oriented software systems increases, it becomes more important for organizations to maintain those systems effectively. However, currently only a small number of maintainability prediction models are available for object-oriented systems. This paper presents a Bayesian network maintainability prediction model for an object-oriented software system. The model is constructed using object-oriented metric data in Li and Henry's datasets, which were collected from two different object-oriented systems. Prediction accuracy of the model is evaluated and compared with commonly used regression-based models. The results suggest that the Bayesian network model can predict maintainability more accurately than the regression-based models for one system, and almost as accurately as the best regression-based model for the other system.

Journal ArticleDOI
TL;DR: Several Bayesian multivariate spatial models are considered for estimating the crash rates from different kinds of crashes and a general theorem for each case is proved to ensure posterior propriety under noninformative priors.

Journal ArticleDOI
TL;DR: This paper reviews Bayesian methods that have been developed in recent years to estimate and evaluate dynamic stochastic general equilibrium (DSGE) models and applies these methods to data generated from correctly specified and misspecified linearized DSGE models.
Abstract: This paper reviews Bayesian methods that have been developed in recent years to estimate and evaluate dynamic stochastic general equilibrium (DSGE) models. We consider the estimation of linearized DSGE models, the evaluation of models based on Bayesian model checking, posterior odds comparisons, and comparisons to vector autoregressions, as well as the nonlinear estimation based on a second-order accurate model solution. These methods are applied to data generated from correctly specified and misspecified linearized DSGE models, and a DSGE model that was solved with a second-order perturbation method.

01 Jan 2006
TL;DR: The hierarchical Bayesian optimization algorithm (hBOA) as discussed by the authors solves nearly decomposable and hierarchical optimization problems scalably by combining concepts from evolutionary computation, machine learning and statistics.
Abstract: The hierarchical Bayesian optimization algorithm (hBOA) solves nearly decomposable and hierarchical optimization problems scalably by combining concepts from evolutionary computation, machine learning and statistics. Since many complex real-world systems are nearly decomposable and hierarchical, hBOA is expected to provide scalable solutions for many complex real-world problems. This chapter describes hBOA and its predecessor, the Bayesian optimization algorithm (BOA), and outlines some of the most important theoretical and empirical results in this line of research.

Dissertation
Malte Kuß
07 Apr 2006
TL;DR: Gaussian process models constitute a class of probabilistic statistical models in which a Gaussian process is used to describe the Bayesian a priori uncertainty about a latent function, and it will be shown how this can be used to estimate value functions.
Abstract: Gaussian process models constitute a class of probabilistic statistical models in which a Gaussian process (GP) is used to describe the Bayesian a priori uncertainty about a latent function. After a brief introduction to Bayesian analysis, Chapter 3 describes the general construction of GP models with the conjugate model for regression as a special case (O'Hagan 1978). Furthermore, it will be discussed how GPs can be interpreted as priors over functions and what beliefs are implicitly represented by this. The conceptual clarity of the Bayesian approach is often in contrast with the practical difficulties that result from its analytically intractable computations. Therefore, approximation techniques are of central importance for applied Bayesian analysis. Chapter 4 describes Laplace's method, the Expectation Propagation approximation, and Markov chain Monte Carlo sampling for approximate inference in GP models. The most common and successful application of GP models is in regression problems where the noise is assumed to be homoscedastic and distributed according to a normal distribution. In practical data analysis this assumption is often inappropriate and inference is sensitive to the occurrence of more extreme errors (so-called outliers). Chapter 5 proposes several variants of GP models for robust regression and describes how Bayesian inference can be approximated in each. Experiments on several data sets are presented in which the proposed models are compared with respect to their predictive performance and practical applicability. Gaussian process priors can also be used to define flexible, probabilistic classification models. Again, exact Bayesian inference is analytically intractable and various approximation techniques have been proposed, but no clear picture has yet emerged as to when and why which algorithm should be preferred. Chapter 6 presents a detailed examination of the model, focusing on the question of which approximation technique is most appropriate by investigating the structure of the posterior distribution. An experimental study is presented which corroborates the theoretical insights. Reinforcement learning deals with the problem of how an agent can optimise its behaviour in a sequential decision process such that its utility over time is maximised. Chapter 7 addresses applications of GPs for model-based reinforcement learning in continuous domains. If the environment's response to the agent's actions can be predicted using GP regression models, probabilistic planning and an approximate policy iteration algorithm can be implemented. A core concept in reinforcement learning is the value function, which describes the long-term strategic value of a state. Using GP models we are able to solve an approximate continuous equivalent of the Bellman equations, and it will be shown how this can be used to estimate value functions.
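A minimal numpy sketch of the conjugate GP regression case the thesis starts from (squared-exponential kernel, Gaussian noise); the hyperparameters are fixed by hand rather than learned, and the data are synthetic.

```python
# Minimal GP regression sketch (squared-exponential kernel, Gaussian noise),
# the conjugate case; hyperparameters fixed by hand, data synthetic.
import numpy as np

def rbf(a, b, lengthscale=1.0, variance=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(5)
X = np.sort(rng.uniform(-4, 4, 25))
y = np.sin(X) + rng.normal(0, 0.2, X.size)
Xs = np.linspace(-5, 5, 100)
noise = 0.2**2

# Posterior predictive mean and variance of the latent function at Xs
K = rbf(X, X) + noise * np.eye(X.size)
Ks = rbf(X, Xs)
Kss = rbf(Xs, Xs)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = Ks.T @ alpha
v = np.linalg.solve(L, Ks)
var = np.diag(Kss - v.T @ v)
print("predictive mean at x=0:", mean[np.argmin(np.abs(Xs))].round(3))
```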

01 Jan 2006
TL;DR: This study introduces a novel approach based on recent developments in the estimation of genetic population structure that combines analytical integration with stochastic optimization to identify stock mixtures.
Abstract: Molecular markers have been demonstrated to be useful for the estimation of stock mixture proportions where the origin of individuals is determined from baseline samples. Bayesian statistical methods are widely recognized as providing a preferable strategy for such analyses. In general, Bayesian estimation is based on standard latent class models using data augmentation through Markov chain Monte Carlo techniques. In this study, we introduce a novel approach based on recent developments in the estimation of genetic population structure. Our strategy combines analytical integration with stochastic optimization to identify stock mixtures. An important enhancement over previous methods is the possibility of appropriately handling data where only partial baseline sample information is available. We address the potential use of nonmolecular, auxiliary biological information in our Bayesian model.

Journal ArticleDOI
01 Apr 2006
TL;DR: An information fusion framework based on dynamic Bayesian networks is proposed to provide active, dynamic, purposive and sufficing information fusion in order to arrive at a reliable conclusion with reasonable time and limited resources.
Abstract: Many information fusion applications are often characterized by a high degree of complexity because: 1) data are often acquired from sensors of different modalities and with different degrees of uncertainty; 2) decisions must be made efficiently; and 3) the world situation evolves over time. To address these issues, we propose an information fusion framework based on dynamic Bayesian networks to provide active, dynamic, purposive and sufficing information fusion in order to arrive at a reliable conclusion with reasonable time and limited resources. The proposed framework is suited to applications where the decision must be made efficiently from dynamically available information of diverse and disparate sources.

Journal ArticleDOI
TL;DR: Experiments show that the assisted cognition information technology system is able to accurately extract and label places, predict the goals of a person, and recognize situations in which the user makes mistakes, such as taking a wrong bus.
Abstract: In this article we discuss an assisted cognition information technology system that can learn personal maps customized for each user and infer his daily activities and movements from raw GPS data. The system uses discriminative and generative models for different parts of this task. A discriminative relational Markov network is used to extract significant places and label them; a generative dynamic Bayesian network is used to learn transportation routines, and infer goals and potential user errors in real time. We focus on the basic structures of the models and briefly discuss the inference and learning techniques. Experiments show that our system is able to accurately extract and label places, predict the goals of a person, and recognize situations in which the user makes mistakes, such as taking a wrong bus.

Book
01 Jun 2006
TL;DR: This book presents hierarchical Bayesian modelling for environmental applications, with chapters spanning experimental, spatial, and spatio-temporal settings, including models for species distributions, hurricane damage and tropical tree survival, population spread, and the distribution of extremes.
Abstract: Preface. Part I, Introduction to Hierarchical Modeling: 1. Elements of hierarchical Bayesian inference; 2. Bayesian hierarchical models in geographical genetics. Part II, Hierarchical Models in Experimental Settings: 3. Synthesizing ecological experiments and observational data with hierarchical Bayes; 4. Effects of global change on inflorescence production: a Bayesian hierarchical analysis. Part III, Spatial Modeling: 5. Building statistical models to analyse species distributions; 6. Implications of vulnerability to hurricane damage for long-term survival of tropical tree species: a Bayesian hierarchical analysis. Part IV, Spatio-temporal Modeling: 7. Spatial-temporal statistical modeling and prediction of environmental processes; 8. Hierarchical Bayesian spatio-temporal models for population spread; 9. Spatial models for the distribution of extremes. References. Index.

Journal ArticleDOI
TL;DR: In this article, the authors extend the literature on Bayesian model comparison for ordinary least-squares regression models to include spatial autoregressive and spatial error models, and compare models that consist of different matrices of explanatory variables.
Abstract: We extend the literature on Bayesian model comparison for ordinary least-squares regression models to include spatial autoregressive and spatial error models. Our focus is on comparing models that consist of different matrices of explanatory variables. A Markov Chain Monte Carlo model composition methodology labelled MC³ by Madigan and York (1995) is developed for two types of spatial econometric models that are frequently used in the literature. The methodology deals with cases where the number of possible models based on different combinations of candidate explanatory variables is large enough that calculation of posterior probabilities for all models is difficult or infeasible. Estimates and inferences are produced by averaging over models using the posterior model probabilities as weights, a procedure known as Bayesian model averaging. We illustrate the methods using a spatial econometric model of origin-destination population migration flows between the 48 US States and District of Columbia during the 1990 to 2000 period.
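The sketch below illustrates the model-averaging step in a simplified, non-spatial setting: candidate regressor subsets are enumerated and weighted by BIC-approximated posterior model probabilities. MC³ sampling, as used in the paper, replaces the enumeration when the model space is too large; the data are synthetic and the BIC approximation is an assumption of this sketch.

```python
# Illustrative Bayesian model averaging over candidate explanatory variables
# for an ordinary regression: posterior model probabilities approximated via
# BIC and used as weights. MC^3 sampling would replace full enumeration when
# the model space is too large.
import numpy as np
from itertools import combinations, chain

rng = np.random.default_rng(6)
N, p = 200, 4
X = rng.normal(size=(N, p))
beta_true = np.array([1.5, 0.0, -2.0, 0.0])       # only x0 and x2 matter
y = X @ beta_true + rng.normal(0, 1, N)

def bic(cols):
    Z = np.column_stack([np.ones(N)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / N
    ll = -0.5 * N * (np.log(2 * np.pi * sigma2) + 1)
    return ll - 0.5 * Z.shape[1] * np.log(N)

models = list(chain.from_iterable(combinations(range(p), r) for r in range(p + 1)))
scores = np.array([bic(m) for m in models])
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Posterior inclusion probability of each variable, averaged over models
for j in range(p):
    pip = sum(w for m, w in zip(models, weights) if j in m)
    print(f"P(x{j} included | data) ~ {pip:.2f}")
```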

Journal ArticleDOI
TL;DR: A way of obtaining a predictive distribution from recursive claims reserving models is described, including the well-known model introduced by Mack (1993), which is useful since it can be used with data sets which exhibit negative incremental amounts.
Abstract: This paper extends the methods introduced in England & Verrall (2002), and shows how predictive distributions of outstanding liabilities in general insurance can be obtained using bootstrap or Bayesian techniques for clearly defined statistical models. A general procedure for bootstrapping is described, by extending the methods introduced in England & Verrall (1999), England (2002) and Pinheiro et al. (2003). The analogous Bayesian estimation procedure is implemented using Markov-chain Monte Carlo methods, where the models are constructed as Bayesian generalised linear models using the approach described by Dellaportas & Smith (1993). In particular, this paper describes a way of obtaining a predictive distribution from recursive claims reserving models, including the well known model introduced by Mack (1993). Mack's model is useful, since it can be used with data sets which exhibit negative incremental amounts. The techniques are illustrated with examples, and the resulting predictive distributions from both the bootstrap and Bayesian methods are compared.

Journal ArticleDOI
TL;DR: The back-propagation of target values to attention allows the model to show trial-order effects, including highlighting and differences in magnitude of forward and backward blocking, which have been challenging for Bayesian learning models.
Abstract: A scheme is described for locally Bayesian parameter updating in models structured as successions of component functions. The essential idea is to back-propagate the target data to interior modules, such that an interior component's target is the input to the next component that maximizes the probability of the next component's target. Each layer then does locally Bayesian learning. The approach assumes online trial-by-trial learning. The resulting parameter updating is not globally Bayesian but can better capture human behavior. The approach is implemented for an associative learning model that first maps inputs to attentionally filtered inputs and then maps attentionally filtered inputs to outputs. The Bayesian updating allows the associative model to exhibit retrospective revaluation effects such as backward blocking and unovershadowing, which have been challenging for associative learning models. The back-propagation of target values to attention allows the model to show trial-order effects, including highlighting and differences in magnitude of forward and backward blocking, which have been challenging for Bayesian learning models.

Journal ArticleDOI
TL;DR: The mathematical background and procedure provided in this paper for developing Bayesian networks equivalent to given discrete functions can be applied to other discrete functions to develop probabilistic models.

Journal ArticleDOI
TL;DR: The ability of the system to first learn useful relationships among parameters, and then to use them to constrain the training of the Bayesian network is demonstrated, resulting in improved cross-validated accuracy of the learned model.
Abstract: The task of learning models for many real-world problems requires incorporating domain knowledge into learning algorithms, to enable accurate learning from a realistic volume of training data. This paper considers a variety of types of domain knowledge for constraining parameter estimates when learning Bayesian networks. In particular, we consider domain knowledge that constrains the values or relationships among subsets of parameters in a Bayesian network with known structure. We incorporate a wide variety of parameter constraints into learning procedures for Bayesian networks, by formulating this task as a constrained optimization problem. The assumptions made in module networks, dynamic Bayes nets and context specific independence models can be viewed as particular cases of such parameter constraints. We present closed form solutions or fast iterative algorithms for estimating parameters subject to several specific classes of parameter constraints, including equalities and inequalities among parameters, constraints on individual parameters, and constraints on sums and ratios of parameters, for discrete and continuous variables. Our methods cover learning from both frequentist and Bayesian points of view, from both complete and incomplete data. We present formal guarantees for our estimators, as well as methods for automatically learning useful parameter constraints from data. To validate our approach, we apply it to the domain of fMRI brain image analysis. Here we demonstrate the ability of our system to first learn useful relationships among parameters, and then to use them to constrain the training of the Bayesian network, resulting in improved cross-validated accuracy of the learned model. Experiments on synthetic data are also presented.
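A toy sketch of the constrained-estimation idea: two conditional probabilities are estimated by maximum likelihood subject to an inequality constraint encoding (hypothetical) domain knowledge, using scipy's constrained optimizer. The counts and the constraint are illustrative, not the paper's fMRI setting.

```python
# Toy sketch of parameter estimation under a domain-knowledge constraint:
# theta0 = P(Y=1|X=0) and theta1 = P(Y=1|X=1) estimated by maximum likelihood
# subject to theta0 <= theta1. Counts and constraint are hypothetical.
import numpy as np
from scipy.optimize import minimize

# Observed counts: (y=1, y=0) for X=0 and X=1
n1 = np.array([30, 20])   # X=0: 30 successes, 20 failures
n2 = np.array([25, 40])   # X=1: 25 successes, 40 failures

def neg_log_lik(theta):
    t0, t1 = theta
    return -(n1[0] * np.log(t0) + n1[1] * np.log(1 - t0)
             + n2[0] * np.log(t1) + n2[1] * np.log(1 - t1))

res = minimize(
    neg_log_lik,
    x0=[0.5, 0.5],
    bounds=[(1e-6, 1 - 1e-6)] * 2,
    constraints=[{"type": "ineq", "fun": lambda t: t[1] - t[0]}],  # theta1 >= theta0
    method="SLSQP",
)
print("unconstrained MLEs:", n1[0] / n1.sum(), n2[0] / n2.sum())
print("constrained MLEs:  ", res.x.round(3))
```

Because the raw counts violate the assumed constraint, the constrained estimates are pulled together, which is exactly the kind of regularizing effect such domain knowledge is meant to provide.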

Book ChapterDOI
01 Jan 2006
TL;DR: This chapter presents the principles of Bayesian forecasting, and describes recent advances in computational capabilities for applying them that have dramatically expanded the scope of applicability of the Bayesian approach.
Abstract: Bayesian forecasting is a natural product of a Bayesian approach to inference. The Bayesian approach in general requires explicit formulation of a model, and conditioning on known quantities, in order to draw inferences about unknown ones. In Bayesian forecasting, one simply takes a subset of the unknown quantities to be future values of some variables of interest. This chapter presents the principles of Bayesian forecasting, and describes recent advances in computational capabilities for applying them that have dramatically expanded the scope of applicability of the Bayesian approach. It describes historical developments and the analytic compromises that were necessary prior to recent developments, the application of the new procedures in a variety of examples, and reports on two long-term Bayesian forecasting exercises.

Journal ArticleDOI
TL;DR: In this article, efficient importance sampling (EIS) is used to perform a classical and Bayesian analysis of univariate and multivariate stochastic volatility (SV) models for financial return series.
Abstract: In this paper, efficient importance sampling (EIS) is used to perform a classical and Bayesian analysis of univariate and multivariate stochastic volatility (SV) models for financial return series. EIS provides a highly generic and very accurate procedure for the Monte Carlo (MC) evaluation of high-dimensional interdependent integrals. It can be used to carry out ML-estimation of SV models as well as simulation smoothing where the latent volatilities are sampled at once. Based on this EIS simulation smoother, a Bayesian Markov chain Monte Carlo (MCMC) posterior analysis of the parameters of SV models can be performed.
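A toy illustration of the importance-sampling idea underlying EIS: estimating the likelihood contribution of a single stochastic-volatility observation with a plain normal importance density. EIS itself constructs the importance density through auxiliary regressions, which is omitted here; all numbers are illustrative.

```python
# Toy importance sampling for a stochastic-volatility-type integral:
# likelihood of one return y with latent log-volatility h, where
#   y | h ~ N(0, exp(h)),  h ~ N(0, sigma_h^2).
# A plain normal proposal is used just to show the MC estimator.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
y, sigma_h = 0.8, 0.5
S = 100_000

# Importance density: normal roughly centered on the integrand's mode
prop_loc, prop_scale = 0.0, 0.7
h = rng.normal(prop_loc, prop_scale, S)

log_w = (stats.norm.logpdf(y, scale=np.exp(h / 2))        # p(y | h)
         + stats.norm.logpdf(h, scale=sigma_h)            # p(h)
         - stats.norm.logpdf(h, loc=prop_loc, scale=prop_scale))  # / proposal
likelihood = np.exp(log_w).mean()

# Check against brute-force numerical integration on a grid
grid = np.linspace(-5, 5, 4001)
dx = grid[1] - grid[0]
integrand = stats.norm.pdf(y, scale=np.exp(grid / 2)) * stats.norm.pdf(grid, scale=sigma_h)
print("IS estimate:", round(float(likelihood), 5),
      " quadrature:", round(float(integrand.sum() * dx), 5))
```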

Journal ArticleDOI
TL;DR: A naïve Bayesian classifier is implemented which models continuous numerical data using a Gaussian distribution and it is demonstrated that this enhanced performance, upon comparison with other implementations, is independent of the descriptor sets chosen.
Abstract: We have implemented a naive Bayesian classifier which models continuous numerical data using a Gaussian distribution. Several cases of interest in the area of absorption, distribution, metabolism, and excretion prediction are presented which demonstrate that this approach is superior to the implementation of naive Bayesian classifiers in which continuous chemical descriptors are modeled as binary data. We demonstrate that this enhanced performance, upon comparison with other implementations, is independent of the descriptor sets chosen. We also compare the performance of three implementations of naive Bayesian classifiers with other previously described models.
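A small sketch of the comparison the paper describes, on synthetic stand-in data rather than real ADME descriptors: a Gaussian naive Bayes on continuous descriptors versus a Bernoulli naive Bayes on the same descriptors after binarization.

```python
# Sketch contrasting Gaussian naive Bayes on continuous descriptors with
# Bernoulli naive Bayes on the same descriptors after median binarization.
# Synthetic stand-in data rather than real ADME descriptors.
import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
N = 400
y = rng.integers(0, 2, N)
# Two continuous "descriptors" whose means shift with the class label
X = rng.normal(loc=np.column_stack([y * 1.0, y * 0.5]), scale=1.0, size=(N, 2))

gnb_acc = cross_val_score(GaussianNB(), X, y, cv=5).mean()
Xbin = (X > np.median(X, axis=0)).astype(int)     # per-column median cut
bnb_acc = cross_val_score(BernoulliNB(), Xbin, y, cv=5).mean()

print(f"Gaussian NB accuracy:  {gnb_acc:.3f}")
print(f"Bernoulli NB accuracy: {bnb_acc:.3f}")
```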

Journal ArticleDOI
TL;DR: An overview of the Bayesian approach to operational risk is provided, before expanding on the current literature through consideration of general families of non-conjugate severity distributions, g-and-h and GB2 distributions and techniques for parameter estimation for general severity and frequency distribution models from a Bayesian perspective.
Abstract: Operational risk is an important quantitative topic as a result of the Basel II regulatory requirements. Operational risk models need to incorporate internal and external loss data observations in combination with expert opinion surveyed from business specialists. Following the Loss Distributional Approach, this article considers three aspects of the Bayesian approach to the modelling of operational risk. Firstly we provide an overview of the Bayesian approach to operational risk, before expanding on the current literature through consideration of general families of non-conjugate severity distributions, g-and-h and GB2 distributions. Bayesian model selection is presented as an alternative to popular frequentist tests, such as Kolmogorov-Smirnov or Anderson-Darling. We present a number of examples and develop techniques for parameter estimation for general severity and frequency distribution models from a Bayesian perspective. Finally we introduce and evaluate recently developed stochastic sampling techniques and highlight their application to operational risk through the models developed.
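A simplified Loss Distribution Approach sketch: a conjugate gamma-Poisson model for annual loss frequency (the gamma prior standing in for expert opinion) combined with a plug-in lognormal severity, simulating the predictive annual aggregate loss. The paper's g-and-h/GB2 severities and full Bayesian machinery are not reproduced; all numbers are hypothetical.

```python
# Simplified Loss Distribution Approach sketch: conjugate gamma-Poisson annual
# loss frequency (prior standing in for expert opinion), lognormal severity
# fitted to observed losses, and simulation of next year's aggregate loss.
import numpy as np

rng = np.random.default_rng(9)

# "Internal data": annual loss counts and individual loss severities (toy)
counts = np.array([8, 12, 9, 15, 11])
severities = rng.lognormal(mean=10.0, sigma=1.2, size=counts.sum())

# Frequency: lambda ~ Gamma(a0, b0) prior (expert opinion), Poisson likelihood
a0, b0 = 10.0, 1.0                     # prior: mean 10 losses per year
a_post = a0 + counts.sum()
b_post = b0 + len(counts)

# Severity: plug-in lognormal fit (a full treatment would place a prior here too)
mu_hat = np.log(severities).mean()
sig_hat = np.log(severities).std(ddof=1)

# Predictive simulation of next year's aggregate loss
n_sim = 50_000
lam = rng.gamma(a_post, 1.0 / b_post, n_sim)    # posterior draws of frequency
n_losses = rng.poisson(lam)
agg = np.array([rng.lognormal(mu_hat, sig_hat, k).sum() for k in n_losses])

print("posterior mean frequency:", round(a_post / b_post, 2))
print("99.9% aggregate loss quantile:", round(float(np.quantile(agg, 0.999))))
```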