
Showing papers on "Posterior probability published in 2010"


Proceedings ArticleDOI
23 Aug 2010
TL;DR: It is shown that both problems can be overcome by replacing the conventional point estimate of accuracy by an estimate of the posterior distribution of the balanced accuracy.
Abstract: Evaluating the performance of a classification algorithm critically requires a measure of the degree to which unseen examples have been identified with their correct class labels. In practice, generalizability is frequently estimated by averaging the accuracies obtained on individual cross-validation folds. This procedure, however, is problematic in two ways. First, it does not allow for the derivation of meaningful confidence intervals. Second, it leads to an optimistic estimate when a biased classifier is tested on an imbalanced dataset. We show that both problems can be overcome by replacing the conventional point estimate of accuracy by an estimate of the posterior distribution of the balanced accuracy.
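
As a rough illustration of the approach described above, the sketch below approximates the posterior distribution of the balanced accuracy by drawing sensitivity and specificity from independent Beta posteriors and averaging them; the confusion-matrix counts and the flat Beta(1, 1) priors are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
tp, fn, tn, fp = 80, 20, 400, 100       # hypothetical counts from an imbalanced test set

# Independent Beta posteriors for sensitivity and specificity under flat priors.
sens = rng.beta(tp + 1, fn + 1, size=100_000)
spec = rng.beta(tn + 1, fp + 1, size=100_000)

# The balanced accuracy is their average; its posterior follows by Monte Carlo.
bal_acc = 0.5 * (sens + spec)

print("posterior mean balanced accuracy:", round(bal_acc.mean(), 3))
print("95% credible interval:", np.percentile(bal_acc, [2.5, 97.5]).round(3))
```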

1,216 citations


Posted Content
TL;DR: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
Abstract: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power of likelihood-based approaches to large data sets. This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is available as source code, binaries, and a web service.

746 citations


Journal ArticleDOI
TL;DR: This paper shows that formulating the problem in a naive Bayesian classification framework makes such preprocessing unnecessary and produces an algorithm that is simple, efficient, and robust, and it scales well as the number of classes grows.
Abstract: While feature point recognition is a key component of modern approaches to object detection, existing approaches require computationally expensive patch preprocessing to handle perspective distortion. In this paper, we show that formulating the problem in a naive Bayesian classification framework makes such preprocessing unnecessary and produces an algorithm that is simple, efficient, and robust. Furthermore, it scales well as the number of classes grows. To recognize the patches surrounding keypoints, our classifier uses hundreds of simple binary features and models class posterior probabilities. We make the problem computationally tractable by assuming independence between arbitrary sets of features. Even though this is not strictly true, we demonstrate that our classifier nevertheless performs remarkably well on image data sets containing very significant perspective changes.
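
A toy sketch of the semi-naive idea described above may help: binary features are grouped into small sets, treated as jointly dependent within a group but independent across groups, and class posteriors are obtained by summing group log-likelihoods. The class name, data layout, and smoothing count below are invented for illustration and do not reproduce the paper's feature construction.

```python
import numpy as np

class BinaryFeatureNB:
    """Semi-naive Bayes over groups of binary features (illustrative sketch)."""

    def __init__(self, n_groups, group_size, n_classes, prior_count=1.0):
        self.n_groups, self.group_size, self.n_classes = n_groups, group_size, n_classes
        # counts[g, c, v]: times class c produced bit-pattern v in feature group g
        self.counts = np.full((n_groups, n_classes, 2 ** group_size), prior_count)

    def _group_values(self, X):
        # X: (n_samples, n_groups * group_size) binary array -> integer pattern per group
        X = X.reshape(len(X), self.n_groups, self.group_size)
        weights = 2 ** np.arange(self.group_size)
        return (X @ weights).astype(int)          # (n_samples, n_groups)

    def fit(self, X, y):
        vals = self._group_values(X)
        for v_row, c in zip(vals, y):             # y holds integer class labels
            self.counts[np.arange(self.n_groups), c, v_row] += 1
        return self

    def predict_log_proba(self, X):
        vals = self._group_values(X)
        log_p = np.log(self.counts / self.counts.sum(axis=2, keepdims=True))
        scores = np.zeros((len(X), self.n_classes))
        for g in range(self.n_groups):
            # Add log P(group-g pattern | class); groups are assumed independent.
            scores += log_p[g].T[vals[:, g]]
        return scores

# Hypothetical usage: 3 classes, 10 groups of 4 binary features each.
# clf = BinaryFeatureNB(n_groups=10, group_size=4, n_classes=3).fit(X_train, y_train)
# pred = clf.predict_log_proba(X_test).argmax(axis=1)
```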

726 citations


Journal ArticleDOI
TL;DR: This article makes an all-to-all gapless structural match on 6684 non-homologous single-domain proteins in the PDB, finds that the TM-scores follow an extreme value distribution, and examines the posterior probability of same-fold proteins from three datasets (SCOP, CATH, and their consensus), indicating that TM-score can be used as an approximate but quantitative criterion for protein topology classification.
Abstract: Motivation: Protein structure similarity is often measured by root mean squared deviation, global distance test score and template modeling score (TM-score). However, the scores themselves cannot provide information on how significant the structural similarity is. Also, it lacks a quantitative relation between the scores and conventional fold classifications. This article aims to answer two questions: (i) what is the statistical significance of TM-score? (ii) What is the probability of two proteins having the same fold given a specific TM-score? Results: We first made an all-to-all gapless structural match on 6684 non-homologous single-domain proteins in the PDB and found that the TM-scores follow an extreme value distribution. The data allow us to assign each TM-score a P-value that measures the chance of two randomly selected proteins obtaining an equal or higher TM-score. With a TM-score at 0.5, for instance, its P-value is 5.5 × 10^-7, which means we need to consider at least 1.8 million random protein pairs to acquire a TM-score of no less than 0.5. Second, we examine the posterior probability of the same fold proteins from three datasets SCOP, CATH and the consensus of SCOP and CATH. It is found that the posterior probability from different datasets has a similar rapid phase transition around TM-score=0.5. This finding indicates that TM-score can be used as an approximate but quantitative criterion for protein topology classification, i.e. protein pairs with a TM-score >0.5 are mostly in the same fold while those with a TM-score <0.5 are mainly not in the same fold. Contact: zhng@umich.edu Supplementary information: Supplementary data are available at Bioinformatics online.
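
For a concrete sense of how such a fitted extreme value distribution is used, the sketch below converts a TM-score into a P-value with SciPy's Gumbel distribution; the location and scale values are placeholders rather than the published fit, which also depends on protein size.

```python
from scipy import stats

# Hypothetical extreme-value-distribution parameters for random-pair TM-scores.
loc, scale = 0.15, 0.04
tm_score = 0.5

p_value = stats.gumbel_r.sf(tm_score, loc=loc, scale=scale)   # P(TM >= tm_score)
print(f"P-value for TM-score {tm_score}: {p_value:.2e}")
print("random pairs needed, on average, for one such hit:", round(1.0 / p_value))
```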

687 citations


Journal ArticleDOI
TL;DR: The nested Chinese restaurant process (nCRP), as discussed by the authors, is a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees, and it can be used as a prior distribution in a Bayesian nonparametric model of document collections.
Abstract: We present the nested Chinese restaurant process (nCRP), a stochastic process that assigns probability distributions to ensembles of infinitely deep, infinitely branching trees. We show how this stochastic process can be used as a prior distribution in a Bayesian nonparametric model of document collections. Specifically, we present an application to information retrieval in which documents are modeled as paths down a random tree, and the preferential attachment dynamics of the nCRP leads to clustering of documents according to sharing of topics at multiple levels of abstraction. Given a corpus of documents, a posterior inference algorithm finds an approximation to a posterior distribution over trees, topics and allocations of words to levels of the tree. We demonstrate this algorithm on collections of scientific abstracts from several journals. This model exemplifies a recent trend in statistical machine learning—the use of Bayesian nonparametric methods to infer distributions on flexible data structures.
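
The following sketch draws documents' paths in a truncated nested CRP of fixed depth, which may help make the preferential attachment dynamics concrete; the data structure, depth, and concentration parameter are illustrative choices, not the paper's implementation.

```python
import random
from collections import defaultdict

def ncrp_path(tree_counts, depth, gamma=1.0, rng=random.Random(0)):
    """Draw one path of length `depth`; tree_counts maps a node (tuple path) to child counts."""
    path = ()
    for _ in range(depth):
        children = tree_counts[path]
        total = sum(children.values()) + gamma
        r = rng.random() * total
        for child, count in sorted(children.items()):
            r -= count
            if r < 0:
                chosen = child                           # join an existing branch
                break
        else:
            chosen = max(children, default=-1) + 1       # open a new branch (prob. prop. to gamma)
        children[chosen] += 1
        path = path + (chosen,)
    return path

tree = defaultdict(lambda: defaultdict(int))
paths = [ncrp_path(tree, depth=3) for _ in range(10)]
print(paths)   # later documents tend to reuse popular branches (preferential attachment)
```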

613 citations


Journal ArticleDOI
TL;DR: PyMC, as discussed by the authors, is a Python package that allows users to efficiently code a probabilistic model and draw samples from its posterior distribution using Markov chain Monte Carlo techniques; the paper serves as its user guide.
Abstract: This user guide describes a Python package, PyMC, that allows users to efficiently code a probabilistic model and draw samples from its posterior distribution using Markov chain Monte Carlo techniques.
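
PyMC's own documentation is the reference for its API; as a package-independent illustration of what such software automates, here is a minimal random-walk Metropolis sampler for a toy normal-mean model. The data, prior, proposal width, and iteration counts are all arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data and model: y ~ Normal(mu, 1), prior mu ~ Normal(0, 10^2).
y = rng.normal(2.0, 1.0, size=50)

def log_post(mu):
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_like = -0.5 * np.sum((y - mu) ** 2)
    return log_prior + log_like

mu, samples = 0.0, []
for _ in range(20_000):
    prop = mu + rng.normal(0, 0.3)                      # random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(mu):
        mu = prop                                       # accept
    samples.append(mu)

print("posterior mean of mu:", np.mean(samples[5_000:]))   # discard burn-in
```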

602 citations


Journal ArticleDOI
TL;DR: This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data‐fit of more complex model classes that extract more information from the data.
Abstract: Probability logic with Bayesian updating provides a rigorous framework to quantify modeling uncertainty and perform system identification. It uses probability as a multi-valued propositional logic for plausible reasoning where the probability of a model is a measure of its relative plausibility within a set of models. System identification is thus viewed as inference about plausible system models and not as a quixotic quest for the true model. Instead of using system data to estimate the model parameters, Bayes' Theorem is used to update the relative plausibility of each model in a model class, which is a set of input–output probability models for the system and a probability distribution over this set that expresses the initial plausibility of each model. Robust predictive analyses informed by the system data use the entire model class with the probabilistic predictions of each model being weighed by its posterior probability. Additional robustness to modeling uncertainty comes from combining the robust predictions of each model class in a set of candidates for the system, where each contribution is weighed by the posterior probability of the model class. This application of Bayes' Theorem automatically applies a quantitative Ockham's razor that penalizes the data-fit of more complex model classes that extract more information from the data. Robust analyses involve integrals over parameter spaces that usually must be evaluated numerically by Laplace's method of asymptotic approximation or by Markov Chain Monte Carlo methods. An illustrative application is given using synthetic data corresponding to a structural health monitoring benchmark structure.
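
A small numerical sketch of the model class assessment step may be useful: given a log evidence for each candidate model class (obtained, for example, by Laplace's asymptotic approximation or MCMC), posterior model class probabilities follow from Bayes' Theorem. The evidence values and prior plausibilities below are hypothetical.

```python
import numpy as np

log_evidence = np.array([-120.4, -118.9, -125.1])     # one value per candidate model class
log_prior = np.log(np.array([1/3, 1/3, 1/3]))          # equal initial plausibility

log_post_unnorm = log_evidence + log_prior
log_post = log_post_unnorm - np.logaddexp.reduce(log_post_unnorm)   # normalize in log space
posterior = np.exp(log_post)

print("posterior model class probabilities:", posterior.round(3))
# Robust predictions would weight each class's predictive distribution by these probabilities.
```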

497 citations


Journal ArticleDOI
Steven L. Scott
TL;DR: A heuristic for managing multi-armed bandits called randomized probability matching is described, which randomly allocates observations to arms according to the Bayesian posterior probability that each arm is optimal.
Abstract: A multi-armed bandit is an experiment with the goal of accumulating rewards from a payoff distribution with unknown parameters that are to be learned sequentially. This article describes a heuristic for managing multi-armed bandits called randomized probability matching, which randomly allocates observations to arms according to the Bayesian posterior probability that each arm is optimal. Advances in Bayesian computation have made randomized probability matching easy to apply to virtually any payoff distribution. This flexibility frees the experimenter to work with payoff distributions that correspond to certain classical experimental designs that have the potential to outperform methods that are ‘optimal’ in simpler contexts. I summarize the relationships between randomized probability matching and several related heuristics that have been used in the reinforcement learning literature. Copyright © 2010 John Wiley & Sons, Ltd.
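
The sketch below implements randomized probability matching for Bernoulli arms with conjugate Beta posteriors (often called Thompson sampling): each round, one sample is drawn from every arm's posterior and the arm with the largest draw is pulled, so each arm is chosen with its posterior probability of being optimal. The arm success rates and horizon are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
true_rates = np.array([0.04, 0.05, 0.07])             # unknown to the algorithm
successes = np.ones(3)                                 # Beta(1, 1) prior parameters
failures = np.ones(3)

for _ in range(5_000):
    draws = rng.beta(successes, failures)              # one posterior draw per arm
    arm = int(np.argmax(draws))                        # randomized probability matching
    reward = rng.random() < true_rates[arm]
    successes[arm] += reward
    failures[arm] += 1 - reward

print("pulls per arm:", (successes + failures - 2).astype(int))
print("posterior means:", (successes / (successes + failures)).round(3))
```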

451 citations


Journal ArticleDOI
TL;DR: A new method for relaxing the assumption of a strict molecular clock using Markov chain Monte Carlo to implement Bayesian model averaging over random local molecular clocks is presented, suggesting that large sequence datasets may only require a small number of local molecular clock models to reconcile their branch lengths with a time scale.
Abstract: Relaxed molecular clock models allow divergence time dating and "relaxed phylogenetic" inference, in which a time tree is estimated in the face of unequal rates across lineages. We present a new method for relaxing the assumption of a strict molecular clock using Markov chain Monte Carlo to implement Bayesian model averaging over random local molecular clocks. The new method approaches the problem of rate variation among lineages by proposing a series of local molecular clocks, each extending over a subregion of the full phylogeny. Each branch in a phylogeny (subtending a clade) is a possible location for a change of rate from one local clock to a new one. Thus, including both the global molecular clock and the unconstrained model, there are a total of 2^(2n-2) possible rate models available for averaging with 1, 2, ..., 2n - 2 different rate categories. We propose an efficient method to sample this model space while simultaneously estimating the phylogeny. The new method conveniently allows a direct test of the strict molecular clock, in which one rate rules them all, against a large array of alternative local molecular clock models. We illustrate the method's utility on three example data sets involving mammal, primate and influenza evolution. Finally, we explore methods to visualize the complex posterior distribution that results from inference under such models. The examples suggest that large sequence datasets may only require a small number of local molecular clocks to reconcile their branch lengths with a time scale. All of the analyses described here are implemented in the open access software package BEAST 1.5.4 ( http://beast-mcmc.googlecode.com/ ).

412 citations


Journal ArticleDOI
TL;DR: A multi-object filter suitable for image observations with low signal-to-noise ratio (SNR) is developed and a particle implementation of the multi- object filter is proposed and demonstrated via simulations.
Abstract: The problem of jointly detecting multiple objects and estimating their states from image observations is formulated in a Bayesian framework by modeling the collection of states as a random finite set. Analytic characterizations of the posterior distribution of this random finite set are derived for various prior distributions under the assumption that the regions of the observation influenced by individual objects do not overlap. These results provide tractable means to jointly estimate the number of states and their values from image observations. As an application, we develop a multi-object filter suitable for image observations with low signal-to-noise ratio (SNR). A particle implementation of the multi-object filter is proposed and demonstrated via simulations.

364 citations


Journal ArticleDOI
TL;DR: This paper develops a set of methods enabling an information-theoretic distributed control architecture to facilitate search by a mobile sensor network that captures effects in more general scenarios that are not possible with linearized methods.
Abstract: This paper develops a set of methods enabling an information-theoretic distributed control architecture to facilitate search by a mobile sensor network. Given a particular configuration of sensors, this technique exploits the structure of the probability distributions of the target state and of the sensor measurements to control the mobile sensors such that future observations minimize the expected future uncertainty of the target state. The mutual information between the sensors and the target state is computed using a particle filter representation of the posterior probability distribution, making it possible to directly use nonlinear and non-Gaussian target state and sensor models. To make the approach scalable to increasing network sizes, single-node and pairwise-node approximations to the mutual information are derived for general probability density models, with analytically bounded error. The pairwise-node approximation is proven to be a more accurate objective function than the single-node approximation. The mobile sensors are cooperatively controlled using a distributed optimization, yielding coordinated motion of the network. These methods are explored for various sensing modalities, including bearings-only sensing, range-only sensing, and magnetic field sensing, all with potential for search and rescue applications. For each sensing modality, the behavior of this non-parametric method is compared and contrasted with the results of linearized methods, and simulations are performed of a target search using the dynamics of actual vehicles. Monte Carlo results demonstrate that as network size increases, the sensors more quickly localize the target, and the pairwise-node approximation provides superior performance to the single-node approximation. The proposed methods are shown to produce similar results to linearized methods in particular scenarios, yet they capture effects in more general scenarios that are not possible with linearized methods.

Journal ArticleDOI
TL;DR: The main findings of the study are that the parameter posterior distributions generated by the Bayesian method are slightly less scattered than those generated by the GLUE method, and that GLUE is sensitive to the threshold value used to select behavioral parameter sets, resulting in a wider uncertainty interval of the posterior distribution of parameters and a wider confidence interval of model uncertainty.

Posted Content
TL;DR: In this article, the authors consider general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties and unknown but expected intrinsic scatter in the linear relationship being fit, and emphasize the importance of having a generative model for the data.
Abstract: We go through the many considerations involved in fitting a model to data, using as an example the fit of a straight line to a set of points in a two-dimensional plane. Standard weighted least-squares fitting is only appropriate when there is a dimension along which the data points have negligible uncertainties, and another along which all the uncertainties can be described by Gaussians of known variance; these conditions are rarely met in practice. We consider cases of general, heterogeneous, and arbitrarily covariant two-dimensional uncertainties, and situations in which there are bad data (large outliers), unknown uncertainties, and unknown but expected intrinsic scatter in the linear relationship being fit. Above all we emphasize the importance of having a "generative model" for the data, even an approximate one. Once there is a generative model, the subsequent fitting is non-arbitrary because the model permits direct computation of the likelihood of the parameters or the posterior probability distribution. Construction of a posterior probability distribution is indispensable if there are "nuisance parameters" to marginalize away.
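
A minimal sketch of the generative-model viewpoint for the straight-line case is given below: each point has a known Gaussian uncertainty in y plus an unknown intrinsic scatter, and the parameters are found by maximizing the resulting likelihood. The synthetic data and the maximum-likelihood shortcut (rather than a full posterior sampling with priors and marginalization) are simplifications made for illustration.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 10, 20))
sigma = rng.uniform(0.3, 0.8, 20)                          # known per-point y uncertainties
y = 1.5 * x + 2.0 + rng.normal(0, sigma) + rng.normal(0, 0.5, 20)   # true intrinsic scatter 0.5

def neg_log_likelihood(theta):
    m, b, log_s = theta
    var = sigma ** 2 + np.exp(log_s) ** 2                  # measurement + intrinsic variance
    resid = y - (m * x + b)
    return 0.5 * np.sum(resid ** 2 / var + np.log(2 * np.pi * var))

fit = optimize.minimize(neg_log_likelihood, x0=[1.0, 0.0, np.log(0.3)])
m_hat, b_hat, s_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
print(f"m = {m_hat:.2f}, b = {b_hat:.2f}, intrinsic scatter = {s_hat:.2f}")
# With priors added, the same log-likelihood feeds an MCMC sampler to map the full
# posterior and to marginalize over the nuisance scatter parameter.
```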

Journal ArticleDOI
TL;DR: In this article, the asymptotic bias and variance of the standard estimators of the posterior distribution, which are based on rejection sampling and linear adjustment, are derived, and an original estimator based on quadratic adjustment is introduced whose bias contains fewer terms than that of the estimator with linear adjustment.
Abstract: Approximate Bayesian Computation is a family of likelihood-free inference techniques that are well suited to models defined in terms of a stochastic generating mechanism. In a nutshell, Approximate Bayesian Computation proceeds by computing summary statistics sobs from the data and simulating summary statistics for different values of the parameter Θ. The posterior distribution is then approximated by an estimator of the conditional density g(Θ|sobs). In this paper, we derive the asymptotic bias and variance of the standard estimators of the posterior distribution which are based on rejection sampling and linear adjustment. Additionally, we introduce an original estimator of the posterior distribution based on quadratic adjustment and we show that its bias contains a fewer number of terms than the estimator with linear adjustment. Although we find that the estimators with adjustment are not universally superior to the estimator based on rejection sampling, we find that they can achieve better performance ...
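
To make the two baseline estimators concrete, the toy sketch below runs plain ABC rejection for a scalar parameter and then applies a linear regression adjustment to the accepted draws; the model, prior, summary statistic, and tolerance are all illustrative choices, and the quadratic adjustment proposed in the paper is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)

def simulate_summary(theta, n=50):
    return rng.normal(theta, 1.0, size=n).mean()       # summary statistic = sample mean

s_obs = 0.8                                             # "observed" summary
theta_prior = rng.uniform(-5, 5, size=100_000)          # draws from a flat prior
s_sim = np.array([simulate_summary(t) for t in theta_prior])

# Rejection step: keep draws whose summaries fall within a tolerance of s_obs.
eps = 0.05
keep = np.abs(s_sim - s_obs) < eps
theta_rej = theta_prior[keep]

# Linear adjustment: regress theta on (s_sim - s_obs) among accepted draws and
# shift each accepted theta to its predicted value at s_sim = s_obs.
slope = np.polyfit(s_sim[keep] - s_obs, theta_rej, deg=1)[0]
theta_adj = theta_rej - slope * (s_sim[keep] - s_obs)

print("rejection posterior mean:", theta_rej.mean())
print("adjusted posterior mean: ", theta_adj.mean())
```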

Journal ArticleDOI
TL;DR: It is demonstrated that inaccurate branch-length estimates result from either poor mixing of MCMC chains or posterior distributions with excessive weight at long tree lengths, and a formula is provided to calculate an exponential rate parameter for the branch- length prior that should eliminate inference of biased branch lengths in many cases.
Abstract: A surprising number of recent Bayesian phylogenetic analyses contain branch-length estimates that are several orders of magnitude longer than corresponding maximum-likelihood estimates. The levels of divergence implied by such branch lengths are unreasonable for studies using biological data and are known to be false for studies using simulated data. We conducted additional Bayesian analyses and studied approximate-posterior surfaces to investigate the causes underlying these large errors. We manipulated the starting parameter values of the Markov chain Monte Carlo (MCMC) analyses, the moves used by the MCMC analyses, and the prior-probability distribution on branch lengths. We demonstrate that inaccurate branch-length estimates result from either 1) poor mixing of MCMC chains or 2) posterior distributions with excessive weight at long tree lengths. Both effects are caused by a rapid increase in the volume of branch-length space as branches become longer. In the former case, both an MCMC move that scales all branch lengths in the tree simultaneously and the use of overdispersed starting branch lengths allow the chain to accurately sample the posterior distribution and should be used in Bayesian analyses of phylogeny. In the latter case, branch-length priors can have strong effects on resulting inferences and should be carefully chosen to reflect biological expectations. We provide a formula to calculate an exponential rate parameter for the branch-length prior that should eliminate inference of biased branch lengths in many cases. In any phylogenetic analysis, the biological plausibility of branch-length output must be carefully considered.

Journal ArticleDOI
TL;DR: In this article, a hierarchical probabilistic method is developed for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior).
Abstract: Standard maximum-likelihood estimators for binary-star and exoplanet eccentricities are biased high, in the sense that the estimated eccentricity tends to be larger than the true eccentricity. As with most non-trivial observables, a simple histogram of estimated eccentricities is not a good estimate of the true eccentricity distribution. Here, we develop and test a hierarchical probabilistic method for performing the relevant meta-analysis, that is, inferring the true eccentricity distribution, taking as input the likelihood functions for the individual star eccentricities, or samplings of the posterior probability distributions for the eccentricities (under a given, uninformative prior). The method is a simple implementation of a hierarchical Bayesian model; it can also be seen as a kind of heteroscedastic deconvolution. It can be applied to any quantity measured with finite precision—other orbital parameters, or indeed any astronomical measurements of any kind, including magnitudes, distances, or photometric redshifts—so long as the measurements have been communicated as a likelihood function or a posterior sampling.
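
A simplified sketch of this posterior-sampling-based hierarchy is shown below: when each object's samples were drawn under a flat interim prior, the population-level likelihood of candidate distribution parameters can be approximated by averaging the population density over each object's samples. The synthetic "eccentricity" samplings, the Beta population model, and the grid search are assumptions made purely for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Pretend we have posterior samplings of eccentricity for 40 objects, each obtained
# under a flat interim prior on [0, 1) (so the interim prior density cancels).
post_samples = [np.clip(rng.beta(2, 8) + rng.normal(0, 0.05, 200), 1e-4, 0.999)
                for _ in range(40)]

def log_marginal_likelihood(a, b):
    # Candidate population model: eccentricities ~ Beta(a, b).
    logL = 0.0
    for samples in post_samples:
        logL += np.log(np.mean(stats.beta.pdf(samples, a, b)))
    return logL

# A coarse grid over population parameters gives an (unnormalized) posterior surface.
grid = [(a, b) for a in np.linspace(1, 4, 25) for b in np.linspace(4, 14, 25)]
best = max(grid, key=lambda ab: log_marginal_likelihood(*ab))
print("population Beta(a, b) with highest marginal likelihood:", best)
```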

Proceedings Article
06 Dec 2010
TL;DR: This paper uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable, and applies the method to hierarchical clustering of images and topic modeling of text data.
Abstract: Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.

Journal ArticleDOI
TL;DR: It was found that the optimal set of summary statistics was highly dataset specific, suggesting that more generally there may be no globally-optimal choice, which argues for a new selection for each dataset even if the model and target of inference are unchanged.
Abstract: How best to summarize large and complex datasets is a problem that arises in many areas of science. We approach it from the point of view of seeking data summaries that minimize the average squared error of the posterior distribution for a parameter of interest under approximate Bayesian computation (ABC). In ABC, simulation under the model replaces computation of the likelihood, which is convenient for many complex models. Simulated and observed datasets are usually compared using summary statistics, typically in practice chosen on the basis of the investigator's intuition and established practice in the field. We propose two algorithms for automated choice of efficient data summaries. Firstly, we motivate minimisation of the estimated entropy of the posterior approximation as a heuristic for the selection of summary statistics. Secondly, we propose a two-stage procedure: the minimum-entropy algorithm is used to identify simulated datasets close to that observed, and these are each successively regarded as observed datasets for which the mean root integrated squared error of the ABC posterior approximation is minimized over sets of summary statistics. In a simulation study, we both singly and jointly inferred the scaled mutation and recombination parameters from a population sample of DNA sequences. The computationally-fast minimum entropy algorithm showed a modest improvement over existing methods while our two-stage procedure showed substantial and highly-significant further improvement for both univariate and bivariate inferences. We found that the optimal set of summary statistics was highly dataset specific, suggesting that more generally there may be no globally-optimal choice, which argues for a new selection for each dataset even if the model and target of inference are unchanged.

Journal ArticleDOI
TL;DR: In this paper, the impact of threshold values and the number of sample simulations on the uncertainty assessment of GLUE is systematically evaluated, and a comprehensive comparison of the posterior distributions, parameter uncertainty, and total uncertainty estimated by GLUE and by a formal Bayesian approach using the Metropolis-Hastings (MH) algorithm is performed for two well-tested conceptual hydrological models (WASMOD and DTVGM) in an arid basin in North China.

Journal ArticleDOI
TL;DR: In this article, an incremental mixture importance sampling (IMIS) algorithm is proposed that iteratively builds up a better importance sampling function; it retains the simplicity and transparency of sampling importance resampling but is much more efficient computationally.
Abstract: The Joint United Nations Programme on HIV/AIDS (UNAIDS) has decided to use Bayesian melding as the basis for its probabilistic projections of HIV prevalence in countries with generalized epidemics. This combines a mechanistic epidemiological model, prevalence data, and expert opinion. Initially, the posterior distribution was approximated by sampling-importance-resampling, which is simple to implement, easy to interpret, transparent to users, and gave acceptable results for most countries. For some countries, however, this is not computationally efficient because the posterior distribution tends to be concentrated around nonlinear ridges and can also be multimodal. We propose instead incremental mixture importance sampling (IMIS), which iteratively builds up a better importance sampling function. This retains the simplicity and transparency of sampling importance resampling, but is much more efficient computationally. It also leads to a simple estimator of the integrated likelihood that is the basis for Bayesian model comparison and model averaging. In simulation experiments and on real data, it outperformed both sampling importance resampling and three publicly available generic Markov chain Monte Carlo algorithms for this kind of problem.

Journal ArticleDOI
TL;DR: In this article, a reversible jump Markov Chain Monte Carlo (MCMC) algorithm was proposed to solve the inverse problem of inferring 1-D subsurface elastic properties from teleseismic receiver function data.
Abstract: A key question in the analysis of an inverse problem is the quantification of the non-uniqueness of the solution. Non-uniqueness arises when properties of an earth model can be varied without significantly worsening the fit to observed data. In most geophysical inverse problems, subsurface properties are parameterized using a fixed number of unknowns, and non-uniqueness has been tackled with a Bayesian approach by determining a posterior probability distribution in the parameter space that combines ‘a priori’ information with information contained in the observed data. However, less consideration has been given to the question of whether the data themselves can constrain the model complexity, that is, the number of unknowns needed to fit the observations. Answering this question requires solving a trans-dimensional inverse problem, where the number of unknowns is an unknown itself. Recently, the Bayesian approach to parameter estimation has been extended to quantify the posterior probability of the model complexity (the number of model parameters) with a quantity called ‘evidence’. The evidence can be hard to estimate in a non-linear problem; a practical solution is to use a Monte Carlo sampling algorithm that samples models with different numbers of unknowns in proportion to their posterior probability. This study presents a method to solve in trans-dimensional fashion the non-linear inverse problem of inferring 1-D subsurface elastic properties from teleseismic receiver function data. The Earth parameterization consists of a variable number of horizontal layers, where little is assumed a priori about the elastic properties, the number of layers, and their thicknesses. We developed a reversible jump Markov Chain Monte Carlo algorithm that draws samples from the posterior distribution of Earth models. The solution of the inverse problem is a posterior probability distribution of the number of layers, their thicknesses and the elastic properties as a function of depth. These posterior distributions quantify completely the non-uniqueness of the solution. We illustrate the algorithm by inverting synthetic and field measurements, and the results show that the data constrain the model complexity. In the synthetic example, the main features of the subsurface properties are recovered in the posterior probability distribution. The inversion results for actual measurements show a crustal structure that agrees with previous studies in both crustal thickness and presence of intracrustal low-velocity layers.

Journal ArticleDOI
TL;DR: A utility function based on mutual information is used and three intuitive interpretations of the utility function in terms of Bayesian posterior estimates are given and offered as a proof of concept to an experiment on memory retention.
Abstract: Discriminating among competing statistical models is a pressing issue for many experimentalists in the field of cognitive science. Resolving this issue begins with designing maximally informative experiments. To this end, the problem to be solved in adaptive design optimization is identifying experimental designs under which one can infer the underlying model in the fewest possible steps. When the models under consideration are nonlinear, as is often the case in cognitive science, this problem can be impossible to solve analytically without simplifying assumptions. However, as we show in this letter, a full solution can be found numerically with the help of a Bayesian computational trick derived from the statistics literature, which recasts the problem as a probability density simulation in which the optimal design is the mode of the density. We use a utility function based on mutual information and give three intuitive interpretations of the utility function in terms of Bayesian posterior estimates. As a proof of concept, we offer a simple example application to an experiment on memory retention.

Journal ArticleDOI
TL;DR: The authors used Markov Chain Monte Carlo techniques to approximate the posterior distribution of model parameters, and posterior predictive model checking to assess model fits and search for periodogram outliers that may represent periodic signals.
Abstract: Many astrophysical sources, especially compact accreting sources, show strong, random brightness fluctuations with broad power spectra in addition to periodic or quasi-periodic oscillations (QPOs) that have narrower spectra. The random nature of the dominant source of variance greatly complicates the process of searching for possible weak periodic signals. We have addressed this problem using the tools of Bayesian statistics; in particular, using Markov Chain Monte Carlo techniques to approximate the posterior distribution of model parameters, and posterior predictive model checking to assess model fits and search for periodogram outliers that may represent periodic signals. The methods developed are applied to two example data sets, both long XMM–Newton observations of highly variable Seyfert 1 galaxies: RE J1034 + 396 and Mrk 766. In both cases, a bend (or break) in the power spectrum is evident. In the case of RE J1034 + 396, the previously reported QPO is found but with somewhat weaker statistical significance than reported in previous analyses. The difference is due partly to the improved continuum modelling, better treatment of nuisance parameters and partly to different data selection methods.

Journal ArticleDOI
TL;DR: In the DSGE model of Smets and Wouters (2007), for example, which involves a 36-dimensional posterior distribution, it is shown that the autocorrelations of the sampled draws from the TaRB-MH algorithm decay to zero within 30-40 lags for most parameters.

Journal ArticleDOI
TL;DR: A general method is proposed for calculating the evidence for each model class based on the system data, which requires the evaluation of a multi-dimensional integral involving the product of the likelihood and prior defined by the model class.
Abstract: In recent years, Bayesian model updating techniques based on dynamic data have been applied in system identification and structural health monitoring. Because of modeling uncertainty, a set of competing candidate model classes may be available to represent a system and it is then desirable to assess the plausibility of each model class based on system data. Bayesian model class assessment may then be used, which is based on the posterior probability of the different candidates for representing the system. If more than one model class has significant posterior probability, then Bayesian model class averaging provides a coherent mechanism to incorporate all of these model classes in making probabilistic predictions for the system response. This Bayesian model assessment and averaging requires calculation of the evidence for each model class based on the system data, which requires the evaluation of a multi-dimensional integral involving the product of the likelihood and prior defined by the model class. In this article, a general method for calculating the evidence is proposed based on using posterior samples from any Markov Chain Monte Carlo algorithm. The effectiveness of the proposed method is illustrated by Bayesian model updating and assessment using simulated earthquake data from a ten-story nonclassically damped building responding linearly and a four-story building responding inelastically.

Journal ArticleDOI
TL;DR: In this article, the authors use a Bayesian approach to derive posterior probability distributions of star cluster age, mass, and extinction from multi-wavelength photometry, with the study restricted to solar metallicity.
Abstract: Star clusters are studied widely both as benchmarks for stellar evolution models and in their own right. Cluster age and mass distributions within galaxies are probes of star formation histories, and of cluster formation and disruption processes. The vast majority of clusters in the Universe is small, and it is well known that the integrated fluxes and colors have broad probability distributions, due to small numbers of bright stars. This paper goes beyond the description of predicted probability distributions, and presents results of the analysis of cluster energy distributions in an explicitly stochastic context. The method developed is Bayesian. It provides posterior probability distributions in the age-mass-extinction space, using multi-wavelength photometric observations and a large collection of Monte-Carlo simulations of clusters of finite stellar masses. Both UBVI and UBVIK datasets are considered, and the study conducted in this paper is restricted to the solar metallicity. We first reassess and explain errors arising from the use of standard analysis methods, which are based on continuous population synthesis models: systematic errors on ages and random errors on masses are large, while systematic errors on masses tend to be smaller. The age-mass distributions obtained after analysis of a synthetic sample are very similar to those found for real galaxies in the literature. The Bayesian approach on the other hand, is very successful in recovering the input ages and masses. Taking stochastic effects into account is important, more important for instance than the choice of adding or removing near-IR data in many cases. We found no immediately obvious reason to reject priors inspired by previous (standard) analyses of cluster populations in galaxies, i.e. cluster distributions that scale with mass as M^-2 and are uniform on a logarithmic age scale.

Journal ArticleDOI
TL;DR: This model suggests a biophysical instantiation of the Bayesian decision rule, while predicting important deviations from it similar to the 'base-rate neglect' observed in human studies when alternatives have unequal prior probabilities.
Abstract: We propose that synapses may be the workhorse of the neuronal computations that underlie probabilistic reasoning. We built a neural circuit model for probabilistic inference in which information provided by different sensory cues must be integrated and the predictive powers of individual cues about an outcome are deduced through experience. We found that bounded synapses naturally compute, through reward-dependent plasticity, the posterior probability that a choice alternative is correct given that a cue is presented. Furthermore, a decision circuit endowed with such synapses makes choices on the basis of the summed log posterior odds and performs near-optimal cue combination. The model was validated by reproducing salient observations of, and provides insights into, a monkey experiment using a categorization task. Our model thus suggests a biophysical instantiation of the Bayesian decision rule, while predicting important deviations from it similar to the 'base-rate neglect' observed in human studies when alternatives have unequal prior probabilities.

Journal ArticleDOI
TL;DR: In this paper, a general framework for functional analysis of variance (ANOVA) models is developed from a Bayesian viewpoint, assigning Gaussian process prior distributions to each batch of functional effects.
Abstract: Functional analysis of variance (ANOVA) models partition a functional response according to the main effects and interactions of various factors. This article develops a general framework for functional ANOVA modeling from a Bayesian viewpoint, assigning Gaussian process prior distributions to each batch of functional effects. We discuss the choices to be made in specifying such a model, advocating the treatment of levels within a given factor as dependent but exchangeable quantities, and we suggest weakly informative prior distributions for higher level parameters that may be appropriate in many situations. We discuss computationally efficient strategies for posterior sampling using Markov Chain Monte Carlo algorithms, and we emphasize useful graphical summaries based on the posterior distribution of model-based analogues of traditional ANOVA decompositions of variance. We illustrate this process of model specification, posterior sampling, and graphical posterior summaries in two examples. The first considers the effect of geographic region on the temperature profiles at weather stations in Canada. The second example examines sources of variability in the output of regional climate models from a designed experiment.

Journal ArticleDOI
TL;DR: Simulation results show that with significantly less computation, the PCRLB based iterative sensor selection method achieves similar mean squared error (MSE) performance as compared to the state-of-the-art mutual information based sensors selection method.
Abstract: In this paper, the source localization problem in wireless sensor networks is investigated where the location of the source is estimated based on the quantized measurements received from sensors in the field. An energy efficient iterative source localization scheme is proposed where the algorithm begins with a coarse location estimate obtained from measurement data from a set of anchor sensors. Based on the available data at each iteration, the posterior probability density function (pdf) of the source location is approximated using an importance sampling based Monte Carlo method and this information is utilized to activate a number of non-anchor sensors. Two sensor selection metrics namely the mutual information and the posterior Cramer-Rao lower bound (PCRLB) are employed and their performance compared. Further, the approximate posterior pdf of the source location is used to compress the quantized data of each activated sensor using distributed data compression techniques. Simulation results show that with significantly less computation, the PCRLB based iterative sensor selection method achieves similar mean squared error (MSE) performance as compared to the state-of-the-art mutual information based sensor selection method. By selecting only the most informative sensors and compressing their data prior to transmission to the fusion center, the iterative source localization method reduces the communication requirements significantly and thereby results in energy savings.

Journal ArticleDOI
TL;DR: This paper investigates the use of a one-class support vector machine algorithm to detect the onset of system anomalies, and trend output classification probabilities, as a way to monitor the health of a system.
Abstract: This paper investigates the use of a one-class support vector machine algorithm to detect the onset of system anomalies, and trend output classification probabilities, as a way to monitor the health of a system. In the absence of “unhealthy” (negative class) information, a marginal kernel density estimate of the “healthy” (positive class) distribution is used to construct an estimate of the negative class. The output of the one-class support vector classifier is calibrated to posterior probabilities by fitting a logistic distribution to the support vector predictor model in an effort to manage false alarms.
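
A rough sketch of this calibration idea using scikit-learn is shown below: a one-class SVM is trained on healthy data only, and a logistic curve is then fitted to its decision-function scores to produce pseudo-posterior probabilities. The synthetic "unhealthy" scores stand in for the paper's kernel-density-based construction of the negative class; all data and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(6)
healthy = rng.normal(0.0, 1.0, size=(500, 4))            # positive-class training data only

ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(healthy)

# Scores for held-out healthy data and for a synthetic "unhealthy" surrogate.
scores_pos = ocsvm.decision_function(rng.normal(0.0, 1.0, size=(200, 4)))
scores_neg = ocsvm.decision_function(rng.normal(3.0, 1.0, size=(200, 4)))

# Fit a logistic curve on the 1-D scores to obtain calibrated probabilities.
scores = np.concatenate([scores_pos, scores_neg]).reshape(-1, 1)
labels = np.concatenate([np.ones(200), np.zeros(200)])
calibrator = LogisticRegression().fit(scores, labels)

# Trending these probabilities over time is the health-monitoring signal.
new_scores = ocsvm.decision_function(rng.normal(1.5, 1.0, size=(5, 4))).reshape(-1, 1)
print("P(healthy):", calibrator.predict_proba(new_scores)[:, 1].round(3))
```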