
Showing papers on "Bayes' theorem published in 2002"


Journal ArticleDOI
TL;DR: A series of models that exemplify the diversity of problems that can be addressed within the empirical Bayesian framework are presented, using PET data to show how priors can be derived from the between-voxel distribution of activations over the brain.

744 citations


Journal ArticleDOI
TL;DR: Two inferential approaches to this problem are discussed: an empirical Bayes method that requires very little a priori Bayesian modeling, and the frequentist method of “false discovery rates” proposed by Benjamini and Hochberg in 1995.
Abstract: In a classic two-sample problem, one might use Wilcoxon's statistic to test for a difference between treatment and control subjects. The analogous microarray experiment yields thousands of Wilcoxon statistics, one for each gene on the array, and confronts the statistician with a difficult simultaneous inference situation. We will discuss two inferential approaches to this problem: an empirical Bayes method that requires very little a priori Bayesian modeling, and the frequentist method of "false discovery rates" proposed by Benjamini and Hochberg in 1995. It turns out that the two methods are closely related and can be used together to produce sensible simultaneous inferences.

687 citations
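
As a rough illustration of the false discovery rate procedure referenced in the abstract above, the sketch below applies the Benjamini-Hochberg step-up rule to a vector of p-values. The p-values and the level q = 0.10 are invented for illustration and are not taken from the paper's microarray data.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.10):
    """Benjamini-Hochberg (1995) step-up procedure.

    Returns a boolean array marking the hypotheses rejected while
    controlling the false discovery rate at level q.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                       # indices of sorted p-values
    thresholds = q * (np.arange(1, m + 1) / m)  # BH step-up thresholds q*i/m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])        # largest i with p_(i) <= q*i/m
        rejected[order[: k + 1]] = True
    return rejected

# Illustrative gene-level p-values (hypothetical, not from the paper's data)
pvals = [0.0001, 0.0004, 0.019, 0.03, 0.07, 0.2, 0.45, 0.62, 0.81, 0.99]
print(benjamini_hochberg(pvals, q=0.10))
```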


Journal ArticleDOI
TL;DR: A new Monte Carlo approach is proposed that can accurately and rapidly infer haplotypes for a large number of linked SNPs and is robust to violation of Hardy-Weinberg equilibrium, to the presence of missing data, and to occurrences of recombination hotspots.
Abstract: Haplotypes have gained increasing attention in the mapping of complex-disease genes, because of the abundance of single-nucleotide polymorphisms (SNPs) and the limited power of conventional single-locus analyses. It has been shown that haplotype-inference methods such as Clark's algorithm, the expectation-maximization algorithm, and a coalescence-based iterative-sampling algorithm are fairly effective and economical alternatives to molecular-haplotyping methods. To contend with some weaknesses of the existing algorithms, we propose a new Monte Carlo approach. In particular, we first partition the whole haplotype into smaller segments. Then, we use the Gibbs sampler both to construct the partial haplotypes of each segment and to assemble all the segments together. Our algorithm can accurately and rapidly infer haplotypes for a large number of linked SNPs. By using a wide variety of real and simulated data sets, we demonstrate the advantages of our Bayesian algorithm, and we show that it is robust to the violation of Hardy-Weinberg equilibrium, to the presence of missing data, and to occurrences of recombination hotspots.

679 citations


Journal ArticleDOI
TL;DR: The procedures used in conventional data analysis are formulated in terms of hierarchical linear models, and a connection between classical inference and parametric empirical Bayes (PEB) is established through covariance component estimation.

647 citations


Journal ArticleDOI
TL;DR: Simulation studies show that this estimator compares well with maximum likelihood estimators (i.e., empirical Bayes estimators from the Bayesian viewpoint) for which an iterative numerical procedure is needed and may be infeasible.
Abstract: Consider a stochastic abundance model in which the species arrive in the sample according to independent Poisson processes, where the abundance parameters of the processes follow a gamma distribution. We propose a new estimator of the number of species for this model. The estimator takes the form of the number of duplicated species (i.e., species represented by two or more individuals) divided by an estimated duplication fraction. The duplication fraction is estimated from all frequencies including singleton information. The new estimator is closely related to the sample coverage estimator presented by Chao and Lee (1992, Journal of the American Statistical Association 87, 210-217). We illustrate the procedure using the Malayan butterfly data discussed by Fisher, Corbet, and Williams (1943, Journal of Animal Ecology 12, 42-58) and a 1989 Christmas Bird Count dataset collected in Florida, U.S.A. Simulation studies show that this estimator compares well with maximum likelihood estimators (i.e., empirical Bayes estimators from the Bayesian viewpoint) for which an iterative numerical procedure is needed and may be infeasible.

455 citations


Journal ArticleDOI
TL;DR: In this article, the performance of Bayes prediction of amino acids under positive selection by computer simulation was evaluated, and it was shown that using a large number of lineages is the best way to improve the accuracy and power.
Abstract: Bayes prediction quantifies uncertainty by assigning posterior probabilities. It was used to identify amino acids in a protein under recurrent diversifying selection indicated by higher nonsynonymous (dN) than synonymous (dS) substitution rates or by omega = dN/dS > 1. Parameters were estimated by maximum likelihood under a codon substitution model that assumed several classes of sites with different omega ratios. Bayes' theorem was used to calculate the posterior probabilities of each site falling into these site classes. Here, we evaluate the performance of Bayes prediction of amino acids under positive selection by computer simulation. We measured the accuracy by the proportion of predicted sites that were truly under selection and the power by the proportion of true positively selected sites that were predicted by the method. The accuracy was slightly better for longer sequences, whereas the power was largely unaffected by the increase in sequence length. Both accuracy and power were higher for medium or highly diverged sequences than for similar sequences. We found that accuracy and power were unacceptably low when data contained only a few highly similar sequences. However, sampling a large number of lineages improved the performance substantially. Even for very similar sequences, accuracy and power can be high if over 100 taxa are used in the analysis. We make the following recommendations: (1) prediction of positive selection sites is not feasible for a few closely related sequences; (2) using a large number of lineages is the best way to improve the accuracy and power of the prediction; and (3) multiple models of heterogeneous selective pressures among sites should be applied in real data analysis.

366 citations
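
The site-class calculation described in the abstract above can be sketched as a direct application of Bayes' theorem: the posterior probability that a site belongs to each omega class is its class proportion times the site likelihood under that class, renormalized. All numbers below (class proportions, omega values, site likelihoods) are hypothetical placeholders, not estimates from a real codon model.

```python
import numpy as np

# Hypothetical ML estimates of the site-class model: proportions p_k and
# omega (dN/dS) ratios for three classes; the last class has omega > 1.
class_probs  = np.array([0.60, 0.30, 0.10])   # p_0, p_1, p_2 (sum to 1)
class_omegas = np.array([0.05, 0.50, 3.20])   # purifying, nearly neutral, positive

# Hypothetical per-class likelihoods of the data at one codon site,
# f(x_h | omega_k), as would come from the codon substitution model.
site_likelihoods = np.array([1.2e-6, 8.5e-6, 4.0e-5])

# Bayes' theorem: P(class k | site) = p_k f(x_h | omega_k) / sum_j p_j f(x_h | omega_j)
posterior = class_probs * site_likelihoods
posterior /= posterior.sum()

print(posterior)                          # posterior probability of each class at this site
print(posterior[class_omegas > 1].sum())  # probability the site is under positive selection
```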


Journal ArticleDOI
TL;DR: In this article, the authors present an overview of multilevel or hierarchical data modelling and its applications in medicine; a description of the basic model for nested data is given, and it is shown how this can be extended to fit flexible models for repeated measures data and more complex structures involving cross-classifications and multiple membership patterns.
Abstract: This tutorial presents an overview of multilevel or hierarchical data modelling and its applications in medicine. A description of the basic model for nested data is given and it is shown how this can be extended to fit flexible models for repeated measures data and more complex structures involving cross-classifications and multiple membership patterns within the software package MLwiN. A variety of response types are covered and both frequentist and Bayesian estimation methods are described.

361 citations


Journal ArticleDOI
TL;DR: The method is demonstrated using an input-state-output model of the hemodynamic coupling between experimentally designed causes or factors in fMRI studies and the ensuing BOLD response, and extends classical inference to more plausible inferences about the parameters of the model given the data.

315 citations


Journal ArticleDOI
TL;DR: It is shown that the asymptotic targets of SVMs are classification functions directly related to the Bayes rule, which helps explain the success of SVMs in many classification studies and makes it easier to compare SVMs with traditional statistical methods.
Abstract: The Bayes rule is the optimal classification rule if the underlying distribution of the data is known. In practice we do not know the underlying distribution, and need to “learn” classification rules from the data. One way to derive classification rules in practice is to implement the Bayes rule approximately by estimating an appropriate classification function. Traditional statistical methods use the estimated log odds ratio as the classification function. Support vector machines (SVMs) are one type of large margin classifier, and the relationship between SVMs and the Bayes rule was not clear. In this paper, it is shown that the asymptotic targets of SVMs are some interesting classification functions that are directly related to the Bayes rule. The rate of convergence of the solutions of SVMs to their corresponding target functions is explicitly established in the case of SVMs with quadratic or higher order loss functions and spline kernels. Simulations are given to illustrate the relation between SVMs and the Bayes rule in other cases. This helps explain the success of SVMs in many classification studies, and makes it easier to compare SVMs with traditional statistical methods.

263 citations
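
A minimal simulation in the spirit of the paper's illustrations: for two equally likely Gaussian classes, the Bayes rule is a simple sign rule, and an SVM fitted to a large sample should land close to it. The data-generating setup and the use of scikit-learn's SVC with an RBF kernel are my assumptions, not the paper's exact experiments (which focus on quadratic and higher-order loss functions with spline kernels).

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two equally likely classes: N(-1, 1) versus N(+1, 1).
# The Bayes rule classifies by sign(x), i.e. the boundary is x = 0.
n = 2000
x = np.concatenate([rng.normal(-1, 1, n), rng.normal(+1, 1, n)]).reshape(-1, 1)
y = np.concatenate([-np.ones(n), np.ones(n)])

svm = SVC(kernel="rbf", C=1.0).fit(x, y)

# Locate the SVM decision boundary on a grid and compare with the Bayes boundary.
grid = np.linspace(-3, 3, 601).reshape(-1, 1)
decision = svm.decision_function(grid)
boundary = grid[np.argmin(np.abs(decision))][0]
print(f"SVM boundary ~ {boundary:.3f} (Bayes boundary is 0)")

# Agreement between SVM predictions and the Bayes rule on fresh data.
x_test = rng.normal(0, 2, 5000).reshape(-1, 1)
agreement = np.mean(svm.predict(x_test) == np.sign(x_test).ravel())
print(f"Agreement with the Bayes rule: {agreement:.3f}")
```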


Journal ArticleDOI
TL;DR: It is shown that the risk score, defined as the probability of disease given data on multiple markers, is the optimal function in the sense that the receiver operating characteristic (ROC) curve is maximized at every point.
Abstract: The development of biomarkers for cancer screening is an active area of research. While several biomarkers exist, none is sufficiently sensitive and specific on its own for population screening. It is likely that successful screening programs will require combinations of multiple markers. We consider how to combine multiple disease markers for optimal performance of a screening program. We show that the risk score, defined as the probability of disease given data on multiple markers, is the optimal function in the sense that the receiver operating characteristic (ROC) curve is maximized at every point. Arguments draw on the Neyman-Pearson lemma. This contrasts with the corresponding optimality result of classic decision theory, which is set in a Bayesian framework and is based on minimizing an expected loss function associated with decision errors. Ours is an optimality result defined from a strictly frequentist point of view and does not rely on the notion of associating costs with misclassifications. The implication for data analysis is that binary regression methods can be used to yield appropriate relative weightings of different biomarkers, at least in large samples. We propose some modifications to standard binary regression methods for application to the disease screening problem. A flexible biologically motivated simulation model for cancer biomarkers is presented and we evaluate our methods by application to it. An application to real data concerning two ovarian cancer biomarkers is also presented. Our results are equally relevant to the more general medical diagnostic testing problem, where results of multiple tests or predictors are combined to yield a composite diagnostic test. Moreover, our methods justify the development of clinical prediction scores based on binary regression.

254 citations
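
A small sketch of the data-analysis implication mentioned in the abstract above: binary (logistic) regression on two markers yields an estimated risk score P(disease | markers), which should typically dominate either marker alone in ROC terms. The simulated markers and effect sizes below are hypothetical, and scikit-learn is used purely for convenience.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Hypothetical data: two biomarkers, each mildly informative on its own.
n = 5000
disease = rng.binomial(1, 0.3, n)
m1 = rng.normal(0.8 * disease, 1.0)   # marker 1 shifted upward in cases
m2 = rng.normal(0.5 * disease, 1.0)   # marker 2 shifted upward in cases
X = np.column_stack([m1, m2])

# Binary regression estimates P(disease | markers) -- the risk score.
model = LogisticRegression().fit(X, disease)
risk_score = model.predict_proba(X)[:, 1]

# The combined risk score should not do worse than either marker alone in AUC.
print("AUC marker 1  :", round(roc_auc_score(disease, m1), 3))
print("AUC marker 2  :", round(roc_auc_score(disease, m2), 3))
print("AUC risk score:", round(roc_auc_score(disease, risk_score), 3))
```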


Journal ArticleDOI
TL;DR: This paper demonstrates how the fully Bayesian random effects approach to meta-analysis of binary outcome data, originally developed on the log-odds scale, can be extended to perform analyses on both the absolute risk and relative risk scales.
Abstract: When conducting a meta-analysis of clinical trials with binary outcomes, a normal approximation for the summary treatment effect measure in each trial is inappropriate in the common situation where some of the trials in the meta-analysis are small, or the observed risks are close to 0 or 1. This problem can be avoided by making direct use of the binomial distribution within trials. A fully Bayesian method has already been developed for random effects meta-analysis on the log-odds scale using the BUGS implementation of Gibbs sampling. In this paper we demonstrate how this method can be extended to perform analyses on both the absolute and relative risk scales. Within each approach we exemplify how trial-level covariates, including underlying risk, can be considered. Data from 46 trials of the effect of single-dose ibuprofen on post-operative pain are analysed and the results contrasted with those derived from classical and Bayesian summary statistic methods. The clinical interpretation of the odds ratio scale is not straightforward. The advantages and flexibility of a fully Bayesian approach to meta-analysis of binary outcome data, considered on an absolute risk or relative risk scale, are now available.

Journal ArticleDOI
01 Dec 2002-Genetics
TL;DR: Analysis of sequence data from three species under a finite-site nucleotide substitution model suggests that typical data sets contain useful information about the ancestral population sizes and that it is advantageous to analyze data of several species simultaneously.
Abstract: Polymorphisms in an ancestral population can cause conflicts between gene trees and the species tree. Such conflicts can be used to estimate ancestral population sizes when data from multiple loci are available. In this article I extend previous work for estimating ancestral population sizes to analyze sequence data from three species under a finite-site nucleotide substitution model. Both maximum-likelihood (ML) and Bayes methods are implemented for joint estimation of the two speciation dates and the two population size parameters. Both methods account for uncertainties in the gene tree due to few informative sites at each locus and make an efficient use of information in the data. The Bayes algorithm using Markov chain Monte Carlo (MCMC) enjoys a computational advantage over ML and also provides a framework for incorporating prior information about the parameters. The methods are applied to a data set of 53 nuclear noncoding contigs from human, chimpanzee, and gorilla published by Chen and Li. Estimates of the effective population size for the common ancestor of humans and chimpanzees by both ML and Bayes methods are approximately 12,000-21,000, comparable to estimates for modern humans, and do not support the notion of a dramatic size reduction in early human populations. Estimates published previously from the same data are several times larger and appear to be biased due to methodological deficiency. The divergence between humans and chimpanzees is dated at approximately 5.2 million years ago and the gorilla divergence 1.1-1.7 million years earlier. The analysis suggests that typical data sets contain useful information about the ancestral population sizes and that it is advantageous to analyze data of several species simultaneously.

Journal ArticleDOI
TL;DR: In this paper, the authors compare probabilities of liquefaction calculated with two different probabilistic approaches, logistic regression and Bayesian mapping, and show that the Bayesian mapping approach is preferred over the logistic regression approach for estimating the site-specific probability of liquefaction, although both methods yield comparable probabilities.
Abstract: This paper presents an assessment of existing and new probabilistic methods for liquefaction potential evaluation. Emphasis is placed on comparison of probabilities of liquefaction calculated with two different approaches, logistic regression and Bayesian mapping. Logistic regression is a well-established statistical procedure, whereas Bayesian mapping is a relatively new application of the Bayes’ theorem to the evaluation of soil liquefaction. In the present study, simplified procedures for soil liquefaction evaluation, including the Seed–Idriss, Robertson–Wride, and Andrus–Stokoe methods, based on the standard penetration test, cone penetration test, and shear wave velocity measurement, respectively, are used as the basis for developing Bayesian mapping functions. The present study shows that the Bayesian mapping approach is preferred over the logistic regression approach for estimating the site-specific probability of liquefaction, although both methods yield comparable probabilities. The paper also co...

Journal ArticleDOI
TL;DR: This work proposes an approach using cross-validation predictive densities to obtain expected utility estimates and the Bayesian bootstrap to obtain samples from their distributions, and discusses the probabilistic assumptions made and the properties of two practical cross-validation methods, importance sampling and k-fold cross-validation.
Abstract: In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate because it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use multilayer perceptron neural networks and gaussian processes with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
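
The Bayesian bootstrap step described above can be sketched as follows: given per-observation utilities (for example, cross-validation log predictive densities), each replicate draws Dirichlet(1, ..., 1) weights over the observations and forms a weighted mean, giving samples from the distribution of the expected utility estimate. The utilities below are synthetic placeholders rather than output from a fitted model.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical per-observation utilities, e.g. leave-one-out log predictive
# densities u_i = log p(y_i | data without i) from some fitted model.
utilities = rng.normal(-1.3, 0.6, size=200)

# Bayesian bootstrap: each replicate draws Dirichlet(1, ..., 1) weights over
# the n observations and computes the weighted mean utility.
n, reps = utilities.size, 4000
weights = rng.dirichlet(np.ones(n), size=reps)   # shape (reps, n)
expected_utility = weights @ utilities           # one estimate per replicate

lo, hi = np.percentile(expected_utility, [2.5, 97.5])
print(f"Expected utility: mean {expected_utility.mean():.3f}, "
      f"95% interval [{lo:.3f}, {hi:.3f}]")

# Comparing two models: P(model A has higher expected utility than model B)
# would be estimated as np.mean(eu_A > eu_B) using paired Dirichlet weights.
```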

Journal ArticleDOI
TL;DR: This general approach was used to analyze a larger data set consisting of the 18S ribosomal RNA gene of 39 metazoan species, yielding date estimates consistent with paleontological records.
Abstract: The molecular clock, i.e., constancy of the rate of evolution over time, is commonly assumed in estimating divergence dates. However, this assumption is often violated and has drastic effects on date estimation. Recently, a number of attempts have been made to relax the clock assumption. One approach is to use maximum likelihood, which assigns rates to branches and allows the estimation of both rates and times. An alternative is the Bayes approach, which models the change of the rate over time. A number of models of rate change have been proposed. We have extended and evaluated models of rate evolution, i.e., the lognormal and its recent variant, along with the gamma, the exponential, and the Ornstein-Uhlenbeck processes. These models were first applied to a small hominoid data set, where an empirical Bayes approach was used to estimate the hyperparameters that measure the amount of rate variation. Estimation of divergence times was sensitive to these hyperparameters, especially when the assumed model is close to the clock assumption. The rate and date estimates varied little from model to model, although the posterior Bayes factor indicated the Ornstein-Uhlenbeck process outperformed the other models. To demonstrate the importance of allowing for rate change across lineages, this general approach was used to analyze a larger data set consisting of the 18S ribosomal RNA gene of 39 metazoan species. We obtained date estimates consistent with paleontological records, the deepest split within the group being about 560 million years ago. Estimates of the rates were in accordance with the Cambrian explosion hypothesis and suggested some more recent lineage-specific bursts of evolution.

Journal ArticleDOI
TL;DR: This work describes the Bayesian phylogenetic method which uses a Markov chain Monte Carlo algorithm to provide samples from the posterior distribution of tree topologies and discusses some issues arising when using Bayesian techniques on RNA sequence data.
Abstract: We study the phylogeny of the placental mammals using molecular data from all mitochondrial tRNAs and rRNAs of 54 species. We use probabilistic substitution models specific to evolution in base paired regions of RNA. A number of these models have been implemented in a new phylogenetic inference software package for carrying out maximum likelihood and Bayesian phylogenetic inferences. We describe our Bayesian phylogenetic method which uses a Markov chain Monte Carlo algorithm to provide samples from the posterior distribution of tree topologies. Our results show support for four primary mammalian clades, in agreement with recent studies of much larger data sets mainly comprising nuclear DNA. We discuss some issues arising when using Bayesian techniques on RNA sequence data.

Journal ArticleDOI
TL;DR: A variational Bayes (VB) learning algorithm for generalized autoregressive (GAR) models is described; it reduces to the Bayesian evidence framework in the special case of Gaussian noise and uninformative priors on the noise and weight precisions, and is applied to synthetic and real data with encouraging results.
Abstract: We describe a variational Bayes (VB) learning algorithm for generalized autoregressive (GAR) models. The noise is modeled as a mixture of Gaussians rather than the usual single Gaussian. This allows different data points to be associated with different noise levels and effectively provides robust estimation of AR coefficients. The VB framework is used to prevent overfitting and provides model-order selection criteria both for AR order and noise model order. We show that for the special case of Gaussian noise and uninformative priors on the noise and weight precisions, the VB framework reduces to the Bayesian evidence framework. The algorithm is applied to synthetic and real data with encouraging results.

Journal ArticleDOI
TL;DR: In this article, a fast algorithm for updating regressions in the Markov chain Monte Carlo searches for posterior inference is developed, allowing many more variables than observations to be considered; the resulting reduced predictor space can greatly aid the interpretation of the model.
Abstract: When a number of distinct models contend for use in prediction, the choice of a single model can offer rather unstable predictions. In regression, stochastic search variable selection with Bayesian model averaging offers a cure for this robustness issue but at the expense of requiring very many predictors. Here we look at Bayes model averaging incorporating variable selection for prediction. This offers similar mean-square errors of prediction but with a vastly reduced predictor space. This can greatly aid the interpretation of the model. It also reduces the cost if measured variables have costs. The development here uses decision theory in the context of the multivariate general linear model. In passing, this reduced predictor space Bayes model averaging is contrasted with single-model approximations. A fast algorithm for updating regressions in the Markov chain Monte Carlo searches for posterior inference is developed, allowing many more variables than observations to be contemplated. We discuss the merits of absolute rather than proportionate shrinkage in regression, especially when there are more variables than observations. The methodology is illustrated on a set of spectroscopic data used for measuring the amounts of different sugars in an aqueous solution.

Journal ArticleDOI
TL;DR: The basic frameworks and techniques of the Bayesian approach to image restoration are reviewed from the statistical-mechanical point of view and a few basic notions in digital image processing are explained to convince the reader that statistical mechanics has a close formal similarity to this problem.
Abstract: The basic frameworks and techniques of the Bayesian approach to image restoration are reviewed from the statistical-mechanical point of view. First, a few basic notions in digital image processing are explained to convince the reader that statistical mechanics has a close formal similarity to this problem. Second, the basic formulation of the statistical estimation from the observed degraded image by using the Bayes formula is demonstrated. The relationship between Bayesian statistics and statistical mechanics is also listed. Particularly, it is explained that some correlation inequalities on the Nishimori line of the random spin model also play an important role in Bayesian image restoration. Third, the framework of Bayesian image restoration for binary images by means of the Ising model is reviewed. Some practical algorithms for binary image restoration are given by employing the mean-field and the Bethe approximations. Finally, Bayesian image restoration for a grey-level image using the Gaussian model is reviewed, and the Gaussian model is extended to a more practical probabilistic model by introducing the line state to treat the effects of edges. The line state is also extended to quantized values.

Journal ArticleDOI
TL;DR: In this paper, the authors present an axiomatization of the rule that requires updating of all the priors by Bayes' rule, and show that when all priors give positive probability to an event E, a certain coherence property between conditional and unconditional preferences is satisfied if and only if the set of subjective probability measures considered by the agent given E is obtained by updating all subjective prior probability measures using Bayes' rule.
Abstract: When preferences are such that there is no unique additive prior, the issue of which updating rule to use is of extreme importance. This paper presents an axiomatization of the rule which requires updating of all the priors by Bayes rule. The decision maker has conditional preferences over acts. It is assumed that preferences over acts conditional on event E happening do not depend on lotteries received on E^c, obey axioms which lead to a maxmin expected utility representation with multiple priors, and have common induced preferences over lotteries. The paper shows that when all priors give positive probability to an event E, a certain coherence property between conditional and unconditional preferences is satisfied if and only if the set of subjective probability measures considered by the agent given E is obtained by updating all subjective prior probability measures using Bayes' rule.
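
A toy numerical sketch of the updating rule being axiomatized: every prior in the set is conditioned on the event E by Bayes' rule, and maxmin expected utility is then taken over the updated set. The state space, priors, and utilities are invented for illustration.

```python
import numpy as np

# Three states of the world; event E = {state 0, state 1}.
E = np.array([True, True, False])

# A set of priors (rows), as in maxmin expected utility with multiple priors.
# Every prior gives positive probability to E.
priors = np.array([
    [0.50, 0.30, 0.20],
    [0.30, 0.30, 0.40],
    [0.25, 0.50, 0.25],
])

# Full Bayesian updating: condition every prior on E by Bayes' rule.
posteriors = priors * E                               # zero out states outside E
posteriors /= posteriors.sum(axis=1, keepdims=True)   # renormalize each prior
print(posteriors)

# Maxmin expected utility of an act (utility per state) given E:
utility = np.array([1.0, 0.0, 5.0])
print("min over updated priors:", (posteriors @ utility).min())
```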

Journal ArticleDOI
TL;DR: The proposed probabilistic neural network approach combines Bayes' theorem of conditional probability with Parzen's method for estimating the probability density functions of the random variables, for classification of bacterial growth/no-growth data and modelling of the probability of growth.

Journal ArticleDOI
TL;DR: It is argued that neither a fixed effect nor a random effects analysis is appropriate when the mega-trial is included, and it is shown how Bayesian meta-analysis models allow appropriate exploration of hypotheses that the treatment effect depends on the size of the trial or the risk in the control group.
Abstract: Background: There has been extensive discussion of the apparent conflict between meta-analyses and a mega-trial investigating the benefits of intravenous magnesium following myocardial infarction, in which the early trial results have been said to be ‘too good to be true’. Methods: We apply Bayesian methods of meta-analysis to the trials available before and after the publication of the ISIS-4 results. We show how scepticism can be formally incorporated into an analysis as a Bayesian prior distribution, and how Bayesian meta-analysis models allow appropriate exploration of hypotheses that the treatment effect depends on the size of the trial or the risk in the control group. Results: Adoption of a sceptical prior would have led early enthusiasm for magnesium to be suitably tempered, but only if combined with a random effects meta-analysis, rather than the fixed effect analysis that was actually conducted. Conclusions: We argue that neither a fixed effect nor a random effects analysis is appropriate when the mega-trial is included. The Bayesian framework provides many possibilities for flexible exploration of clinical hypotheses, but there can be considerable sensitivity to apparently innocuous assumptions.
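
A minimal sketch of how a sceptical prior tempers an apparently strong treatment effect, using a conjugate normal approximation on the log odds ratio scale; the numbers are illustrative and are not the magnesium data analysed in the paper.

```python
import numpy as np

# Illustrative early meta-analytic estimate on the log odds ratio scale:
# a seemingly large benefit with modest precision (not the actual ISIS-4 data).
theta_hat, se = np.log(0.45), 0.25

# Sceptical prior: centred on "no effect", with small prior probability
# of an effect as large as the one observed.
prior_mean, prior_sd = 0.0, 0.15

# Conjugate normal update: precision-weighted average of prior and data.
w_prior, w_data = 1 / prior_sd**2, 1 / se**2
post_mean = (w_prior * prior_mean + w_data * theta_hat) / (w_prior + w_data)
post_sd = np.sqrt(1 / (w_prior + w_data))

print("Posterior odds ratio:", round(np.exp(post_mean), 2))
print("95% interval:", np.round(np.exp([post_mean - 1.96 * post_sd,
                                        post_mean + 1.96 * post_sd]), 2))
```

With these illustrative numbers the observed odds ratio of 0.45 is pulled back toward 1, giving a posterior odds ratio of roughly 0.8 with an interval that includes no effect, which is the tempering effect the abstract describes.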

Journal ArticleDOI
TL;DR: In this article, an approximation based on the Laplace approximation method developed by Tierney and Kadane [1] and a bivariate prior density for the two unknown parameters, suggested by Al-Hussaini and Jaheen [2], are used for obtaining the Bayes estimates.
Abstract: Maximum likelihood and Bayes estimates for the two parameters and the reliability function of the Burr Type XII distribution are obtained based on progressive Type II censored samples. An approximation based on the Laplace approximation method developed by Tierney and Kadane [1] and a bivariate prior density for the two unknown parameters, suggested by Al-Hussaini and Jaheen [2], are used for obtaining the Bayes estimates. These estimates are compared via a Monte Carlo simulation study.

Journal ArticleDOI
TL;DR: Rule-based systems for the prediction of the occurrence of disease can be evaluated in a number of different ways, and Bayes's Theorem can be a useful tool to examine how a disease forecast affects the probability of occurrence.
Abstract: Rule-based systems for the prediction of the occurrence of disease can be evaluated in a number of different ways. One way is to examine the probability of disease occurrence before and after using the predictor. Bayes's Theorem can be a useful tool to examine how a disease forecast (either positive or negative) affects the probability of occurrence, and simple analyses can be conducted without knowing the risk preferences of the targeted decision makers. Likelihood ratios can be calculated from the sensitivity and specificity of the forecast, and provide convenient summaries of the forecast performance. They can also be used in a simpler form of Bayes's Theorem. For diseases where little or no prior information on occurrence is available, most forecasts will be useful in that they will increase or decrease the probability of disease occurrence. For extremely common or extremely rare diseases, likelihood ratios may not be sufficiently large or small to substantially affect the probability of disease occurrence or make any difference to the actions taken by the decision maker.
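
The pre-test to post-test calculation described above can be written in a few lines: convert the prior probability of disease occurrence to odds, multiply by the likelihood ratio implied by the forecast's sensitivity and specificity, and convert back to a probability. The sensitivity and specificity values used here are hypothetical.

```python
def post_test_probability(prior_prob, sensitivity, specificity, forecast_positive=True):
    """Update a disease-occurrence probability with a forecast via Bayes's theorem,
    expressed through likelihood ratios (post-test odds = pre-test odds * LR)."""
    lr = (sensitivity / (1 - specificity) if forecast_positive
          else (1 - sensitivity) / specificity)
    prior_odds = prior_prob / (1 - prior_prob)
    post_odds = prior_odds * lr
    return post_odds / (1 + post_odds)

# Hypothetical forecaster with sensitivity 0.85 and specificity 0.80.
for prior in (0.01, 0.20, 0.60):
    pos = post_test_probability(prior, 0.85, 0.80, True)
    neg = post_test_probability(prior, 0.85, 0.80, False)
    print(f"prior {prior:.2f}: after positive forecast {pos:.2f}, "
          f"after negative forecast {neg:.2f}")
```

For a very rare disease (prior 0.01), even a positive forecast with these operating characteristics raises the probability only to about 0.04, which illustrates the abstract's point that likelihood ratios may be too small to change a decision maker's actions at extreme prior probabilities.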

Journal ArticleDOI
TL;DR: A formal framework for analysing how the statistics of natural stimuli and the process of natural selection interact to determine the design of perceptual systems is proposed, and it is suggested that the Bayesian approach is appropriate not only for the study of perceptual systems but also for the study of many other systems in biology.
Abstract: In recent years, there has been much interest in characterizing statistical properties of natural stimuli in order to better understand the design of perceptual systems. A fruitful approach has been to compare the processing of natural stimuli in real perceptual systems with that of ideal observers derived within the framework of Bayesian statistical decision theory. While this form of optimization theory has provided a deeper understanding of the information contained in natural stimuli as well as of the computational principles employed in perceptual systems, it does not directly consider the process of natural selection, which is ultimately responsible for design. Here we propose a formal framework for analysing how the statistics of natural stimuli and the process of natural selection interact to determine the design of perceptual systems. The framework consists of two complementary components. The first is a maximum fitness ideal observer, a standard Bayesian ideal observer with a utility function appropriate for natural selection. The second component is a formal version of natural selection based upon Bayesian statistical decision theory. Maximum fitness ideal observers and Bayesian natural selection are demonstrated in several examples. We suggest that the Bayesian approach is appropriate not only for the study of perceptual systems but also for the study of many other systems in biology.

Journal ArticleDOI
12 Sep 2002
TL;DR: The analysis gives the best invariant and indeed minimax procedure for predictive density estimation by directly verifying extended Bayes properties or by general aspects of decision theory on groups which are shown to simplify in the case of Kullback-Leibler loss.
Abstract: For location and scale families of distributions and related settings of linear regression, we determine minimax procedures for predictive density estimation, for universal data compression, and for the minimum description length (MDL) criterion for model selection. The analysis gives the best invariant and indeed minimax procedure for predictive density estimation by directly verifying extended Bayes properties or, alternatively, by general aspects of decision theory on groups which are shown to simplify in the case of Kullback-Leibler loss. An exact minimax rule is generalized Bayes using a uniform (Lebesgue measure) prior on the location and log-scale parameters, which is made proper by conditioning on an initial set of observations.

Proceedings Article
07 Aug 2002
TL;DR: This paper shows that active learning methods are often suboptimal and presents a tractable method for incorporating knowledge of the budget in the information acquisition process, and compares methods for sequentially choosing which feature value to purchase next.
Abstract: There is almost always a cost associated with acquiring training data. We consider the situation where the learner, with a fixed budget, may 'purchase' data during training. In particular, we examine the case where observing the value of a feature of a training example has an associated cost, and the total cost of all feature values acquired during training must remain less than this fixed budget. This paper compares methods for sequentially choosing which feature value to purchase next, given the budget and user's current knowledge of Naive Bayes model parameters. Whereas active learning has traditionally focused on myopic (greedy) approaches and uniform/round-robin policies for query selection, this paper shows that such methods are often suboptimal and presents a tractable method for incorporating knowledge of the budget in the information acquisition process.

Patent
22 May 2002
TL;DR: In this article, a method and apparatus that provides an expert system for determining respiratory phase during ventilatory support of a subject is described. Butler et al. used Bayes' theorem to determine phase probabilities for each state using a prior probability function and observed probability function.
Abstract: A method and apparatus that provides an expert system for determining respiratory phase during ventilatory support of a subject. Discrete phase states are partitioned and prior probability functions and observed probability functions for each state are defined. The probability functions are based upon relative duration of each state as well as the flow characteristics of each state. These functions are combined to determine phase probabilities for each state using Bayes' theorem. The calculated probabilities for the states may then be compared to determine which state the subject is experiencing. A ventilator may then conform respiratory support in accordance with the most probable phase. To provide a learning feature, the probability functions may be adjusted during use to provide a more subject specific response that accounts for changing respiratory characteristics.
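
A schematic sketch of the probability combination the patent describes: prior probabilities reflecting the relative duration of each discrete phase state are combined with observed probabilities based on flow characteristics via Bayes' theorem, and the most probable state is selected. The state names, priors, and observed probabilities below are invented placeholders, not values from the patent.

```python
import numpy as np

# Hypothetical discrete phase states and their prior probabilities,
# reflecting the relative duration of each state in the respiratory cycle.
states = ["early inspiration", "late inspiration", "early expiration", "late expiration"]
prior = np.array([0.20, 0.30, 0.20, 0.30])

# Hypothetical observed probabilities of the current flow measurement
# under each state, p(flow | state), from the flow characteristics of each state.
observed = np.array([0.60, 0.25, 0.05, 0.10])

# Bayes' theorem: p(state | flow) is proportional to p(state) * p(flow | state).
posterior = prior * observed
posterior /= posterior.sum()

most_probable = states[int(np.argmax(posterior))]
print(dict(zip(states, np.round(posterior, 3))))
print("Ventilatory support conformed to most probable phase:", most_probable)
```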

Journal ArticleDOI
TL;DR: In this article, an approach for studying the sensitivity of the Bayes factor to the prior distributions for the parameters in the models being compared is presented. But this approach is only useful for nested models and it has a graphical flavor making it more attractive than other common approaches to sensitivity analysis for Bayes factors.
Abstract: The Bayes factor is a Bayesian statistician's tool for model selection. Bayes factors can be highly sensitive to the prior distributions used for the parameters of the models under consideration. We discuss an approach for studying the sensitivity of the Bayes factor to the prior distributions for the parameters in the models being compared. The approach is found to be extremely useful for nested models; it has a graphical flavor making it more attractive than other common approaches to sensitivity analysis for Bayes factors.
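
A small numerical illustration of the kind of sensitivity the paper addresses (not the authors' own graphical method): for a point null against a normal prior on the effect, the Bayes factor changes substantially as the prior scale tau varies, including the familiar drift toward the null for very diffuse priors. The data summary below is hypothetical.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical data summary: observed effect estimate and its standard error.
x_bar, se = 0.30, 0.12

# H0: theta = 0.  H1: theta ~ Normal(0, tau^2).
# The marginal likelihood under H1 is Normal(x_bar; 0, se^2 + tau^2).
def bayes_factor_01(x_bar, se, tau):
    m0 = norm.pdf(x_bar, loc=0.0, scale=se)                       # p(data | H0)
    m1 = norm.pdf(x_bar, loc=0.0, scale=np.sqrt(se**2 + tau**2))  # p(data | H1)
    return m0 / m1

for tau in (0.05, 0.2, 1.0, 5.0):
    print(f"prior sd tau = {tau:>4}: BF01 = {bayes_factor_01(x_bar, se, tau):.3f}")
```

With these numbers the Bayes factor in favour of the null ranges from well below 1 at moderate prior scales to nearly 2 at tau = 5, showing how conclusions can flip with the prior and why a systematic sensitivity analysis is worthwhile.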

Journal ArticleDOI
TL;DR: This paper proposes some simple Bayesian networks for standard analysis of patterns of inference concerning scientific evidence, with a discussion of the rationale behind the nets, the corresponding probabilistic formulas, and the required probability assessments.