
Showing papers on "Bayes' theorem" published in 1998


Journal ArticleDOI
TL;DR: The problem of updating a structural model and its associated uncertainties by utilizing dynamic response data is addressed using a Bayesian statistical framework that can handle the inherent ill-conditioning and possible nonuniqueness in model updating applications.
Abstract: The problem of updating a structural model and its associated uncertainties by utilizing dynamic response data is addressed using a Bayesian statistical framework that can handle the inherent ill-conditioning and possible nonuniqueness in model updating applications. The objective is not only to give more accurate response predictions for prescribed dynamic loadings but also to provide a quantitative assessment of this accuracy. In the methodology presented, the updated (optimal) models within a chosen class of structural models are the most probable based on the structural data if all the models are equally plausible a priori. The prediction accuracy of the optimal structural models is given by also updating probability models for the prediction error. The precision of the parameter estimates of the optimal structural models, as well as the precision of the optimal prediction-error parameters, can be examined. A large-sample asymptotic expression is given for the updated predictive probability distribution of the uncertain structural response, which is a weighted average of the prediction probability distributions for each optimal model. This predictive distribution can be used to make model predictions despite possible nonuniqueness in the optimal models.
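
To make the construction concrete, here is a minimal, hypothetical sketch (not the authors' implementation): a one-degree-of-freedom oscillator whose stiffness is updated from noisy natural-frequency measurements over a grid of candidate models, with the predictive distribution formed as the posterior-weighted average described above. The mass, measurement values and prediction-error standard deviation are all assumed for illustration.

```python
import numpy as np

# Toy setup (assumed values): one-DOF oscillator, unknown stiffness k, known mass m.
m = 10.0                                   # kg
f_obs = np.array([4.95, 5.10, 5.02])       # noisy measured natural frequencies (Hz)
sigma_e = 0.1                              # prediction-error std dev (fixed here; also updated in the paper)

# Discretize the model class: a grid of candidate stiffness values with a flat prior
# ("all models equally plausible a priori").
k_grid = np.linspace(5e3, 15e3, 2001)      # N/m
f_model = np.sqrt(k_grid / m) / (2 * np.pi)

# Log-likelihood of the data for each candidate model (Gaussian prediction error)
loglik = -0.5 * np.sum((f_obs[:, None] - f_model[None, :])**2, axis=0) / sigma_e**2
post = np.exp(loglik - loglik.max())
post /= post.sum()                         # posterior over the model class

# Most probable (optimal) model and posterior-weighted predictive distribution
k_map = k_grid[np.argmax(post)]
f_pred_mean = np.sum(post * f_model)       # predictive mean of the natural frequency
f_pred_sd = np.sqrt(np.sum(post * (f_model - f_pred_mean)**2) + sigma_e**2)
print(f"MAP stiffness: {k_map:.0f} N/m, predictive frequency: {f_pred_mean:.2f} +/- {f_pred_sd:.2f} Hz")
```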

1,235 citations


Journal ArticleDOI
Brani Vidakovic
TL;DR: A wavelet shrinkage by coherent Bayesian inference in the wavelet domain is proposed and the methods are tested on standard Donoho-Johnstone test functions.
Abstract: Wavelet shrinkage, the method proposed in the seminal work of Donoho and Johnstone, is a disarmingly simple and efficient way of denoising data. Shrinking wavelet coefficients has been proposed under several optimality criteria. In this article a wavelet shrinkage by coherent Bayesian inference in the wavelet domain is proposed. The methods are tested on standard Donoho-Johnstone test functions.
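
As a rough illustration of the idea (not the coherent Bayes rules developed in the article), the sketch below applies a posterior-mean shrinkage rule under an assumed zero-mean Gaussian prior to one level of Haar detail coefficients; the test signal, noise level and empirical-Bayes prior variance are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy version of a simple blocky test signal (in the spirit of the Donoho-Johnstone
# test functions; the exact signal used here is just an assumption).
n = 256
signal = np.where(np.arange(n) < n // 2, 0.0, 4.0)
noise_sd = 1.0
y = signal + rng.normal(0, noise_sd, n)

# One-level Haar transform
approx = (y[0::2] + y[1::2]) / np.sqrt(2)
detail = (y[0::2] - y[1::2]) / np.sqrt(2)

# Bayesian shrinkage of detail coefficients: model d = theta + eps, eps ~ N(0, sigma^2),
# with a zero-mean Gaussian prior theta ~ N(0, tau^2). The posterior mean is a linear
# shrinkage rule (a simple stand-in for the coherent Bayes rules in the paper).
tau2 = max(detail.var() - noise_sd**2, 1e-8)      # crude empirical-Bayes prior variance
shrunk = detail * tau2 / (tau2 + noise_sd**2)

# Inverse one-level Haar transform
denoised = np.empty(n)
denoised[0::2] = (approx + shrunk) / np.sqrt(2)
denoised[1::2] = (approx - shrunk) / np.sqrt(2)

print("RMSE noisy   :", np.sqrt(np.mean((y - signal)**2)))
print("RMSE denoised:", np.sqrt(np.mean((denoised - signal)**2)))
```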

320 citations


Journal ArticleDOI
TL;DR: It is shown that the posterior probabilities derived from Bayes theorem are part of the Analytic Hierarchy Process (AHP) framework, and hence thatBayes theorem is a sufficient condition of a solution in the sense of the AHP.
Abstract: Judgments are needed in medical diagnosis to determine what tests to perform given certain symptoms. For many diseases, what information to gather on symptoms and which combinations of symptoms lead to a given disease are not well known. Even when the number of symptoms is small, the required number of experiments to generate adequate statistical data can be unmanageably large. There is a need in diagnosis for an integrative model that incorporates both statistical data and expert judgment. When statistical data are present but no expert judgment is available, one property of this model should be to reproduce results obtained through time-honored procedures such as Bayes theorem. When expert judgment is also present, it should be possible to combine judgment with statistical data to identify the disease that best describes the observed symptoms. Here we are interested in the Analytic Hierarchy Process (AHP) framework that deals with dependence among the elements or clusters of a decision structure to combine statistical and judgmental information. It is shown that the posterior probabilities derived from Bayes theorem are part of this framework, and hence that Bayes theorem is a sufficient condition of a solution in the sense of the AHP. An illustration is given as to how a purely judgment-based model in the AHP can be used in medical diagnosis. The application of the model to a case study demonstrates that both statistics and judgment can be combined to provide diagnostic support to medical practitioner colleagues with whom we have interacted in doing this work.
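
A minimal sketch of the statistical side of the problem: Bayes' theorem applied to a handful of hypothetical diseases and an observed symptom pattern (all numbers invented).

```python
import numpy as np

# Hypothetical diseases, priors (prevalence), and likelihoods of an observed symptom
# pattern under each disease; the numbers are made up for illustration only.
diseases = ["flu", "strep", "mononucleosis"]
prior = np.array([0.60, 0.30, 0.10])          # P(disease)
likelihood = np.array([0.20, 0.70, 0.40])     # P(observed symptoms | disease)

# Bayes' theorem: posterior proportional to prior times likelihood
posterior = prior * likelihood
posterior /= posterior.sum()

for d, p in zip(diseases, posterior):
    print(f"P({d} | symptoms) = {p:.3f}")
```

The paper's point is that these same posteriors can be recovered as priorities within the AHP framework, which can then also absorb expert judgments when frequency data are missing.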

288 citations


Journal ArticleDOI
TL;DR: This paper reviews statistical methods developed to estimate the sensitivity and specificity of screening or diagnostic tests when the fallible tests are not evaluated against a gold standard.
Abstract: This paper reviews statistical methods developed to estimate the sensitivity and specificity of screening or diagnostic tests when the fallible tests are not evaluated against a gold standard. It gives a brief summary of the earlier historical developments and focuses on the more recent methods. It covers Bayesian approaches and longitudinal studies with repeated testing. In particular, it reviews the procedures that do not require the assumption of independence between tests conditional on the true disease status.

241 citations


Journal ArticleDOI
TL;DR: In this paper, a case study of the application of the Bayesian strategy to inversion of surface seismic field data is presented, in which an a priori probability density on the space of models is combined, via Bayes theorem, with the data misfit function into a final a posteriori probability density reflecting both data fit and model reasonableness.
Abstract: The goal of geophysical inversion is to make quantitative inferences about the Earth from remote observations. Because the observations are finite in number and subject to uncertainty, these inferences are inherently probabilistic. A key step is to define what it means for an Earth model to fit the data. This requires estimation of the uncertainties in the data, both those due to random noise and those due to theoretical errors. But the set of models that fit the data usually contains unrealistic models; i.e., models that violate our a priori prejudices, other data, or theoretical considerations. One strategy for eliminating such unreasonable models is to define an a priori probability density on the space of models, then use Bayes theorem to combine this probability with the data misfit function into a final a posteriori probability density reflecting both data fit and model reasonableness. We show here a case study of the application of the Bayesian strategy to inversion of surface seismic field data. Assuming that all uncertainties can be described by multidimensional Gaussian probability densities, we incorporate into the calculation information about ambient noise, discretization errors, theoretical errors, and a priori information about the set of layered Earth models derived from in situ petrophysical measurements. The result is a probability density on the space of models that takes into account all of this information. Inferences on model parameters can be derived by integration of this function. We begin by estimating the parameters of the Gaussian probability densities assumed to describe the data and model uncertainties. These are combined via Bayes theorem. The a posteriori probability is then optimized via a nonlinear conjugate gradient procedure to find the maximum a posteriori model. Uncertainty analysis is performed by making a Gaussian approximation of the a posteriori distribution about this peak model. We present the results of this analysis in three different forms: the maximum a posteriori model bracketed by one standard deviation error bars, pseudo-random simulations of the a posteriori probability (showing the range of typical subsurface models), and marginals of this probability at selected depths in the subsurface. The models we compute are consistent both with the surface seismic data and the borehole measurements, even though the latter are well below the resolution of the former. We also contrast the Bayesian maximum a posteriori model with the Occam model, which is the smoothest model that fits the surface seismic data alone.
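
For intuition, here is a sketch of the linear-Gaussian special case of this construction, where the maximum a posteriori model and its Gaussian uncertainty are available in closed form; the field problem in the paper is nonlinear and solved by conjugate gradients, and the operator, covariances and sizes below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy linear forward problem d = G m + noise.
n_data, n_model = 40, 20
G = rng.normal(size=(n_data, n_model))
m_true = rng.normal(size=n_model)
Cd = 0.05 * np.eye(n_data)                     # data (noise + theory) covariance
Cm = 1.0 * np.eye(n_model)                     # a priori model covariance
m_prior = np.zeros(n_model)                    # a priori model (e.g., from well logs)
d = G @ m_true + rng.multivariate_normal(np.zeros(n_data), Cd)

# Posterior is Gaussian: maximize -0.5*(d-Gm)' Cd^-1 (d-Gm) - 0.5*(m-m_prior)' Cm^-1 (m-m_prior)
Cd_inv, Cm_inv = np.linalg.inv(Cd), np.linalg.inv(Cm)
A = G.T @ Cd_inv @ G + Cm_inv                  # inverse posterior covariance
b = G.T @ Cd_inv @ d + Cm_inv @ m_prior
m_map = np.linalg.solve(A, b)                  # maximum a posteriori model
C_post = np.linalg.inv(A)                      # Gaussian approximation of uncertainty

one_sigma = np.sqrt(np.diag(C_post))           # "error bars" on the MAP model
samples = rng.multivariate_normal(m_map, C_post, size=5)   # pseudo-random simulations
print("MAP model (first 5):", np.round(m_map[:5], 2))
print("1-sigma   (first 5):", np.round(one_sigma[:5], 2))
```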

211 citations


Journal ArticleDOI

202 citations


Journal ArticleDOI
TL;DR: Likelihood ratios (LRs) combine the stability of sensitivity and specificity to provide an omnibus index of test performance far more useful than its constituent parts.
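
A small worked example of the idea, with hypothetical test characteristics and pre-test probability.

```python
# Likelihood ratios from sensitivity and specificity, and their use via Bayes' theorem
# (pre-test probability and test characteristics below are hypothetical).
sens, spec = 0.90, 0.80
lr_pos = sens / (1 - spec)          # LR for a positive result
lr_neg = (1 - sens) / spec          # LR for a negative result

pretest_prob = 0.20
pretest_odds = pretest_prob / (1 - pretest_prob)

posttest_odds = pretest_odds * lr_pos               # Bayes' theorem in odds form
posttest_prob = posttest_odds / (1 + posttest_odds)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, post-test probability = {posttest_prob:.2f}")
```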

197 citations


Journal ArticleDOI
TL;DR: In this paper, the hierarchical Bayes procedure is implemented via Markov chain Monte Carlo integration techniques for a unified analysis of both discrete and continuous data and a general theorem is provided that ensures the propriety of posteriors under diffuse priors.
Abstract: Bayesian methods have been used quite extensively in recent years for solving small-area estimation problems. Particularly effective in this regard has been the hierarchical or empirical Bayes approach, which is especially suitable for a systematic connection of local areas through models. However, the development to date has mainly concentrated on continuous-valued variates. Often the survey data are discrete or categorical, so that hierarchical or empirical Bayes techniques designed for continuous variates are inappropriate. This article considers hierarchical Bayes generalized linear models for a unified analysis of both discrete and continuous data. A general theorem is provided that ensures the propriety of posteriors under diffuse priors. This result is then extended to the case of spatial generalized linear models. The hierarchical Bayes procedure is implemented via Markov chain Monte Carlo integration techniques. Two examples (one featuring spatial correlation structure) are given to illustrate the methodology.
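
A heavily reduced sketch of the machinery (no covariates, no spatial structure): a Poisson-lognormal small-area model fitted by Metropolis-within-Gibbs, with an assumed inverse-gamma prior on the random-effect variance. All data and tuning constants are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated small-area count data: y_i ~ Poisson(E_i * exp(u_i)), u_i ~ N(0, sigma^2).
n_area = 30
E = rng.uniform(5, 50, n_area)                  # expected counts
u_true = rng.normal(0, 0.4, n_area)
y = rng.poisson(E * np.exp(u_true))

# Metropolis-within-Gibbs: random-walk updates for each u_i, conjugate update for sigma^2
# with an inverse-gamma(a0, b0) prior (a diffuse but proper choice).
a0, b0 = 0.01, 0.01
u = np.zeros(n_area)
sigma2 = 1.0
n_iter, burn = 5000, 1000
u_draws = np.zeros((n_iter, n_area))

def log_target(ui, yi, Ei, s2):
    # log p(y_i | u_i) + log p(u_i | sigma^2), up to constants
    return yi * ui - Ei * np.exp(ui) - 0.5 * ui**2 / s2

for t in range(n_iter):
    for i in range(n_area):
        prop = u[i] + rng.normal(0, 0.3)
        if np.log(rng.uniform()) < log_target(prop, y[i], E[i], sigma2) - log_target(u[i], y[i], E[i], sigma2):
            u[i] = prop
    # sigma^2 | u ~ InvGamma(a0 + n/2, b0 + sum(u^2)/2)
    sigma2 = 1.0 / rng.gamma(a0 + n_area / 2, 1.0 / (b0 + 0.5 * np.sum(u**2)))
    u_draws[t] = u

post_rr = np.exp(u_draws[burn:]).mean(axis=0)    # posterior mean relative risks
print("raw SMR     (first 5):", np.round(y[:5] / E[:5], 2))
print("HB smoothed (first 5):", np.round(post_rr[:5], 2))
```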

190 citations


Posted Content
TL;DR: In this article, the mean of the coefficients in a dynamic panel data model when the coefficients are assumed to be randomly distributed across cross-sectional units is estimated using Markov chain Monte Carlo methods, and the performance of the Bayes estimator for the short run coefficients in dynamic panels is compared against alternative estimators using both simulated and real data.
Abstract: This study is concerned with estimating the mean of the coefficients in a dynamic panel data model when the coefficients are assumed to be randomly distributed across cross-sectional units. The authors suggest a Bayes approach to the estimation of such models using Markov chain Monte Carlo methods. They establish the asymptotic equivalence of the Bayes estimator and the mean group estimator proposed by Pesaran and Smith (1995), and show that the Bayes estimator is asymptotically normal for large N (the number of units) and large T (the number of time periods) so long as √N/T → 0 as both N → ∞ and T → ∞. The performance of the Bayes estimator for the short-run coefficients in dynamic panels is also compared against alternative estimators using both simulated and real data. The Monte Carlo results show that the Bayes estimator has better sampling properties than other estimators for both small and moderate T samples. The analysis of Tobin's q model yields new results.

187 citations


Journal ArticleDOI
TL;DR: This paper suggests several item selection criteria for adaptive testing which are all based on the use of the true posterior, and some of the statistical properties of the ability estimator produced by these criteria are discussed and empirically characterized.
Abstract: Owen (1975) proposed an approximate empirical Bayes procedure for item selection in computerized adaptive testing (CAT). The procedure replaces the true posterior by a normal approximation with closed-form expressions for its first two moments. This approximation was necessary to minimize the computational complexity involved in a fully Bayesian approach but is no longer necessary given the computational power currently available for adaptive testing. This paper suggests several item selection criteria for adaptive testing which are all based on the use of the true posterior. Some of the statistical properties of the ability estimator produced by these criteria are discussed and empirically characterized.
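
A sketch of one criterion of this general type: posterior-weighted Fisher information computed from the true (grid-based) posterior under a 2PL model. The item bank, the specific criterion and all constants are assumptions rather than the paper's exact proposals.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical 2PL item bank: discrimination a_j and difficulty b_j for each item.
n_items = 200
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)

theta_grid = np.linspace(-4, 4, 161)
prior = np.exp(-0.5 * theta_grid**2)              # N(0,1) prior on ability
prior /= prior.sum()

# Response probabilities on the grid and 2PL item information (grid x items)
P = 1.0 / (1.0 + np.exp(-np.outer(theta_grid, a) + a * b))
info = a**2 * P * (1 - P)

def p_correct(theta, j):
    return 1.0 / (1.0 + np.exp(-a[j] * (theta - b[j])))

theta_true = 0.8
posterior = prior.copy()
used = np.zeros(n_items, bool)

for step in range(10):
    # Select the item maximizing the posterior-weighted information
    crit = posterior @ info
    crit[used] = -np.inf
    j = int(np.argmax(crit))
    used[j] = True

    # Simulate the examinee's response and update the true posterior by Bayes' theorem
    resp = rng.uniform() < p_correct(theta_true, j)
    like = p_correct(theta_grid, j) if resp else 1 - p_correct(theta_grid, j)
    posterior *= like
    posterior /= posterior.sum()

eap = float(posterior @ theta_grid)               # expected a posteriori ability estimate
print(f"EAP estimate after 10 items: {eap:.2f} (true ability {theta_true})")
```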

167 citations


Journal ArticleDOI
TL;DR: Three candidate approaches are evaluated and compared: the posterior means, the constrained Bayes estimates of Louis and Ghosh, and a new approach that optimizes estimation of the histogram and the ranks and is supported by mathematical and simulation‐based analyses.
Abstract: The beauty of the Bayesian approach is its ability to structure complicated models, inferential goals and analyses. To take full advantage of it, methods should be linked to an inferential goal via a loss function. For example, in the two-stage, compound sampling model the posterior means are optimal under squared error loss. However, they can perform poorly in estimating the histogram of the parameters or in ranking them. ‘Triple-goal’ estimates are motivated by the desire to have a set of estimates that produce good ranks, a good parameter histogram and good co-ordinate-specific estimates. No set of estimates can simultaneously optimize these three goals and we seek a set that strikes an effective trade-off. We evaluate and compare three candidate approaches: the posterior means, the constrained Bayes estimates of Louis and Ghosh, and a new approach that optimizes estimation of the histogram and the ranks. Mathematical and simulation-based analyses support the superiority of the new approach and document its excellent performance for the three inferential goals.
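
The following sketch is one reading of the triple-goal idea in a conjugate normal-normal model: estimate the ensemble histogram from the posterior-mean EDF, estimate ranks from posterior mean ranks, and assign coordinate estimates accordingly. Treat the details as assumptions rather than the authors' exact estimator.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two-stage normal-normal compound model: theta_k ~ N(0, tau^2), y_k | theta_k ~ N(theta_k, 1)
K, tau2 = 50, 4.0
theta = rng.normal(0, np.sqrt(tau2), K)
y = theta + rng.normal(0, 1, K)

# Conjugate posteriors: theta_k | y_k ~ N(B*y_k, B) with B = tau2/(tau2+1)
B = tau2 / (tau2 + 1.0)
pm = B * y                                              # posterior means (optimal for SEL, over-shrunk as an ensemble)
samples = rng.normal(pm, np.sqrt(B), size=(4000, K))    # posterior draws, one column per unit

# Triple-goal-style estimates, sketched from the description above:
# 1) estimate the ensemble histogram by the posterior-mean EDF (pooled draws),
# 2) estimate the ranks by the posterior mean ranks,
# 3) read coordinate estimates off the estimated histogram at the estimated ranks.
pooled = np.sort(samples.ravel())
Ghat = np.quantile(pooled, (2 * np.arange(1, K + 1) - 1) / (2 * K))
mean_rank = samples.argsort(axis=1).argsort(axis=1).mean(axis=0)   # average rank of each unit
final_rank = mean_rank.argsort().argsort()                          # integer ranks 0..K-1
gr = Ghat[final_rank]

print("ensemble sd: truth %.2f  posterior means %.2f  triple-goal %.2f"
      % (theta.std(), pm.std(), gr.std()))
```

The printout illustrates the point of the abstract: posterior means compress the ensemble spread, while the histogram-and-ranks construction restores it.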

Journal ArticleDOI
TL;DR: In this paper, a limiting procedure that provides a solid justification for the use of Bayes factor with intrinsic priors for model comparison is presented for nested and non-nested models.
Abstract: Improper priors typically arise in default Bayesian estimation problems. In the Bayesian approach to model selection or hypothesis testing, the main tool is the Bayes factor. When improper priors for the parameters appearing in the models are used, the Bayes factor is not well defined. The intrinsic Bayes factor introduced by Berger and Pericchi is an interesting method for overcoming that difficulty. That method is of particular interest as a means for generating proper prior distributions (intrinsic priors) for model comparison from the improper priors typically used in estimation. The goal of this article is to develop a limiting procedure that provides a solid justification for the use of Bayes factor with intrinsic priors. The procedure is formalized and discussed for nested and nonnested models. Illustrations and comparisons with other approximations to Bayes factors, such as the Bayesian information criterion of Schwarz and the fractional Bayes factor of O'Hagan are provided.
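
As a point of reference for the comparisons mentioned above, here is the Schwarz (BIC) approximation to a Bayes factor for a toy normal-mean problem; the data and hypotheses are invented, and the intrinsic Bayes factor itself is not reproduced here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)

# Testing H0: mu = 0 versus H1: mu unknown for a normal sample with known sigma = 1.
# With improper priors the Bayes factor is ill-defined; the Schwarz (BIC) criterion,
# one of the approximations the article compares against, sidesteps the prior entirely.
n = 50
x = rng.normal(0.3, 1.0, n)

loglik0 = norm.logpdf(x, 0.0, 1.0).sum()           # no free parameters
loglik1 = norm.logpdf(x, x.mean(), 1.0).sum()      # one free parameter (the MLE of mu)

bic0 = -2 * loglik0
bic1 = -2 * loglik1 + 1 * np.log(n)
bf10_bic = np.exp((bic0 - bic1) / 2)               # approximate Bayes factor for H1 vs H0
print(f"BIC-approximate Bayes factor B10 = {bf10_bic:.2f}")
```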

Journal ArticleDOI
TL;DR: New statistical methods recently developed for the analysis of maps of disease rates when the geographic units have small populations at risk adopt the Bayesian approach and use intensive computational methods for estimating risk in each area.
Abstract: This article presents statistical methods recently developed for the analysis of maps of disease rates when the geographic units have small populations at risk. They adopt the Bayesian approach and use intensive computational methods for estimating risk in each area. The objective of the methods is to separate the variability of rates due to differences between regions from the background risk due to pure random fluctuation. Risk estimates have a total mean quadratic error smaller than usual estimates. We apply these new methods to estimate infant mortality risk in the municipalities of the State of Minas Gerais in 1994.
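
The basic shrinkage effect can be illustrated with a much simpler Poisson-Gamma model than the ones reviewed in the article; the counts, expected counts and prior below are all hypothetical.

```python
import numpy as np

# Hypothetical infant deaths and expected deaths (reference rate x births) for a few
# small municipalities; the real analysis covers Minas Gerais in 1994.
deaths   = np.array([0, 3, 1, 12, 7])
expected = np.array([0.8, 1.1, 2.0, 9.5, 7.2])

# Simple Poisson-Gamma model: relative risk r_i ~ Gamma(alpha, beta), deaths_i ~ Poisson(expected_i * r_i).
# Posterior mean risk = (deaths_i + alpha) / (expected_i + beta): small areas are pulled
# toward the overall level, which is the variance-reduction effect described above.
alpha, beta = 2.0, 2.0            # assumed prior (could be estimated by empirical Bayes)
raw_smr = deaths / expected
smoothed = (deaths + alpha) / (expected + beta)

for i, (r, s) in enumerate(zip(raw_smr, smoothed)):
    print(f"area {i}: raw SMR = {r:.2f}, smoothed risk = {s:.2f}")
```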

Journal ArticleDOI
TL;DR: The advantage of the common "Bayesian" interpretation of conventional confidence intervals is that it permits intuitive inferential statements to be made that cannot be made within a conventional framework and this can help to ensure that logical decisions are taken on the basis of study results.
Abstract: OBJECTIVES: To take the common "Bayesian" interpretation of conventional confidence intervals to its logical conclusion, and hence to derive a simple, intuitive way to interpret the results of public health and clinical studies. DESIGN AND SETTING: The theoretical basis and practicalities of the approach advocated is at first explained and then its use is illustrated by referring to the interpretation of a real historical cohort study. The study considered compared survival on haemodialysis (HD) with that on continuous ambulatory peritoneal dialysis (CAPD) in 389 patients dialysed for end stage renal disease in Leicestershire between 1974 and 1985. Careful interpretation of the study was essential. This was because although it had relatively low statistical power, it represented all of the data that were available at the time and it had to inform a critical clinical policy decision: whether or not to continue putting the majority of new patients onto CAPD. MEASUREMENTS AND ANALYSIS: Conventional confidence intervals are often interpreted using subjective probability. For example, 95% confidence intervals are commonly understood to represent a range of values within which one may be 95% certain that the true value of whatever one is estimating really lies. Such an interpretation is fundamentally incorrect within the framework of conventional, frequency-based, statistics. However, it is valid as a statement of Bayesian posterior probability, provided that the prior distribution that represents pre-existing beliefs is uniform, which means flat, on the scale of the main outcome variable. This means that there is a limited equivalence between conventional and Bayesian statistics, which can be used to draw simple Bayesian style statistical inferences from a standard analysis. The advantage of such an approach is that it permits intuitive inferential statements to be made that cannot be made within a conventional framework and this can help to ensure that logical decisions are taken on the basis of study results. In the particular practical example described, this approach is applied in the context of an analysis based upon proportional hazards (Cox) regression. MAIN RESULTS AND CONCLUSIONS: The approach proposed expresses conclusions in a manner that is believed to be a helpful adjunct to more conventional inferential statements. It is of greatest value in those situations in which statistical significance may bear little relation to clinical significance and a conventional analysis using p values is liable to be misleading. Perhaps most importantly, this includes circumstances in which an important public health or clinical decision must be based upon a study that has unavoidably low statistical power. However, it is also useful in situations in which a decision must be based upon a large study that indicates that an effect that is highly statistically significant seems too small to be of practical relevance. In the illustrative example described, the approach helped in making a decision regarding the use of CAPD in Leicestershire during the latter half of the 1980s.
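
A minimal sketch of the calculation being advocated, assuming a flat prior on the log hazard ratio so that the posterior follows directly from the point estimate and confidence interval; the numbers are invented, not those of the Leicestershire study.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical Cox-regression result: hazard ratio for one treatment vs another, with 95% CI.
hr, lo, hi = 1.30, 0.90, 1.88
log_hr = np.log(hr)
se = (np.log(hi) - np.log(lo)) / (2 * 1.96)   # SE recovered from the CI

# Under a flat prior on log(HR), the posterior is N(log_hr, se^2), so "Bayesian-style"
# statements follow directly from the normal CDF.
p_harm = 1 - norm.cdf(0, loc=log_hr, scale=se)                    # P(HR > 1 | data)
p_big_harm = 1 - norm.cdf(np.log(1.25), loc=log_hr, scale=se)     # P(HR > 1.25 | data)
print(f"P(HR > 1)    = {p_harm:.2f}")
print(f"P(HR > 1.25) = {p_big_harm:.2f}")
```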

Journal ArticleDOI
TL;DR: It is shown that forward sampling can always identify the optimal sequential strategy in the case of a one-parameter exponential family with a conjugate prior and monotone loss functions as well as the best member of a certain class of strategies when backward induction is infeasible.
Abstract: Unlike traditional approaches, Bayesian methods enable formal combination of expert opinion and objective information into interim and final analyses of clinical trial data. However, most previous Bayesian approaches have based the stopping decision on the posterior probability content of one or more regions of the parameter space, thus implicitly determining a loss and decision structure. In this paper, we offer a fully Bayesian approach to this problem, specifying not only the likelihood and prior distributions but appropriate loss functions as well. At each data monitoring point, we enumerate the available decisions and investigate the use of backward induction, implemented via Monte Carlo methods, to choose the optimal course of action. We then present a forward sampling algorithm that substantially eases the analytic and computational burdens associated with backward induction, offering the possibility of fully Bayesian optimal sequential monitoring for previously untenable numbers of interim looks. We show that forward sampling can always identify the optimal sequential strategy in the case of a one-parameter exponential family with a conjugate prior and monotone loss functions as well as the best member of a certain class of strategies when backward induction is infeasible. Finally, we illustrate and compare the forward and backward approaches using data from a recent AIDS clinical trial.

Journal ArticleDOI
TL;DR: Bayes' Theorem has the advantage that, unlike any other method of statistical inference, it gives the answer directly as a probability, and does not need to be hedged about with qualifications, unlike a significance level.
Abstract: We should really like to know, at the end of study, the probability that we have found a linkage, as pointed out by Cedric Smith (1959) >35 years ago. (Elston 1997) Bayes' Theorem has the advantage that, unlike any other method of statistical inference, it gives the answer directly as a probability. … This probability has a direct meaning, and does not need to be hedged about with qualifications, unlike a significance level. (Smith 1959)
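
The calculation Smith was pointing to can be written in a few lines; the prior probability of linkage and the LOD score below are assumed for illustration (a LOD score is the base-10 log of the likelihood ratio for linkage).

```python
# Posterior probability of linkage from a prior probability and a LOD score.
prior = 0.02          # rough prior probability that a random marker is linked (assumed)
lod = 3.0             # observed LOD score

lr = 10 ** lod                                        # likelihood ratio for linkage
posterior = prior * lr / (prior * lr + (1 - prior))   # Bayes' theorem
print(f"P(linkage | data) = {posterior:.3f}")         # about 0.95 for these inputs
```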

Journal ArticleDOI
TL;DR: In this article, the transferable belief model (TBM) is applied to the problem of diagnosing a disease in an environment pervaded by uncertainty, and four examples of the diagnostic process within the TBM are presented.
Abstract: This is a short presentation of the most relevant elements of the transferable belief model and its use for problems related to the diagnostic process. The examples illustrate the use of the transferable belief model and in particular of the Generalized Bayesian Theorem. Uncertainty is classically represented by probability functions, and diagnosis in an environment pervaded by uncertainty is usually handled through the application of the Bayesian theorem, which permits the computation of the posterior probability over the diagnostic categories given the observed data from the prior probability over the same categories. We show here that the whole problem admits a similar solution when uncertainty is quantified by belief functions as in the transferable belief model. The classical Bayesian theorem admits a generalization within the transferable belief model (TBM) that we call the Generalized Bayesian Theorem (Smets, 1978, 1981, 1993a). This theorem seems to have been often overlooked, and the use of conditional belief functions for diagnostic problems neglected. The Generalized Bayesian Theorem (GBT) permits the computation of the conditional belief over the diagnostic classes given an observed datum from the knowledge of the set of conditional beliefs about which data will be observed when the case belongs to a given diagnostic category. Loosely expressed, this inversion theorem permits passing from a belief on the symptoms given the diseases to a belief on the diseases given the symptoms. We present hereafter four examples of the diagnostic process within the TBM, and compare the TBM solution with its obvious contender, the probability solution. The examples are analyzed in detail in order to give a clear understanding of the exact use of the TBM and its GBT. We restrict ourselves to 'simple' examples; cases of complex systems and common or dependent causes are not tackled. Our aim is to show how the classical Bayesian theorem can be extended and applied within the TBM framework.

Journal ArticleDOI
TL;DR: In this article, the authors present direct conditional imprecise probabilities for the number of successes in a finite number of future trials, given information about a limited number of past trials.

Journal ArticleDOI
TL;DR: An application of Bayesian decision theory to the determination of sample size for phase II clinical studies uses the method of backward induction to obtain group sequential designs that are optimal with respect to some specified gain function.
Abstract: This paper describes an application of Bayesian decision theory to the determination of sample size for phase II clinical studies. The approach uses the method of backward induction to obtain group sequential designs that are optimal with respect to some specified gain function. A gain function is proposed focussing on the financial costs of, and potential profits from, the drug development programme. On the basis of this gain function, the optimal procedure is also compared with an alternative Bayesian procedure proposed by Thall and Simon. The latter method, which tightly controls type I error rate, is shown to lead to an expected gain considerably smaller than that from the optimal test. Gain functions with respect to which Thall and Simon's boundary is optimal are sought and it is shown that these can only be of the form considered, that is, with constant cost for phase III study and cost of the phase II study proportional to the sample size, if potential profit increases over time.

Journal ArticleDOI
01 Aug 1998-Genetics
TL;DR: The view that a population will never be exactly in HWE is endorsed and there will be occasions when there is a need for an alternative to the usual hypothesis-testing setting, and Bayesian methods provide such an alternative, and this approach differs from previous Bayesian treatments in using the disequilibrium and inbreeding coefficient parameterizations.
Abstract: A Bayesian method for determining if there are large departures from independence between pairs of alleles at a locus, Hardy-Weinberg equilibrium (HWE), is presented. We endorse the view that a population will never be exactly in HWE and that there will be occasions when there is a need for an alternative to the usual hypothesis-testing setting. Bayesian methods provide such an alternative, and our approach differs from previous Bayesian treatments in using the disequilibrium and inbreeding coefficient parameterizations. These are easily interpretable but may be less mathematically tractable than other parameterizations. We examined the posterior distributions of our parameters for evidence that departures from HWE were large. For either parameterization, when a conjugate prior was used, the prior probability for small departures was itself small, i.e., the prior was weighted against small departures from independence. We could avoid this uneven weighting by using a step prior which gave equal weighting to both small and large departures from HWE. In most cases, the Bayesian methodology makes it clear that there are not enough data to draw a conclusion.
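
A simplified sketch of the disequilibrium-coefficient analysis, using a flat Dirichlet prior on genotype probabilities rather than the conjugate or step priors discussed in the paper; the genotype counts are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Observed genotype counts at a biallelic locus (hypothetical sample).
n_AA, n_Aa, n_aa = 30, 55, 15

# With a Dirichlet(1,1,1) prior on the genotype probabilities (not the step prior the
# authors favour), the posterior is Dirichlet(1 + counts); sample from it directly.
draws = rng.dirichlet([1 + n_AA, 1 + n_Aa, 1 + n_aa], size=20000)
p_AA, p_Aa = draws[:, 0], draws[:, 1]
p_A = p_AA + p_Aa / 2                      # allele frequency
D = p_AA - p_A**2                          # disequilibrium coefficient (0 under HWE)

lo, hi = np.percentile(D, [2.5, 97.5])
print(f"posterior mean D = {D.mean():.4f}, 95% interval ({lo:.4f}, {hi:.4f})")
print(f"P(|D| > 0.01 | data) = {np.mean(np.abs(D) > 0.01):.2f}")
```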

Journal ArticleDOI
TL;DR: A methodology based on genetic algorithms for the automatic induction of Bayesian networks from a file containing cases and variables related to the problem of predicting survival of people after 1, 3 and 5 years of being diagnosed as having malignant skin melanoma is introduced.

Journal ArticleDOI
TL;DR: This article found that individuals who viewed P(D|∼H) as relevant in a selection task and who used it to make the proper Bayesian adjustment in a probability assessment task scored higher on tests of cognitive ability and were better deductive and inductive reasoners.
Abstract: In two experiments, involving over 900 subjects, we examined the cognitive correlates of the tendency to view P(D|∼H) and base rate information as relevant to probability assessment. We found that individuals who viewed P(D|∼H) as relevant in a selection task and who used it to make the proper Bayesian adjustment in a probability assessment task scored higher on tests of cognitive ability and were better deductive and inductive reasoners. They were less biased by prior beliefs and more data-driven on a covariation assessment task. In contrast, individuals who thought that base rates were relevant did not display better reasoning skill or higher cognitive ability. Our results parallel disputes about the normative status of various components of the Bayesian formula in interesting ways. It is argued that patterns of covariance among reasoning tasks may have implications for inferences about what individuals are trying to optimize in a rational analysis (J. R. Anderson, 1990, 1991).
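
For reference, the "proper Bayesian adjustment" in a stylized base-rate task looks like this (all probabilities invented).

```python
# Both P(D|H) and P(D|~H) matter, together with the base rate P(H).
p_H = 0.15                # base rate of the hypothesis
p_D_given_H = 0.80
p_D_given_notH = 0.20     # the diagnosticity term many subjects treat as irrelevant

posterior = (p_D_given_H * p_H) / (p_D_given_H * p_H + p_D_given_notH * (1 - p_H))
print(f"P(H | D) = {posterior:.2f}")      # far below P(D|H) because the base rate is low
```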

Journal ArticleDOI
Colin L. Mallows
TL;DR: This discussion argues that "probability," whether assigned by a frequentist or by a subjective Bayesian, lies at the second level of the hierarchy: it is a convenient model of reality rather than something that exists objectively. It rejects the argument that subjective probabilities "exist" merely because they can be determined to arbitrary accuracy by elicitation combined with a requirement of consistency.
Abstract: …mathematical theory, proceeding from the axiom that rational behavior is coherent, in the sense that it is not rational to accept an incoherent system of bets. But these bets are mathematical idealizations; the real world is more complex than this theory admits. Some Bayesians, for example Lindley (1965), have argued that subjective probabilities belong at the first level, since they can be determined to arbitrary accuracy by elicitation, combined with the requirement of consistency, and so "exist" objectively (though only inside someone's head). But a corollary of de Finetti's assertion is (pardon me for shouting) SUBJECTIVE PROBABILITY DOES NOT EXIST. In my view "probability," whether assigned by a frequentist or by a subjective Bayesian, lies firmly at the second level. Phenomena may exhibit stable frequencies and may be unpredictable, but an iid or exchangeable model is only a convenient approximation to reality. Fisher's position in his 1922 paper was very clear. He emphasizes that the probability 1/6 of throwing a five with a die refers to ... a hypothetical population of an infinite number of throws, with the die in its original condition. Our statement will not ... contain any false assumption about the actual die ... or any ... approximation ... Similarly, in my view, personal probabilities are no more than a convenient fiction. I find this view, that a probability is nothing more than a model of reality, to be helpful in resolving the controversy between various schools of thought. Under this view, a frequentist may recognize symmetry among n equally likely cases and so assign a probability 1/n to each. If new evidence suggests that this assignment is inappropriate (perhaps he learns that the die is biased), he may amend this assignment. Similarly, a subjective Bayesian may judge that on his present state of knowledge, some system of bets is acceptable, but if new insights arise he may amend these judgments by means other than Bayes' theorem. (See Leamer (1978), sec. 9.1.) In any case, however the probabilities are assigned, the mathematical theory by which theorems are proved belongs at the third level. Thus, in my view, an assignment of probability is like any other scientific theory; not to be "believed," but merely "assumed" for convenience until conflicting evidence is found. This corresponds to the so-called "Copenhagen" interpretation of quantum mechanics. A difficulty arises: how can we assess the strength of the conflicting evidence? No amount of discrepant evidence can conclusively disprove a probability model. Many discussions remark on the incestuous nature of the situation: the adequacy of the model is to be assessed using the model itself. But there is no difficulty in using the approach suggested above in terms of "similarity" judgments, with the components of the judgment chosen with a view to the prospective uses of the model. So here I am rejoining the main thread of my argument.

8. THE FEDERALIST AGAIN

I would like to expand on this point with respect to the Federalist study. The model M&W use is such that we could check explicitly that the frequencies are empirically approximately independent. Assessing the adequacy of this model is a matter of judging whether it is "similar" (in senses to be determined) to the data. In making this judgment, we might compute various goodness-of-fit measures, but it would be important also to study the sensitivity of the proposed analysis to changes in the model. Thus, M&W assume a particular prior distribution for τ given σ. Using data from blocks of known authorship, posterior distributions of the (σ, τ) parameters (for each word used) could be derived; then, using the counts for a disputed paper and integrating out the σ and τ parameters, an odds ratio for Hamilton versus Madison is obtained. In the event, Madison was strongly favored for each of the 12 disputed papers. The set of words used in this final calculation was chosen after careful study of their discriminatory power in passages of known authorship; but, as M&W remark, allowance for selection and regression effects is made entirely through the prior distributions of σ and τ. We assume that the prior distributions apply to any and every word chosen from some large group of words. Then, according to the model, the prior distribution will reduce the apparent discriminating power to the required extent. With this selection feature, we may choose words for inclusion by whatever methods are convenient, so long as they are independent of the unknown papers. M&W call their whole approach a Bayesian analysis, but this is the only point where a subjective judgment, not based directly on data, is made. What kind of probability is being assumed here? How might such a model be challenged? We would need to find some situation, or preferably several situations, that we judge to be "similar" to this, and where it is possible to measure somehow whether this assumption is supported by the data. If it is, we can appeal to these analogies to support the assumption in this case. So we would need to consider analyses of other authorship problems (possibly in other languages). A cross-validation study (using subsets of the data and subsets of the words) could contribute to understanding how sensitive the whole analysis is to this assumption.

Journal ArticleDOI
TL;DR: It is proved theoretically that three-layer neural networks with at least 2n hidden units have the capability of approximating the a posteriori probability in the two-category classification problem with arbitrary accuracy and, secondly, it is proved that the input-output function of neural networks with at least 2n hidden units tends to the a posteriori probability as Back-Propagation learning proceeds ideally.
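
A numerical illustration of the property (not the paper's proof): a small numpy three-layer network trained by backpropagation on synthetic two-class data, whose output can be compared with the analytic a posteriori probability. The architecture, loss and training constants are assumptions; the limiting property is usually stated for squared error as well as cross-entropy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class 1-D data: x | y=1 ~ N(1,1), x | y=0 ~ N(-1,1), P(y=1) = 0.5
n = 2000
y = rng.integers(0, 2, n)
x = rng.normal(2.0 * y - 1.0, 1.0)

def true_posterior(x):
    # Bayes' theorem for equal priors and unit variances: log-odds = 2x
    return 1.0 / (1.0 + np.exp(-2.0 * x))

# Tiny input-hidden-output network trained by backprop on mean cross-entropy
H = 8
W1 = rng.normal(0, 0.5, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)
X = x[:, None]; T = y[:, None].astype(float)
lr = 0.5
for epoch in range(3000):
    Z1 = np.tanh(X @ W1 + b1)                           # hidden activations
    P = 1.0 / (1.0 + np.exp(-(Z1 @ W2 + b2)))           # output = estimated P(y=1|x)
    dOut = (P - T) / n                                  # gradient wrt pre-sigmoid output
    gW2 = Z1.T @ dOut; gb2 = dOut.sum(0)
    dHid = (dOut @ W2.T) * (1 - Z1**2)                  # backprop through tanh
    gW1 = X.T @ dHid; gb1 = dHid.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

# Compare the trained network output with the analytic a posteriori probability
grid = np.linspace(-3, 3, 7)[:, None]
Zg = np.tanh(grid @ W1 + b1)
net = (1.0 / (1.0 + np.exp(-(Zg @ W2 + b2)))).ravel()
print(np.c_[grid.ravel(), net, true_posterior(grid.ravel())])
```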

Journal ArticleDOI
TL;DR: In this paper, two indirect inference estimators are proposed based on the choice of an autoregressive auxiliary model and an ARMA auxiliary model, respectively, which make the auxiliary parameter easy to estimate and at the same time allow the derivation of optimal indirect inference estimation estimators.
Abstract: We propose as a tool for the estimation of stochastic volatility models two indirect inference estimators based on the choice of an autoregressive auxiliary model and an ARMA auxiliary model, respectively. These choices make the auxiliary parameter easy to estimate and at the same time allow the derivation of optimal indirect inference estimators. The results of some Monte Carlo experiments provide evidence that the indirect inference estimators perform well in finite samples, although less efficiently than Bayes and Simulated EM algorithms.

Journal ArticleDOI
TL;DR: A number of estimation approaches which have been suggested for population pharmacokinetic analyses are reviewed, distinguishing between Bayesian and non-Bayesian and fully-parametric, semi- parametric and nonparametric methods.
Abstract: A principal aim of population pharmacokinetic studies is to estimate the variance components associated with intra- and inter-individual variability in observed drug concentrations. The explanation of the inter-individual variability in terms of subject-specific covariates is also of great importance. Pharmacokinetic models are nonlinear in the parameters and estimation is not straightforward. Within this paper we review a number of estimation approaches which have been suggested for population pharmacokinetic analyses. We distinguish between Bayesian and non-Bayesian and fully-parametric, semi-parametric and nonparametric methods.

Proceedings ArticleDOI
01 Dec 1998
TL;DR: A new method (particularly suited to the analysis of High Throughput Screening data) is presented for the determination of quantitative structure activity relationships, which exhibits high accuracy and is robust to measurement errors.
Abstract: A new method (particularly suited to the analysis of High Throughput Screening data) is presented for the determination of quantitative structure activity relationships. The method, termed "Binary QSAR," accepts binary activity measurements (e.g., pass/fail or active/inactive) and molecular descriptor vectors as input. A Bayesian inference technique is used to predict whether or not a new compound will be active or inactive. Experiments were conducted on a data set of 1947 molecules. The results show that the method exhibits high accuracy and is robust to measurement errors.
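
The published method uses its own Bayesian treatment of transformed descriptors; the sketch below conveys the same spirit with a Bernoulli naive Bayes classifier on simulated binary fingerprints (all data and settings invented).

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated binary activity labels and binary molecular descriptors (fingerprint bits).
n_mol, n_bits = 1000, 64
X = rng.integers(0, 2, (n_mol, n_bits))
w = rng.normal(0, 1, n_bits)
y = (X @ w + rng.normal(0, 2, n_mol) > 0).astype(int)    # 1 = active, 0 = inactive

train, test = slice(0, 800), slice(800, None)

# Fit: class priors and per-bit Bernoulli probabilities with Laplace smoothing
prior = np.array([np.mean(y[train] == c) for c in (0, 1)])
theta = np.array([(X[train][y[train] == c].sum(0) + 1) / ((y[train] == c).sum() + 2)
                  for c in (0, 1)])                       # shape (2, n_bits)

# Predict: posterior class via Bayes' theorem in log space
log_like = X[test] @ np.log(theta).T + (1 - X[test]) @ np.log(1 - theta).T   # (n_test, 2)
log_post = log_like + np.log(prior)
pred = log_post.argmax(1)
print("test accuracy:", np.mean(pred == y[test]))
```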

Journal ArticleDOI
TL;DR: In this paper, the superharmonicity of the square root of the marginal density of a multivariate Student-t prior is used to construct a proper Bayes estimator.
Abstract: Bayes estimation of the mean of a multivariate normal distribution is considered under quadratic loss. We show that, when particular spherical priors are used, the superharmonicity of the square root of the marginal density provides a viable method for constructing (possibly proper) Bayes (and admissible) minimax estimators. Examples illustrate the theory; most notably it is shown that a multivariate Student-t prior yields a proper Bayes minimax estimate. 1. Introduction. When estimating the mean of a multivariate distribution, the two dominant approaches are the minimax approach and variants of the Bayes paradigm. The first has received the most extensive development while the second is most used in practice, due to its great flexibility. See [4] for a study of the interface between these two approaches. The problem with both methods is that neither necessarily leads to admissible estimators (hierarchical Bayes estimators are often only generalized Bayes estimators). Even if admissibility may sometimes yield unreasonable estimators, it is a criterion which can be desirable when combining minimaxity and Bayesian properties. Indeed, since the sampling distribution is normal, under quadratic loss the Bayes estimator is unique provided the Bayes risk is finite, so that the proper Bayes estimator is admissible (see [16]). In [5], Brown conjectured that, for estimating a multivariate normal mean using quadratic loss, a proper Bayes minimax estimator does not exist in four or fewer dimensions. This conjecture was proved by Strawderman [21], who also settled the conjecture for dimensions five or more by showing that such estimators do indeed exist. Stein [19] obtains minimaxity of a general estimator δ of the form δ(x) = …

Journal ArticleDOI
04 Dec 1998
TL;DR: In this article, the authors review and extend the theory and practice of minimum relative entropy (MRE) and compare MaxEnt, smallest model and MRE approaches for the density distribution of an equivalent spherically symmetric earth and for the contaminant plume-source problem.
Abstract: The similarity between maximum entropy (MaxEnt) and minimum relative entropy (MRE) allows recent advances in probabilistic inversion to obviate some of the shortcomings in the former method. The purpose of this paper is to review and extend the theory and practice of minimum relative entropy. In this regard, we illustrate important philosophies on inversion and the similarities and differences between maximum entropy, minimum relative entropy, classical smallest model (SVD) and Bayesian solutions for inverse problems. MaxEnt is applicable when we are determining a function that can be regarded as a probability distribution. The approach can be extended to the case of the general linear problem and is interpreted as the model which fits all the constraints and is the one model which has the greatest multiplicity or "spreadout" that can be realized in the greatest number of ways. The MRE solution to the inverse problem differs from the maximum entropy viewpoint as noted above. The relative entropy formulation provides the advantage of allowing for non-positive models, a prior bias in the estimated pdf and 'hard' bounds if desired. We outline how MRE can be used as a measure of resolution in linear inversion and show that MRE provides us with a method to explore the limits of model space. The Bayesian methodology readily lends itself to the problem of updating prior probabilities based on uncertain field measurements, whose truth follows from the theorems of total and compound probabilities. In the Bayesian approach information is complete and Bayes' theorem gives a unique posterior pdf. In comparing the results of the classical, MaxEnt, MRE and Bayesian approaches we notice that the approaches produce different results. In comparing MaxEnt with MRE for Jaynes' die problem we see excellent agreement between the results. We compare MaxEnt, smallest model and MRE approaches for the density distribution of an equivalent spherically-symmetric earth and for the contaminant plume-source problem. Theoretical comparisons between MRE and Bayesian solutions for the case of the linear model and Gaussian priors may show different results. The Bayesian expected-value solution approaches that of MRE and that of the smallest model as the prior distribution becomes uniform, but the Bayesian maximum a posteriori (MAP) solution may not exist for an underdetermined case with a uniform prior.
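
The Jaynes die problem mentioned above is easy to reproduce; the sketch below finds the maximum-entropy distribution on the faces given a mean constraint by a one-dimensional root search for the Lagrange multiplier. With a uniform prior, minimizing relative entropy subject to the same constraint gives the same answer.

```python
import numpy as np
from scipy.optimize import brentq

# Jaynes' die problem: the maximum-entropy distribution over faces 1..6 given only that
# the mean number of spots is 4.5.  The solution has the exponential form
# p_i proportional to exp(-lam * i); lam is found by matching the mean.
faces = np.arange(1, 7)
target_mean = 4.5

def mean_for(lam):
    w = np.exp(-lam * faces)
    return (faces * w).sum() / w.sum()

lam = brentq(lambda l: mean_for(l) - target_mean, -5, 5)
p = np.exp(-lam * faces)
p /= p.sum()
print("lambda =", round(lam, 4))
print("MaxEnt probabilities:", np.round(p, 4))   # increasing in the number of spots
```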

Journal ArticleDOI
TL;DR: Attributes based on cysteine, the aromatics, flexible tendencies, and charge were found to be the best attributes for distinguishing order and disorder among those tested so far.
Abstract: The conditional probability, P(s|x), is a statement of the probability that the event, s, will occur given prior knowledge for the value of x. If x is given and if s is randomly distributed, then an empirical approximation of the true conditional probability can be computed by the application of Bayes' Theorem. Here s represents one of two structural classes, either ordered, s(o), or disordered, s(d), and x represents an attribute value calculated over a window of 21 amino acids. Plots of P(s|x) versus x provide information about the correlation between the given sequence attribute and disorder or order. These conditional probability plots allow quantitative comparisons between individual attributes for their ability to discriminate between order and disorder states. Using such quantitative comparisons, 38 different sequence attributes have been rank-ordered. Attributes based on cysteine, the aromatics, flexible tendencies, and charge were found to be the best attributes for distinguishing order and disorder among those tested so far.
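
A sketch of the empirical construction described above, using synthetic data in place of the real windowed attributes; the distributions, bin count and prior are assumptions.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthetic stand-in for the real data: an attribute value x computed over 21-residue
# windows, with different distributions for ordered and disordered windows.
n = 20000
s = rng.integers(0, 2, n)                        # 1 = disordered, 0 = ordered
x = rng.normal(np.where(s == 1, 0.8, 0.2), 0.4)  # windowed attribute value (e.g., flexibility)

# Empirical Bayes' theorem on a binned attribute:
# P(disorder | x in bin) = P(x in bin | d) P(d) / [P(x in bin | d) P(d) + P(x in bin | o) P(o)]
bins = np.linspace(x.min(), x.max(), 26)
h_d, _ = np.histogram(x[s == 1], bins=bins, density=True)
h_o, _ = np.histogram(x[s == 0], bins=bins, density=True)
p_d = s.mean()
post = h_d * p_d / (h_d * p_d + h_o * (1 - p_d) + 1e-12)

centers = 0.5 * (bins[:-1] + bins[1:])
for c, p in list(zip(centers, post))[::5]:
    print(f"x = {c:5.2f}   P(disorder | x) = {p:.2f}")
```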