
Showing papers on "Bayes' theorem published in 1977"


Journal ArticleDOI
Friedman
TL;DR: A new criterion for deriving a recursive partitioning decision rule for nonparametric classification is presented and the resulting decision rule is asymptotically Bayes' risk efficient.
Abstract: A new criterion for deriving a recursive partitioning decision rule for nonparametric classification is presented. The criterion is both conceptually and computationally simple, and can be shown to have strong statistical merit. The resulting decision rule is asymptotically Bayes' risk efficient. The notion of adaptively generated features is introduced, and methods are presented for dealing with missing features in both training and test vectors.
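
A toy version of such a rule is easy to sketch. The hedged Python below grows a binary tree by choosing, at each node, the feature and threshold that maximize the Kolmogorov-Smirnov distance between the two class-conditional empirical distribution functions, with majority-vote leaves. It is a minimal sketch loosely in the spirit of recursive partitioning, not the paper's exact criterion; all function names and settings are illustrative.

```python
import numpy as np

def ks_best_split(X, y):
    """Choose the (feature, threshold) maximizing the Kolmogorov-Smirnov
    distance between the two class-conditional empirical CDFs."""
    best = (None, 0.0, -1.0)
    for j in range(X.shape[1]):
        v = np.sort(np.unique(X[:, j]))
        for t in (v[:-1] + v[1:]) / 2.0:            # candidate cut points
            left = X[:, j] <= t
            d = abs(left[y == 0].mean() - left[y == 1].mean())  # |F0(t)-F1(t)|
            if d > best[2]:
                best = (j, t, d)
    return best

def grow(X, y, min_leaf=5, depth=0, max_depth=4):
    """Recursively partition the feature space; leaves vote by majority."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return {"leaf": int(round(y.mean()))}
    j, t, _ = ks_best_split(X, y)
    if j is None:
        return {"leaf": int(round(y.mean()))}
    mask = X[:, j] <= t
    if mask.sum() < min_leaf or (~mask).sum() < min_leaf:
        return {"leaf": int(round(y.mean()))}
    return {"feat": j, "thr": t,
            "lo": grow(X[mask], y[mask], min_leaf, depth + 1, max_depth),
            "hi": grow(X[~mask], y[~mask], min_leaf, depth + 1, max_depth)}

def predict(tree, x):
    while "leaf" not in tree:
        tree = tree["lo"] if x[tree["feat"]] <= tree["thr"] else tree["hi"]
    return tree["leaf"]

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)      # a simple two-class problem
tree = grow(X, y)
print(predict(tree, np.array([1.0, 1.0])))   # expected: 1
```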

393 citations


Journal ArticleDOI
R. Kashyap
TL;DR: The optimum decision rule is asymptotically consistent and gives a quantitative explanation for the "principle of parsimony" often used in the construction of models from empirical data.
Abstract: This paper deals with Bayesian methods of comparing different types of dynamical structures for representing a given set of observations. Specifically, given that a process y(·) obeys one of r distinct stochastic or deterministic difference equations, each involving a vector of unknown parameters, we compute the posterior probability that a set of observations {y(1),...,y(N)} obeys the ith equation, after making suitable assumptions about the prior probability distribution of the parameters in each equation. The difference equations can be nonlinear in the variable y but must be linear in their parameter vectors. Once the posterior probability is known, we can find a decision rule to choose between the various structures so as to minimize the average value of a loss function. The optimum decision rule is asymptotically consistent and gives a quantitative explanation for the "principle of parsimony" often used in the construction of models from empirical data. The decision rule answers a wide variety of questions, such as the advisability of a nonlinear transformation of data and the limitations of a model which yields a perfect fit to the data (i.e., zero residual variance). The method can be used not only to compare different types of structures but also to determine a reliable estimate of the spectral density of the process. We compare the method in detail with the hypothesis-testing and other methods, and give a number of illustrative examples.
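
As a rough illustration of how such posterior model probabilities reward parsimony, the hedged sketch below scores two candidate autoregressive structures (both linear in their parameters) with a BIC-style Laplace approximation to the log marginal likelihood; the (k/2) log N term is the parsimony penalty. This is a simplified stand-in, not Kashyap's exact expressions, and all names and settings are illustrative.

```python
import numpy as np

def log_marginal_bic(y, Phi):
    """Rough log marginal likelihood of y ~ N(Phi @ beta, sigma^2 I) via
    the Laplace/BIC approximation: max log-likelihood - (k/2) log N.
    The second term is the parsimony penalty."""
    N, k = Phi.shape
    beta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    resid = y - Phi @ beta
    sigma2 = max(resid @ resid / N, 1e-12)
    loglik = -0.5 * N * (np.log(2 * np.pi * sigma2) + 1.0)
    return loglik - 0.5 * k * np.log(N)

rng = np.random.default_rng(0)
N = 200
y = np.zeros(N)
e = rng.normal(0.0, 0.5, size=N)
for t in range(1, N):                 # the data truly follow an AR(1)
    y[t] = 0.8 * y[t - 1] + e[t]

target = y[2:]                                   # y(t), t = 2..N-1
Phi1 = y[1:-1, None]                             # structure 1: y(t-1)
Phi2 = np.column_stack([y[1:-1], y[:-2]])        # structure 2: y(t-1), y(t-2)

lm = np.array([log_marginal_bic(target, Phi1),
               log_marginal_bic(target, Phi2)])
post = np.exp(lm - lm.max())
post /= post.sum()          # posterior model probabilities (equal priors)
print(post)                 # typically favors the parsimonious AR(1)
```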

156 citations



Journal ArticleDOI
TL;DR: The authors showed that three different types of nondiagnostic samples (neutral, irrelevant, and null) consistently resulted in less extreme inference judgments, and that the size of this effect depended on serial location and was smaller with aggregate than with sequential samples.

106 citations


Journal ArticleDOI
TL;DR: In this article, the authors combine one-sided sequential probability ratio tests (SPRTs) for binomial decision problems with error probability constraints to minimize the expected sample sizes to within o(1) asymptotically.
Abstract: Combinations of one-sided sequential probability ratio tests (SPRT's) are shown to be "nearly optimal" for problems involving a finite number of possible underlying distributions. Subject to error probability constraints, expected sample sizes (or weighted averages of them) are minimized to within o(1) asymptotically. For sequential decision problems, simple explicit procedures are proposed which "do exactly what a Bayes solution would do" with probability approaching one as the cost per observation, c, goes to zero. Exact computations for a binomial testing problem show that efficiencies of about 97% are obtained in some "small-sample" cases.
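
The building block is Wald's one-sided SPRT. A minimal sketch for the binomial case, using the classical boundary approximations A ≈ (1 − β)/α and B ≈ β/(1 − α) (the paper combines such tests and tunes them more finely than this):

```python
import numpy as np

def sprt_binomial(xs, p0=0.4, p1=0.6, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 vs H1: p = p1 on Bernoulli draws,
    continuing while log B < LLR < log A."""
    up = np.log((1 - beta) / alpha)       # accept H1 at or above this
    lo = np.log(beta / (1 - alpha))       # accept H0 at or below this
    llr = 0.0
    for n, x in enumerate(xs, start=1):
        llr += np.log(p1 / p0) if x else np.log((1 - p1) / (1 - p0))
        if llr >= up:
            return "accept H1", n
        if llr <= lo:
            return "accept H0", n
    return "no decision", len(xs)

rng = np.random.default_rng(1)
print(sprt_binomial(rng.random(1000) < 0.6))   # data generated with p = 0.6
```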

80 citations


Journal ArticleDOI
TL;DR: In this article, the tail area for a nested sharp hypothesis is compared to the Bayes factor based on the event of "significance" considered as data, which is expressed as a weighted average of full-data Bayes factors.
Abstract: Inequalities are given relating the tail area for a nested sharp hypothesis to the Bayes factor based on the event of “significance” considered as data. This Bayes factor based on an insufficient statistic is, in turn, expressed as a weighted average of full-data Bayes factors. Lindley's “statistical paradox” is generalized and other comparisons made in the normal sampling context. A new Bayesian interpretation is given for the traditional two-tailed critical level. An example and the discussion suggest a negative answer to the question in the title.
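
Lindley's paradox is easy to reproduce numerically. The sketch below uses the textbook setup of a normal mean with known variance and an N(0, τ²) prior under the alternative (an assumed illustration, not the paper's exact calculations): the sample is "significant" at the 5% level while the Bayes factor favors the null by roughly 40 to 1.

```python
import numpy as np
from scipy import stats

def bayes_factor_01(xbar, n, sigma=1.0, tau=1.0):
    """BF for H0: theta = 0 against H1: theta ~ N(0, tau^2), given the
    mean xbar of n independent N(theta, sigma^2) observations."""
    se2 = sigma ** 2 / n
    m0 = stats.norm.pdf(xbar, 0.0, np.sqrt(se2))             # marginal under H0
    m1 = stats.norm.pdf(xbar, 0.0, np.sqrt(se2 + tau ** 2))  # marginal under H1
    return m0 / m1

n = 100_000
xbar = 2.0 / np.sqrt(n)                  # z-statistic exactly 2.0
p_value = 2 * stats.norm.sf(2.0)         # two-tailed critical level
print(f"p = {p_value:.3f}, BF01 = {bayes_factor_01(xbar, n):.1f}")
# p ≈ 0.046 ("significant"), yet BF01 ≈ 43: the Bayes factor backs H0
```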

56 citations


Journal ArticleDOI
01 Jun 1977
TL;DR: The results show the essential robustness of procedures which play Bayes with respect to (a perhaps randomized) version of an estimate of the distribution of the past: such procedures still have asymptotically good properties even when the underlying assumptions for which they were originally developed no longer hold.
Abstract: Sequential predictors for binary sequences with no assumptions upon the existence of an underlying process are discussed. The rule offered here induces an expected proportion of errors which differs by O(n^{-1/2}) from the Bayes envelope with respect to the observed kth order Markov structure. This extends the compound sequential Bayes work of Robbins, Hannan and Blackwell from sequences with perceived 0th order structure to sequences with perceived kth order structure. The proof follows immediately from applying the 0th order theory to 2^k separate subsequences. These results show the essential robustness of procedures which play Bayes with respect to (a perhaps randomized) version of an estimate of the distribution of the past. Such procedures still have asymptotically good properties even when the underlying assumptions for which they were originally developed no longer hold.
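
The construction is easy to sketch: split the sequence into one subsequence per k-bit context and run a 0th order rule on each. The toy version below uses a plain majority vote per context, randomizing only on ties (the full compound Bayes rules randomize more carefully); the function name and test sequence are illustrative.

```python
from collections import defaultdict
import random

def kth_order_predict(bits, k=2, seed=0):
    """Predict each bit from the empirical counts of symbols previously
    seen after the same k-bit context, i.e. one 0th order majority rule
    per context (2**k subsequences); randomize on ties."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])        # context -> [#0s, #1s]
    errors = 0
    for t in range(len(bits)):
        ctx = tuple(bits[max(0, t - k):t])
        c0, c1 = counts[ctx]
        guess = 1 if c1 > c0 else 0 if c0 > c1 else rng.randint(0, 1)
        errors += guess != bits[t]
        counts[ctx][bits[t]] += 1
    return errors / len(bits)

seq = [0, 0, 1] * 400                # deterministic 2nd order structure
print(kth_order_predict(seq, k=2))   # error proportion tends to 0
```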

37 citations


Journal ArticleDOI
TL;DR: A version of a two-class decision problem is considered, and a quasi-Bayes procedure is motivated and defined that mimics closely the formal Bayes solution while involving only a minimal amount of computation.
Abstract: Unsupervised Bayes sequential learning procedures for classification and estimation are often useless in practice because of the amount of computation required. In this paper, a version of a two-class decision problem is considered, and a quasi-Bayes procedure is motivated and defined. The proposed procedure mimics closely the formal Bayes solution while involving only a minimal amount of computation. Convergence properties are established and some numerical illustrations provided. The approach compares favorably with other non-Bayesian learning procedures that have been proposed and can be extended to more general situations.
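
The quasi-Bayes idea can be conveyed in a few lines. The sketch below treats the simplest case of learning an unknown mixing weight between two known component densities: instead of branching over all possible label histories as the formal Bayes solution would, it keeps a single Beta "posterior" updated with fractional counts. This is a minimal illustration under assumed Gaussian components, not the paper's full two-class decision procedure.

```python
import numpy as np
from scipy import stats

def quasi_bayes_mixing(xs, f1, f2, a=1.0, b=1.0):
    """Quasi-Bayes sequential estimate of the mixing weight p in
    p*f1 + (1-p)*f2 (known components, unsupervised labels).
    Maintains a Beta(a, b) 'posterior' via fractional-count updates."""
    for x in xs:
        p_hat = a / (a + b)                   # current point estimate of p
        w1 = p_hat * f1(x)
        w = w1 / (w1 + (1 - p_hat) * f2(x))   # P(label = 1 | x, p_hat)
        a, b = a + w, b + (1 - w)
    return a / (a + b)

rng = np.random.default_rng(2)
p_true = 0.3
labels = rng.random(2000) < p_true
xs = np.where(labels, rng.normal(-2, 1, 2000), rng.normal(2, 1, 2000))
est = quasi_bayes_mixing(xs,
                         lambda x: stats.norm.pdf(x, -2, 1),
                         lambda x: stats.norm.pdf(x, 2, 1))
print(f"quasi-Bayes estimate of p: {est:.3f}")   # close to 0.3
```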

33 citations


Journal ArticleDOI
TL;DR: In this article, four Monte Carlo simulation studies of Owen's Bayesian sequential procedure for adaptive mental testing were conducted, exploring a number of additional properties both in a normally distributed population and in a distribution-free context.
Abstract: Four Monte Carlo simulation studies of Owen's Bayesian sequential procedure for adaptive mental testing were conducted. In contrast to previous simulation studies of this procedure, which have concentrated on evaluating it in terms of the correlation of its test scores with simulated ability in a normal population, these four studies explored a number of additional properties, both in a normally distributed population and in a distribution-free context. Study 1 replicated previous studies with finite item pools, but examined such properties as the bias of estimate, mean absolute error, and correlation of test length with ability. Studies 2 and 3 examined the same variables in a number of hypothetical infinite item pools, investigating the effects of item discriminating power, guessing, and variable vs. fixed test length. Study 4 investigated some properties of the Bayesian test scores as latent trait estimators. The properties of interest included the conditional bias of the ability estimates, the info...
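
For orientation, the sketch below shows the sequential mechanics of such a procedure: a normal prior on ability, items chosen near the current posterior mean, and a Bayesian update after each 0/1 response. It uses a brute-force grid posterior and a plain normal-ogive item model rather than Owen's closed-form approximate updates, so it is an assumed simplification, not Owen's procedure itself.

```python
import numpy as np
from scipy import stats

def adaptive_test(true_theta, pool_b, n_items=20, seed=3):
    """Bayesian sequential adaptive testing sketch: keep a posterior over
    ability theta on a grid, administer the item whose difficulty b is
    closest to the posterior mean, update on the 0/1 response.
    Normal-ogive item model: P(correct) = Phi(theta - b)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(-4, 4, 401)
    post = stats.norm.pdf(grid, 0.0, 1.0)       # N(0, 1) prior on theta
    post /= post.sum()
    pool = list(pool_b)
    for _ in range(n_items):
        theta_hat = grid @ post
        b = min(pool, key=lambda d: abs(d - theta_hat))  # most informative item
        pool.remove(b)
        correct = rng.random() < stats.norm.cdf(true_theta - b)
        like = stats.norm.cdf(grid - b)
        post *= like if correct else (1.0 - like)
        post /= post.sum()
    return grid @ post                           # posterior-mean ability score

pool = np.linspace(-3, 3, 200)
print(adaptive_test(true_theta=1.2, pool_b=pool))   # estimate near 1.2
```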

28 citations


Journal ArticleDOI
TL;DR: A heuristic rule performs well for any fixed p but badly for p close to 0 or 1; introducing a uniform prior on p, the optimal Bayes procedure is shown to exist and to have bounded sample size.
Abstract: Given the number of successes in n trials, a heuristic rule is derived and shown to perform well for any fixed p. This rule, however, performs badly for p close to 0 or 1. To overcome this difficulty a uniform prior on p is introduced, and the optimal Bayes procedure is shown to exist and to have bounded sample size. The asymptotic behavior of the optimal Bayes risk as the cost per observation c → 0 is derived, and the risk is computed for various values of c, along with the expected loss for various values of p.

27 citations


Journal ArticleDOI
TL;DR: In this article, the authors investigated the design of a reservoir subject to long-range sediment accumulation stemming from the sum of a random number of random sedimentation events in a case study in southern Arizona.
Abstract: The design of a reservoir subject to long-range sediment accumulation stemming from the sum of a random number of random sedimentation events is investigated. The event-based simulation method, which is applied to a case study in southern Arizona, involves generating synthetic sequences of Poisson inputs into the modified universal soil loss equation. The stochastic inputs result from a fitted bivariate distribution of runoff-producing precipitation events (representing the amount and duration of such precipitation) and an independent fitted exponential distribution of interarrival time between events. The simulated sequences of sediment yield events thus obtained are used to calculate accumulated sediment yield and cost of a given design for each sequence. The optimum design and corresponding Bayes risk are evaluated in four cases: (1) under natural uncertainty, (2) under natural uncertainty and uncertainty in the bivariate rainfall distribution parameters, (3) under natural uncertainty and uncertainty in the Poisson counting distribution parameter, and (4) under all three types of uncertainty. The effect of rainfall record length is ascertained by further computer experiments, but only a partial Bayesian analysis is provided because of the complexity created by a three-dimensional parameter uncertainty. The optimum reservoir capacity and corresponding Bayes risk are shown to increase substantially (up to 20 and 90%, respectively) as more uncertainties are incorporated into the model.
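
The event-based simulation core is a compound Poisson sum. The sketch below generates a Poisson number of sedimentation events per synthetic sequence and sums random event yields; the exponential event-yield distribution, rates, and units are placeholders for illustration (the study drives yields through the modified universal soil loss equation and a fitted bivariate rainfall distribution instead).

```python
import numpy as np

def simulate_accumulation(years=50, rate=12.0, mean_yield=40.0,
                          n_seq=10_000, seed=4):
    """Event-based sketch: sediment accumulates as the sum of a Poisson
    number of random-sized events (a compound Poisson total).
    rate = events/year; event yields are exponential with the given mean
    (placeholder distribution, illustrative units)."""
    rng = np.random.default_rng(seed)
    n_events = rng.poisson(rate * years, size=n_seq)   # events per sequence
    totals = np.array([rng.exponential(mean_yield, n).sum()
                       for n in n_events])
    return totals

totals = simulate_accumulation()
# e.g. size the sediment storage to the 90th percentile of simulated totals
print(f"mean accumulation: {totals.mean():.0f}, "
      f"90% design capacity: {np.quantile(totals, 0.9):.0f}")
```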

Journal ArticleDOI
TL;DR: A recently developed framework for comparing the properties of various conditional procedures is studied in detail in the setting of testing between two simple hypotheses, where the ideas are most transparent.
Abstract: A recently developed framework for comparing the properties of various conditional procedures is studied in detail in the setting of testing between two simple hypotheses, where the ideas are most transparent. In that setting, possible goodness criteria are considered, and illustrations are given. The conditional confidence methodology, unlike the Bayes, fiducial, and likelihood techniques, presents a measure of conclusiveness which has frequentist interpretability; and, unlike traditional Neyman-Pearson procedures, the measure is highly data-dependent.


Journal ArticleDOI
Masahiko Okada, N. Maruyama, T. Kanda, K. Shirakawa, T. Katagiri
TL;DR: An experiment on a medical information system in which a clinical database is combined organically with computer programs for automated diagnosis; the system can be regarded as a database possessing a kind of diagnostic ability that grows over time.

Journal ArticleDOI
TL;DR: General inequalities and theorems for the empirical Bayes multiple decision problem are presented, with particular applications to a classification problem and a linear-loss monotone multiple decision problem.
Abstract: In the empirical Bayes approach to multiple decision problems, we obtain theorems and lemmas which can be used to obtain asymptotic optimality and rate results in any multiple decision empirical Bayes problem. Applications of these results to a classification problem, a monotone multiple decision problem, and a selection problem are given. In addition, a special lemma unique to the monotone multiple decision problem gives improved (exact) rate results in that case.


Journal ArticleDOI
TL;DR: In this paper, a procedure is given for the construction of a monotone estimator that dominates a given estimator, for a class of discrete distributions with monotone likelihood ratio.
Abstract: Summary A procedure is given for the construction of a monotone estimator that dominates a given estimator for a class of discrete distributions with monotone likelihood ratio. This procedure is applied to some empirical Bayes estimators. Monte Carlo results are given that demonstrate the usefulness of monotonizing.
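
One standard way to monotonize a finite sequence of estimates delta(0), ..., delta(n) is the pool-adjacent-violators algorithm, which projects the sequence onto the set of nondecreasing sequences in least squares. The paper's construction may differ in detail, so the sketch below is only an assumed illustration of the monotonizing step.

```python
def pava(values):
    """Pool-adjacent-violators: least-squares projection of a sequence
    onto the nondecreasing sequences -- one standard way to replace a
    non-monotone estimator delta(x), x = 0, 1, ..., n, by a monotone one."""
    merged = []                                # blocks of [mean, size]
    for v in values:
        merged.append([float(v), 1])
        while len(merged) > 1 and merged[-2][0] > merged[-1][0]:
            m2, n2 = merged.pop()              # pool adjacent violators
            m1, n1 = merged.pop()
            merged.append([(m1 * n1 + m2 * n2) / (n1 + n2), n1 + n2])
    out = []
    for m, n in merged:
        out.extend([m] * n)
    return out

# e.g. monotonize an empirical Bayes estimate of p that dips at x = 3
delta = [0.10, 0.22, 0.35, 0.30, 0.41, 0.55]
print(pava(delta))   # [0.10, 0.22, 0.325, 0.325, 0.41, 0.55]
```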


Journal ArticleDOI
TL;DR: Two distinct Bayesian methodologies are developed and compared for inference on gamma scale parameters in one and two population problems; they appear to conflict in hypothesis inference problems and to harmonize in interval estimation problems.
Abstract: Summary Two distinct Bayesian methodologies are developed and compared for inference on gamma scale parameters in one and two population problems. Both approaches permit concomitant variables and censored observations in the exponential case. The first approach, based on the use of natural-conjugate prior distributions, generalizes and harmonizes with the traditional frequentist analysis in terms of χ² and F distributions. The second method is based on non-continuous-type extensions of the natural conjugate priors, and involves the use of Bayes factors for sharply defined hypotheses. The two methods appear to conflict in hypothesis inference problems and to harmonize in interval estimation problems. Inferences from the two methods are compared for survival data from a Hodgkin's disease therapy trial.


Journal ArticleDOI
TL;DR: In this article, the Bayes estimates of estimable parameters of degrees 2 and 3 are obtained against a Dirichlet prior process and the squared error loss, and the relations between the limits of these estimates and the U-statistics are shown.
Abstract: The Bayes estimates of estimable parameters of degrees 2 and 3 are obtained against a Dirichlet prior process and the squared error loss. The relations between the limits of the Bayes estimates and the U-statistics are shown. Some examples are given.

Journal ArticleDOI
TL;DR: In this article, it was shown that the mean credibility formula is a Bayes rule within a nonparametric context, and the credibility factor is obtained as a simple function of the parameter that characterizes the prior distribution.
Abstract: Recent advances in statistical decision theory and stochastic processes provide the machinery for showing that the celebrated mean credibility formula is a Bayes rule within a nonparametric context. The credibility factor is obtained as a simple function of the parameter that characterizes the prior distribution. A natural estimator of this parameter leads to a credibility formula having a form similar to the James-Stein estimator.
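
In code, the credibility rule is one line. The sketch below takes the prior parameter to be the concentration α of a Dirichlet-process prior, so that the posterior-mean (Bayes) estimate is exactly the credibility formula with Z = n/(n + α); the claim figures are made up for illustration.

```python
import numpy as np

def credibility_estimate(claims, prior_mean, alpha):
    """Mean credibility formula: Z * sample mean + (1 - Z) * prior mean,
    with credibility factor Z = n / (n + alpha). Under a Dirichlet-process
    prior with concentration alpha, this is the Bayes (posterior-mean)
    rule, with alpha in the role of the paper's prior parameter."""
    n = len(claims)
    z = n / (n + alpha)
    return z * np.mean(claims) + (1 - z) * prior_mean, z

claims = [120.0, 80.0, 150.0, 95.0, 110.0]   # one risk's observed claims
est, z = credibility_estimate(claims, prior_mean=100.0, alpha=3.0)
print(f"Z = {z:.2f}, credibility premium = {est:.1f}")
```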

Journal ArticleDOI
TL;DR: In this article, the classical or likelihood definition of model identification is shown to be too restrictive since there exist some classically unestimable and unidentifiable models that have unique Bayes estimates.
Abstract: From a Bayesian framework the classical or likelihood definition of model identification is shown to be too restrictive since there exist some classically unestimable and unidentifiable models that have unique Bayes estimates. Consequently, another definition of identification is developed. Finally, the general issue of model complexity and estimation techniques is discussed.


Journal ArticleDOI
TL;DR: In this paper, a nonparametric technique for estimating the Bayes error of any two-category feature extractor is presented, and it is shown that this technique is better than the existing methods and that the estimates obtained are more meaningful in evaluating the quality of feature extractors.
Abstract: Since the Bayes classifier is the optimum classifier in the sense of having minimum probability of misclassification among all the classifiers using the same set of pattern features, the error rate of the Bayes classifier using the set of features provided by a feature extractor, called the Bayes error of the feature extractor, is the smallest possible for the feature extractor. Consequently, the Bayes error can be used to evaluate the effectiveness of the feature extractors in a pattern recognition system. In this paper, a nonparametric technique for estimating the Bayes error for any two-category feature extractor is presented. This technique uses the nearest neighbor sample sets and is based on an infinite series expansion of the general form of the Bayes error. It is shown that this technique is better than the existing methods, and the estimates obtained by this technique are more meaningful in evaluating the quality of feature extractors. Computer simulation as well as application to electrocardiogram analysis are used to demonstrate this technique.
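
The paper's series-expansion estimator is involved, but the role of nearest neighbors is easy to see from the classical two-class Cover-Hart inequality R* ≤ R_NN ≤ 2R*(1 − R*), which brackets the Bayes error R* by the asymptotic nearest-neighbor error. The sketch below estimates R_NN by leave-one-out 1-NN and inverts the bound; it is a simpler assumed stand-in, not the paper's technique.

```python
import numpy as np

def nn_bayes_error_bounds(X, y):
    """Leave-one-out 1-nearest-neighbour error rate R_NN, turned into
    Bayes-error bounds via the two-class Cover-Hart inequality
    R* <= R_NN <= 2 R* (1 - R*)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)               # exclude each point itself
    r_nn = np.mean(y[D.argmin(axis=1)] != y)
    lower = 0.5 * (1.0 - np.sqrt(max(1.0 - 2.0 * r_nn, 0.0)))
    return lower, r_nn                        # lower <= R* <= R_NN

rng = np.random.default_rng(5)
n = 400
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, 2)) + 1.5 * y[:, None]   # two overlapping Gaussians
lo, hi = nn_bayes_error_bounds(X, y)
print(f"{lo:.3f} <= Bayes error <= {hi:.3f}")
```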

Journal ArticleDOI
TL;DR: In this article, an empirical Bayes procedure is considered for estimating a mean value function imbedded within a collection of N stationary time series, which is a fixed function of time and one that is a stationary normal process.
Abstract: An empirical Bayes procedure is considered for estimating a mean value function imbedded within a collection of N stationary time series. Two cases are considered: a mean value function that is a fixed function of time and one that is a stationary normal process. Estimators for the spectral densities are derived from prior data using an extension to time series of the usual one-way analysis of variance. These estimators are inserted into approximations for the classical Bayes estimators to obtain empirical Bayes estimators. Asymptotic mean square errors of the empirical Bayes and maximum likelihood estimators are compared by an example.


Book ChapterDOI
Herman Rubin1
01 Jan 1977
TL;DR: In this paper, the authors discuss the robustness of the behavioristic Bayes approach for estimating the mean of a univariate normal with known variance, showing that the simple linear estimator has an infinite expected risk if the loss is squared error and the true prior has infinite variance.
Abstract: Publisher Summary There is a considerable body of literature expounding that proper statistical behavior is to take a convenient but not necessarily true prior, compute the posterior, and act accordingly. This chapter describes the way in which much can be gained with little lost, at least in the case of estimating the mean of a univariate normal with known variance. The simple linear estimator has an infinite expected risk if the loss is squared error and the true prior has infinite variance. Even truncating the risk does not help much. The chapter discusses the concept of robustness from the behavioristic Bayes approach. It describes the calculation of the risks for the Bayes estimates for non-normal priors by numerical integration for certain values of θ, followed by numerical integration over θ.
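
The kind of calculation described is straightforward to reproduce by quadrature. The sketch below computes the Bayes estimate (posterior mean) under a heavy-tailed Cauchy prior for x ~ N(θ, 1), then its frequentist squared-error risk at a few θ values, all by numerical integration; the prior and integration limits are assumed for illustration.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def bayes_estimate(x, prior_pdf):
    """Posterior mean of theta for x ~ N(theta, 1), by quadrature."""
    num = quad(lambda t: t * stats.norm.pdf(x - t) * prior_pdf(t), -30, 30)[0]
    den = quad(lambda t: stats.norm.pdf(x - t) * prior_pdf(t), -30, 30)[0]
    return num / den

def risk_at(theta, prior_pdf):
    """Frequentist squared-error risk of the Bayes estimate at theta."""
    f = lambda x: ((bayes_estimate(x, prior_pdf) - theta) ** 2
                   * stats.norm.pdf(x - theta))
    return quad(f, theta - 8, theta + 8, limit=100)[0]

cauchy_prior = stats.cauchy.pdf     # heavy tails: infinite prior variance
for theta in (0.0, 2.0, 5.0):
    print(theta, round(risk_at(theta, cauchy_prior), 3))
# The risk stays bounded in theta, unlike the simple linear rule, whose
# expected risk is infinite when the true prior has infinite variance.
```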

Book ChapterDOI
01 Jan 1977
TL;DR: The total Q on test (TQT) statistic as discussed by the authors was proposed to estimate the gamma hyperparameters of a given cumulative failure function for a certain component, and is used to perform Bayesian updating during a variety of lifetime testing programs in a manner similar to total time on test plots.
Abstract: Suppose the basic shape of the cumulative failure (hazard) function has been identified for a certain component, and that an unknown parameter θ for a new production run of similar components is to be estimated. In particular, suppose that the failure function is of proportional type, R(x) = θQ(x), where Q is the known shape function, and that θ is sampled from a prior gamma density. By using a new statistic, called the total Q on test (TQT), it is possible to perform Bayesian updating during a variety of lifetime testing programs in a manner similar to total time on test plots. This statistic can also be used with complete lifetime data, extending over several product runs, to identify the failure form Q and to estimate the gamma hyperparameters. Extensions include the use of several TQT statistics to estimate the relative strength of competing hazard functions.
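
Under the stated model the gamma prior is conjugate, and the TQT statistic is the sufficient quantity in the update. A minimal sketch, assuming a rate-parameterized Gamma(a, b) prior and an illustrative shape function Q(x) = x²:

```python
def tqt_update(a, b, failure_times, censor_times, Q):
    """Bayesian update of theta in the proportional cumulative-hazard
    model R(x) = theta * Q(x), with theta ~ Gamma(a, b) (rate form).
    The sufficient statistic is the 'total Q on test':
    TQT = sum of Q over every unit's observed failure or censoring time.
    Posterior: Gamma(a + #failures, b + TQT)."""
    tqt = sum(Q(t) for t in failure_times) + sum(Q(t) for t in censor_times)
    return a + len(failure_times), b + tqt

Q = lambda x: x ** 2        # assumed known shape (Weibull-type wear-out)
a_post, b_post = tqt_update(a=2.0, b=50.0,
                            failure_times=[3.1, 4.7, 6.0],
                            censor_times=[5.0, 5.0],
                            Q=Q)
print(f"posterior mean of theta: {a_post / b_post:.4f}")
```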

Journal ArticleDOI
TL;DR: In this paper, it is shown that the optimal sequential design depends on the ratio of the posterior variances of the two normal distributions, and there exist constants such that, when the above-mentioned ratio exceeds this constant, it is optimal to select the next observation from one distribution; unless it is the other distribution.
Abstract: It is desired to estimate the difference between the means of two independent normal distributions as accurately as possible and in a sequential manner when the total number of observations is fixed. The problem is posed in a Bayesian framework with conjugate prior distributions and squared error-loss function. It is shown that the optimal sequential design depends on the ratio of the posterior variances of the two means. There exist constants (dependent on the prior parameters, the number of observations taken from each distribution, and the number of observations remaining) such that, when the above-mentioned ratio exceeds this constant, it is optimal to select the next observation from one distribution; otherwise it is optimal to select it from the other distribution.
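
A myopic version of this design is easy to simulate. The sketch below, assuming known observation variances and normal-normal updates, takes each observation from whichever population most reduces the posterior variance of the difference of means; the paper's exact rule instead compares the posterior-variance ratio to step-dependent constants.

```python
def allocate(n_total, prior_vars, obs_vars):
    """Greedy sketch: give the next observation to whichever population
    most reduces the posterior variance of the difference of means
    (normal-normal model, known observation variances)."""
    post_var = list(prior_vars)
    counts = [0, 0]
    for _ in range(n_total):
        # posterior variance of each mean if sampled next
        new_var = [1.0 / (1.0 / post_var[i] + 1.0 / obs_vars[i])
                   for i in (0, 1)]
        gain = [post_var[i] - new_var[i] for i in (0, 1)]
        i = int(gain[1] > gain[0])
        post_var[i] = new_var[i]
        counts[i] += 1
    return counts

# Population 1's observations are 4x noisier; it ends up sampled about
# twice as often (a Neyman-type square-root allocation).
print(allocate(100, prior_vars=[1.0, 1.0], obs_vars=[1.0, 4.0]))
```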