
Showing papers on "Bayes' theorem published in 1968"


Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of estimating latent ability using the entire response pattern of free-response items, first in the general case and then in the case where the items are scored in a graded way, especially when the thinking process required for solving each item is assumed to be homogeneous.
Abstract: Estimation of latent ability using the entire response pattern of free-response items is discussed, first in the general case and then in the case where the items are scored in a graded way, especially when the thinking process required for solving each item is assumed to be homogeneous. The maximum likelihood estimator, the Bayes modal estimator, and the Bayes estimator obtained by using the mean-square error multiplied by the density function of the latent variate as the loss function are taken as our estimators. Sufficient conditions for the existence of a unique maximum likelihood estimator and a unique Bayes modal estimator are formulated with respect to an individual item rather than with respect to a whole set of items, which are useful especially in the situation where we are free to choose optimal items for a particular examinee out of the item library in which a sufficient number of items are stored with reliable quality controls. Advantages of the present methods are investigated by comparing them with those which make use of conventional dichotomous items or test scores, theoretically as well as empirically, in terms of the amounts of information, the standard errors of estimators, and the mean-square errors of estimators. The utility of the Bayes modal estimator as a computational compromise for the Bayes estimator is also discussed and observed. The relationship between the formula for the item characteristic function and the philosophy of scoring is observed with respect to dichotomous items.

3,031 citations
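The Bayes modal (MAP) estimation discussed above can be sketched numerically. A minimal illustration, assuming a two-parameter logistic item characteristic function and a standard normal prior over ability; the grid search, item parameters, and function names are illustrative, not the paper's:

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic item characteristic function: P(correct | theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def bayes_modal_estimate(responses, items, step=0.01, bound=4.0):
    """Bayes modal (MAP) ability estimate under a standard normal prior,
    found by a grid search over the log posterior."""
    n = int(bound / step)
    best_theta, best_lp = 0.0, float("-inf")
    for g in range(-n, n + 1):
        t = g * step
        lp = -0.5 * t * t  # log of the N(0, 1) prior, up to a constant
        for u, (a, b) in zip(responses, items):
            p = icc(t, a, b)
            lp += math.log(p) if u == 1 else math.log(1.0 - p)
        if lp > best_lp:
            best_theta, best_lp = t, lp
    return best_theta

# hypothetical item pool: (discrimination, difficulty) pairs
items = [(1.0, 0.0), (1.2, -0.5), (0.8, 0.5)]
```

With no responses the posterior is the prior, whose mode is 0; all-correct and all-wrong patterns pull the mode up and down respectively.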


Journal ArticleDOI
TL;DR: In this article, it was shown that there are no countably additive exchangeable distributions on the space of observations which give ties probability 0 and for which a next observation is conditionally equally likely to fall in any of the open intervals between successive order statistics of a given sample.
Abstract: A Bayesian approach to inference about the percentiles and other characteristics of a finite population is proposed. The approach does not depend upon, though it need not exclude, the use of parametric models. Some related questions concerning the existence of exchangeable distributions are considered. It is shown that there are no countably additive exchangeable distributions on the space of observations which give ties probability 0 and for which a next observation is conditionally equally likely to fall in any of the open intervals between successive order statistics of a given sample.

248 citations


Journal ArticleDOI
TL;DR: This textbook covers the basics of logic, probability and inductive logic, the traditional problem of induction, the Goodman paradox and the new riddle of induction, Mill's methods of experimental inquiry and the nature of causality, the probability calculus including Bayes' theorem, kinds of probability, and scientific inductive logic.
Abstract: Preface 1 BASICS OF LOGIC Introduction The Structure of Simple Statements The Structure of Complex Statements Simple and Complex Properties Validity 2 PROBABILITY AND INDUCTIVE LOGIC Introduction Arguments Logic Inductive versus Deductive Logic Epistemic Probability Probability and the Problems of Inductive Logic 3 THE TRADITIONAL PROBLEM OF INDUCTION Introduction Hume's Argument The Inductive Justification of Induction The Pragmatic Justification of Induction Summary 4 THE GOODMAN PARADOX AND THE NEW RIDDLE OF INDUCTION Introduction Regularities and Projection The Goodman Paradox The Goodman Paradox, Regularity, and the Principle of the Uniformity of Nature Summary 5 MILL'S METHODS OF EXPERIMENTAL INQUIRY AND THE NATURE OF CAUSALITY Introduction Causality and Necessary and Sufficient Conditions Mill's Methods The Direct Method of Agreement The Inverse Method of Agreement The Method of Difference The Combined Methods The Application of Mill's Methods Sufficient Conditions and Functional Relationships Lawlike and Accidental Conditions 6 THE PROBABILITY CALCULUS Introduction Probability, Arguments, Statements, and Properties Disjunction and Negation Rules Conjunction Rules and Conditional Probability Expected Value of a Gamble Bayes' Theorem Probability and Causality 7 KINDS OF PROBABILITY Introduction Rational Degree of Belief Utility Ramsey Relative Frequency Chance 8 PROBABILITY AND SCIENTIFIC INDUCTIVE LOGIC Introduction Hypothesis and Deduction Quantity and Variety of Evidence Total Evidence Convergence to the Truth ANSWERS TO SELECTED EXERCISES INDEX

185 citations
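The Bayes' theorem chapter in the table of contents above concerns the standard two-hypothesis form, which a short computation illustrates; the numbers below are made up for illustration:

```python
def bayes(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E) via Bayes' theorem for a binary hypothesis H."""
    # total probability of the evidence E
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e

# made-up numbers: 1% base rate, 90% hit rate, 5% false-positive rate
posterior = bayes(0.01, 0.90, 0.05)  # about 0.154
```

Even a strong test result leaves the posterior modest when the prior is small, which is the kind of point the probability-calculus chapter develops.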


Journal ArticleDOI
TL;DR: In this article, the authors present a coherent picture of the parameter estimation problem, starting from the theory of minimum risk or Bayes estimation, and show how other statistical estimation techniques (viz. maximum likelihood, Markov, and least-squares estimation) can be interpreted as special cases; the most important properties of these estimates are summarized.

61 citations


Journal ArticleDOI
TL;DR: When data are sampled from a population and subjects revise probability estimates about which population is being sampled, their revisions are less than the optimal amount calculated by using Bayes's theorem; they are conservative.

49 citations
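The optimal revision the abstract refers to follows from Bayes's theorem. In the classic bookbag-and-poker-chip style task, subjects typically report posteriors well below the Bayesian value; a sketch with illustrative parameters (not taken from the paper):

```python
from math import comb

def posterior_two_bags(p1, p2, reds, draws, prior=0.5):
    """Bayes-optimal P(bag 1 | sample) when bag i yields a red chip
    with probability p_i on each of `draws` independent draws."""
    l1 = comb(draws, reds) * p1**reds * (1 - p1)**(draws - reds)
    l2 = comb(draws, reds) * p2**reds * (1 - p2)**(draws - reds)
    return l1 * prior / (l1 * prior + l2 * (1 - prior))
```

With bags at 0.7 and 0.3 and a sample of 8 reds in 12 draws, the optimal posterior is about 0.97; conservative subjects in such experiments report values far closer to 0.5.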



Journal ArticleDOI
TL;DR: In this article, a new estimator for the parameters in the linear regression model is presented, using not only the usual random sample of observations, but also past experience in the form of previous estimates of parameters in similar but independent situations.
Abstract: SUMMARY New estimators for the parameters in the linear regression model are presented, using not only the usual random sample of observations, but also past experience in the form of previous estimates of parameters in similar but independent situations. The regression parameters are considered to be random variables. Bayes estimators are given for a squared error loss function. Even though the prior density of the parameter is unknown, the Bayes estimator can be written in terms of the marginal density of a sufficient statistic. This marginal density can be estimated empirically, thus forming the empirical Bayes estimator. Results of Monte Carlo simulation of empirical Bayes estimators are given which show that, even with a few past experiences, the empirical Bayes estimators have smaller mean squared errors than do the ordinary least-squares estimators, which use only current information.

24 citations
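The shrinkage idea behind the abstract above can be sketched in one dimension. Assuming a normal-normal setup in which past slope estimates stand in for the unknown prior, a simple illustrative combination (not the paper's estimator) is:

```python
def empirical_bayes_slope(ols_est, ols_var, past_estimates):
    """Shrink the current OLS slope toward the mean of past estimates,
    weighting by an empirical estimate of the prior variance."""
    m = sum(past_estimates) / len(past_estimates)
    v = sum((b - m) ** 2 for b in past_estimates) / (len(past_estimates) - 1)
    w = v / (v + ols_var)  # weight on the current data
    return w * ols_est + (1 - w) * m
```

When the current estimate is noisy relative to the spread of past experience, the result is pulled strongly toward the past mean, which is what produces the smaller mean squared errors reported in the Monte Carlo study.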


Journal ArticleDOI
H. W. Peers1
TL;DR: In this paper, the authors examined the coverage probability of Bayes interval estimates constructed according to different criteria and made suggestions for the choice of a prior density in the case of complete ignorance concerning the parameter under estimation.
Abstract: SUMMARY Coverage probabilities, in the confidence theory sense, of Bayes interval estimates constructed according to different criteria are examined. Suggestions are made for the choice of a prior density in the case of complete ignorance concerning the parameter under estimation. These densities are members of the family of relatively invariant densities discussed by Hartigan (1964). Confidence properties of a related family are also discussed.

19 citations


Journal ArticleDOI
TL;DR: The use in single-stage medical diagnosis of a recent, nearly assumption-free method of statistical discrimination that involves the application of Bayes' theorem to linearly smoothed, joint-symptom-probability estimates is explored.
Abstract: The use in single-stage medical diagnosis of a recent, nearly assumption-free method of statistical discrimination is explored. The method involves the application of Bayes' theorem to linearly smoothed, joint-symptom-probability estimates. The linear smoothing filter is optimized by an empirical performance-scoring technique that employs a single sample for both development and verification. Advantages of the method: acceptability of a wide variety of nonnormal symptom variables, ranging from dichotomous to seemingly continuous; lack of necessity for independence of symptom variables; simultaneous use of a large number of variables, on the order of magnitude of the number of patients observed; capacity to handle missing observations by creating for each incompletely observed variable a dichotomous dummy variable that indicates the missing values. Disadvantage: lack of automatic identification of important variables, as is obtained when the standard discriminant-function method applies.

18 citations
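The discrimination scheme above can be caricatured with additive smoothing standing in for the paper's optimized linear smoothing filter; the patterns, labels, and smoothing constant below are illustrative only:

```python
from collections import Counter

def train(patterns, labels, alpha=1.0):
    """Bayes classification from smoothed joint-symptom-pattern
    probabilities; alpha is an additive-smoothing constant."""
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    totals = Counter(labels)
    for x, y in zip(patterns, labels):
        counts[y][tuple(x)] += 1
    space = {tuple(x) for x in patterns}  # observed symptom patterns

    def classify(x):
        x = tuple(x)
        best, best_p = None, -1.0
        for c in classes:
            prior = totals[c] / len(labels)
            lik = (counts[c][x] + alpha) / (totals[c] + alpha * len(space))
            p = prior * lik  # proportional to the posterior
            if p > best_p:
                best, best_p = c, p
        return best

    return classify
```

The smoothing keeps rare or unseen symptom combinations from producing zero likelihoods, which is the practical point of smoothing joint-symptom-probability estimates.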


Journal ArticleDOI
TL;DR: In this article, a Bayes transformation is developed to convert prior estimates and confidence statements on the availability into posterior versions, which are consistent with the equivalent statements on failure-rate and repair-rate parameters.
Abstract: For any physical system there is always some degree of uncertainty regarding the values of the parameters governing the performance of that system. Uncertainties in the values of the failure rate λ and the repair rate s reflect themselves in an uncertainty in the value of the point availability, A = s/(s + λ). Treating these uncertain parameters as random variables, exact expressions for the mean, variance, and distribution of the point availability are derived by combining the distributions of the failure and repair rates. Hence we can construct estimates and confidence statements for the availability which are consistent with the equivalent statements on the failure-rate and repair-rate parameters. Exact mean and variance results are also provided for mission, transient, and other time-dependent availability expressions. The acquisition of failure and repair data introduces additional information as well as additional uncertainties. A Bayes transformation is developed which utilizes the two data sets to readily convert prior estimates and confidence statements on the availability into posterior versions. This particular paper is restricted to a basic model involving an alternating sequence of independent exponentially distributed operational and repair intervals with the respective rate parameters described by distinct gamma distributions. For this model the point availability proves to have an Euler distribution.

17 citations
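The distribution of the point availability can also be explored by simulation: draw the two rates from their gamma distributions and transform. A minimal sketch, writing mu for the repair rate (s in the abstract); the shape and scale parameters are illustrative:

```python
import random

def availability_samples(n, shape_f, scale_f, shape_r, scale_r, seed=0):
    """Draw (lambda, mu) from independent gamma distributions and
    form the point availability A = mu / (mu + lambda)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        lam = rng.gammavariate(shape_f, scale_f)  # failure rate
        mu = rng.gammavariate(shape_r, scale_r)   # repair rate
        out.append(mu / (mu + lam))
    return out
```

The sample mean and variance approximate the exact moments the paper derives in closed form; with repair much faster than failure, the availability concentrates near 1.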


Journal ArticleDOI
TL;DR: An upper bound estimate on the Bayes misclassification error is derived using a kernel function to estimate probability density functions and a relationship between a Euclidean distance and a probability of misclassification is indicated.
Abstract: An upper bound estimate on the Bayes misclassification error is derived using a kernel function to estimate probability density functions. As a result of this bound, a relationship between a Euclidean distance and a probability of misclassification is indicated.

Journal ArticleDOI
TL;DR: The methods are shown to be asymptotically optimal and results of studies involving finite sample numbers are reported, giving some indication of the rate of approach to optimality in specific cases.
Abstract: SUMMARY The smooth empirical Bayes approach is based on using past observations to estimate a specified type of approximation to the prior distribution. This approach is applied to problems of empirical Bayes hypothesis testing and to the compound decision problem. Problems involving composite hypotheses are considered which have, hitherto, received much less attention than the case of two simple hypotheses. The methods are shown to be asymptotically optimal and results of studies involving finite sample numbers are reported, giving some indication of the rate of approach to optimality in specific cases.

Journal ArticleDOI
TL;DR: In this article, a Bayes measure of discordance of an observation x is defined to be the distance between the posterior distributions of a parameter, in the presence or absence of x, and a measure of dissimilarity between two observations is also proposed.
Abstract: SUMMARY Given a set of observations xl, ..., xn, a Bayes measure of discordance of an observation x is defined to be the distance between the posterior distributions of a parameter, in the presence or absence of x. A measure of dissimilarity between two observations is also proposed. For large numbers of observations, these two measures may be approximated by simple functions of the log likelihood, thereby avoiding dependence on prior distributions. The theory is applied to data in which 13 judges ranked 20 mothers; each judge is supposed to give an independent observation on the mothers, with the analysis showing which judges are discordant from the rest and which judges are similar to each other.
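A concrete instance of the first measure: take a normal location parameter with known noise variance and a normal prior, and use the shift in posterior mean as a stand-in for the distance between the two posteriors (the paper's measure is a distance between the full distributions; the prior and variances below are illustrative):

```python
def posterior_params(xs, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Normal-normal conjugate posterior (mean, variance) for a location
    parameter given observations xs with known noise variance."""
    prec = 1.0 / prior_var + len(xs) / noise_var
    mean = (prior_mean / prior_var + sum(xs) / noise_var) / prec
    return mean, 1.0 / prec

def discordance(xs, i, **kw):
    """Shift in posterior mean between including and excluding xs[i]."""
    m_all, _ = posterior_params(xs, **kw)
    m_rest, _ = posterior_params(xs[:i] + xs[i + 1:], **kw)
    return abs(m_all - m_rest)
```

An outlying observation moves the posterior far more when included than a typical one does, so it scores as discordant.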



Journal ArticleDOI
TL;DR: In this paper, an expository account in terms of moment space dependence is given of the Bayes estimate of a random probability relative to squared difference loss, from an observable $X$ which given the first $N$ moments is conditionally binomial.
Abstract: Let $N$ be a positive integer. In Section 2 an expository account in terms of moment space dependence is given of the Bayes estimate of a random probability $\Theta$, relative to squared difference loss, from an observable $X$ which given $\Theta$ is conditionally binomial $(N, \Theta)$. The risk and Bayes envelope functional are also considered in these terms. In Section 3 an explicit formulation is given for the minimax estimate of $\Theta$ when its first $N$ moments are known. Theorem 2 characterizes the condition that a Bayes estimate have constant risk over the class of all "priors" which yield these moments. In Section 4, a transformation is introduced which puts the interior of the space of the first $N$ moments for distributions on $\lbrack 0, 1\rbrack$ in one-one correspondence with the interior of the $N$-dimensional unit cube. This transformation is used to show that the supremum of the difference between minimax and Bayes risks over the class of all prior distributions is bounded above by $2^{-N}$. Examples for $N = 1, 2$, and 3 in terms of the above transformation are considered in Section 5.
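The abstract works in terms of moment spaces, but the familiar conjugate special case makes the Bayes estimate concrete. Assuming a Beta(a, b) prior (an assumption not in the paper), the posterior mean, which is the Bayes estimate under squared difference loss, is:

```python
def bayes_binomial_estimate(x, n, a=1.0, b=1.0):
    """Posterior mean of theta under a Beta(a, b) prior after observing
    x successes in n conditionally binomial trials; under squared-error
    loss the posterior mean is the Bayes estimate."""
    return (x + a) / (n + a + b)
```

With the uniform prior a = b = 1 this is the classic (x + 1)/(n + 2) rule; the paper's contribution is to characterize such estimates when only the first N moments of the prior are known.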

22 Jan 1968
TL;DR: An adaptive, multicategory, pattern classification system for classifying statistical patterns is formulated that finds application in those instances when the probability densities and a priori probabilities of occurrence of the classes are unknown.
Abstract: : An adaptive, multicategory, pattern classification system for classifying statistical patterns is formulated. The system finds application in those instances when the probability densities and a priori probabilities of occurrence of the classes are unknown. The convergence rate and other special properties of the system are examined, including the special case where the expected loss due to misclassification by the system tends to the minimum expected loss which results when using the Bayes discriminant functions. In addition, a simulation of the system for a three-category problem using quadratic discriminant functions is presented. (Author)

Journal ArticleDOI
David Middleton1
TL;DR: In this paper, the usual Bayes analysis for optimum detection systems is extended to include the costs of equipment, location, maintenance, and other features of operation, in addition to the customary preassigned costs of correct and incorrect decisions, when multiple receiving sites (Q) and sensors (M) are employed for acquisition and ultimately joint processing of data for simple binary (i.e., yes or no) decisions as to the presence or absence of a signal source.
Abstract: The usual Bayes analysis for optimum detection systems is extended to include the costs of equipment, location, maintenance, and other features of operation, in addition to the customary preassigned costs of correct and incorrect decisions, when multiple receiving sites (Q) and sensors (M) are employed for acquisition and ultimately joint processing of data for simple binary (i.e., yes or no) decisions as to the presence or absence of a signal source. A risk formalism is constructed which indicates how the expected overall costs of both decision and operation can be generally determined for a variety of realistic cost models. Since the expected cost or average risk of decision is a monotonically decreasing function of M and Q, while the associated costs of operation are reasonably described by monotonically increasing functions of M and Q, values of M and Q may also exist for which the total average cost (or risk) can be minimized, as well as the cost of decision itself (i.e., for a Bayes decision system). In any case, the physics of the particular detection situation (radar, radio, seismic, acoustic, etc.) is included in the usual way, under the important constraint of decision optimality (i.e., Bayes decision). Thus, one purpose of this introductory study is to provide some possible models for system evaluation and comparison which specifically include the often controlling factors of operational costs (vis-a-vis those of decision). A simple analytic example is used to illustrate the approach, which is, however, capable of handling much more general examples, both conceptually and quantitatively. In these more general cases, it is expected that computer aids will be needed.
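The trade-off described above can be mocked up with toy monotone cost models: a decision risk that falls in M·Q against an operating cost that rises in M + Q. The functional forms and constants below are invented purely for illustration:

```python
def total_cost(M, Q, c_decision, c_op):
    """Total average cost: decision risk decreasing in M*Q plus
    operating cost increasing in M + Q (toy monotone models)."""
    return c_decision / (M * Q) + c_op * (M + Q)

def best_config(max_m, max_q, c_decision=100.0, c_op=1.0):
    """Exhaustive search for the (M, Q) minimizing total average cost."""
    return min((total_cost(m, q, c_decision, c_op), m, q)
               for m in range(1, max_m + 1) for q in range(1, max_q + 1))
```

Because one term decreases and the other increases in M and Q, an interior minimum exists, which is exactly the existence argument the abstract makes.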

Journal ArticleDOI
01 Dec 1968
TL;DR: The cost analysis of Teichmann is extended to include frequency-dependent and diversity-dependent rain losses and a Bayes strategy of minimizing expected satellite system cost per information bit suggests that frequencies at or above K band can economically be considered.
Abstract: The cost analysis of Teichmann is extended to include frequency-dependent and diversity-dependent rain losses. A Bayes strategy of minimizing expected satellite system cost per information bit then suggests that frequencies at or above K band can economically be considered.

Journal ArticleDOI
TL;DR: It is shown that, as the classification process unfolds, any updating scheme that causes the Bayes classifier ultimately to learn the true values of the conditional probabilities also minimizes the expected processing time in the second stage.
Abstract: A 2-stage classification model is presented in which the first stage is a quick, computerized Bayes rule decision device, and the second a slow, but perfectly accurate, classifier. A stationary stream of elements or objects to be classified into one of several mutually exclusive categories is fed into the model. The conditional probabilities associated with the Bayes device are assumed unknown at the outset, except up to an initial probability distribution. The a posteriori probabilities from the first stage are treated as information that can speed up or slow down the processing time in the second stage. The latter, after a delay time, feeds back accurate classification information to the first stage to update the conditional probabilities. It is shown that, as the classification process unfolds, any updating scheme that causes the Bayes classifier ultimately to learn the true values of the conditional probabilities also minimizes the expected processing time in the second stage. The learning rate of the system is discussed as a function of the updating scheme. An example of a simple system is presented and the learning rate is derived specifically for that case.
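One simple updating scheme of the kind the theorem covers is a running count of second-stage feedback. A sketch assuming uniform initial counts (the class name and scheme are illustrative, not the paper's):

```python
class TwoStageCounts:
    """Running estimates of P(true category | first-stage label),
    updated from the accurate second-stage feedback."""

    def __init__(self, categories):
        # start each cell at 1: a uniform initial probability distribution
        self.counts = {c: {c2: 1 for c2 in categories} for c in categories}

    def update(self, first_stage_label, true_category):
        self.counts[first_stage_label][true_category] += 1

    def prob(self, first_stage_label, category):
        row = self.counts[first_stage_label]
        return row[category] / sum(row.values())
```

As feedback accumulates, the estimates converge to the true conditional probabilities, which by the paper's result also minimizes the expected processing time in the second stage.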


Journal ArticleDOI
TL;DR: In this article, the authors studied relations among the class B of the Bayes solutions in the strict sense, the class W of the Bayes solutions in the wide sense, and the closure c(B), in a certain sense, of the class B.
Abstract: In this paper we shall study relations among the class B of the Bayes solutions in the strict sense, the class W of the Bayes solutions in the wide sense and the closure c(B), in a certain sense, of the class B. For abbreviation, we shall use the word “Bayes class” for the class B and “Wald class” for the class W.


Proceedings ArticleDOI
01 Dec 1968
TL;DR: In this article, the authors considered the empirical Bayes decision problem with unknown transition probabilities and pdf's and showed the convergence of the risk of the proposed decision function to the optimal risk, without requiring that the signal-to-noise ratio converge to zero.
Abstract: The empirical Bayes decision problem is considered. Let {a_i}, i = 1, ..., N, be a sequence of Markov dependent random variables, a_i ∈ {1, ..., m}, where a_i denotes the category of the ith sample, also called the state of nature. Let p_{lj} be the elements of the transition matrix of the Markov process and consider that the initial probabilities are equal to the steady state probabilities of the Markov chain. Let x_N = {x_1, ..., x_N} be a sequence of random observations where x_i has probability density function f_{a_i}(x_i). Suppose that the receiver does not know which state of nature is acting after the reception of the sample x_i, and that after N observations it is desired to partition the received samples into m sets with minimum probability of misclassification with respect to the true partition induced by the states of nature. Such a problem may arise in recognition of written characters, Ref. [4], and in receiving signals over a noisy channel with intersymbol interference, Ref. [3]. In the present work it is assumed that the f_j(x_i) are unknown; it is only known that the f_j(x_i) belong to a family F of pdf's with known functional form. It is also assumed that the probability transition matrix of the Markov chain is unknown. It is shown that if the family F of pdf's satisfies certain identifiability and differentiability conditions, then by using moment estimates of the unknown quantities, a decision function t* can be determined such that the corresponding risk converges to the optimal Bayes risk. The present work extends the results obtained in Ref. [4] by considering that the transition probabilities and the pdf's are unknown. The work of Ref. [3] is extended by showing the convergence of the risk corresponding to t* to the optimal risk, without requiring that the signal-to-noise ratio converge to zero.

Proceedings ArticleDOI
01 Dec 1968
TL;DR: A recursive Bayes optimal solution is found for the problem of sequential, multicategory pattern recognition when unsupervised learning is required, and is shown to be realizable in recursive form with fixed memory requirements.
Abstract: A recursive Bayes optimal solution is found for the problem of sequential, multicategory pattern recognition, when unsupervised learning is required. An unknown parameter model is developed which, for the pattern classification problem, allows for (i) both constant and time-varying unknown parameters, (ii)partially unknown probability laws of the hypotheses and time-varying parameter sequences, (iii) dependence of the observations on past as well as present hypotheses and parameters, and most significantly, (iv) sequential dependencies in the observations arising from either (or both) dependency in the pattern or information source (context dependence) or in the observation medium (sequential measurement correlation), these dependencies being up to any finite Markov orders. For finite parameter spaces, the solution which is Bayes optimal (minimum risk) at each step is found and shown to be realizable in recursive form with fixed memory requirements. The asymptotic properties of the optimal solution are studied and conditions established for the solution (in addition to making best use of available data at each step) to converge in performance to operation with knowledge of the (unobservable constant unknown parameters.