Book

Statistical Decision Theory and Bayesian Analysis

22 Dec 2012
TL;DR: An overview of statistical decision theory, which emphasizes the use and application of the philosophical ideas and mathematical structure of decision theory.
Abstract: Contents: 1. Basic concepts; 2. Utility and loss; 3. Prior information and subjective probability; 4. Bayesian analysis; 5. Minimax analysis; 6. Invariance; 7. Preposterior and sequential analysis; 8. Complete and essentially complete classes; Appendices.
Citations
Journal ArticleDOI
TL;DR: It is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution.
Abstract: In the first part of this paper, a regular recurrent neural network (RNN) is extended to a bidirectional recurrent neural network (BRNN). The BRNN can be trained without the limitation of using input information just up to a preset future frame. This is accomplished by training it simultaneously in positive and negative time direction. Structure and training procedure of the proposed network are explained. In regression and classification experiments on artificial data, the proposed structure gives better results than other approaches. For real data, classification experiments for phonemes from the TIMIT database show the same tendency. In the second part of this paper, it is shown how the proposed bidirectional structure can be easily modified to allow efficient estimation of the conditional posterior probability of complete symbol sequences without making any explicit assumption about the shape of the distribution. For this part, experiments on real data are reported.
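The forward/backward scheme described in the abstract can be sketched with a toy single-unit recurrent network. This is a minimal illustration, not the paper's TIMIT setup: the weights `w_in` and `w_rec` are arbitrary placeholders and there is no training loop.

```python
import math

def rnn_pass(xs, w_in, w_rec):
    """Single-unit recurrent pass: h_t = tanh(w_in * x_t + w_rec * h_{t-1})."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

def brnn_states(xs, w_in=0.5, w_rec=0.3):
    """Run one pass in each time direction over the same input; the pair
    of states at position t then depends on the whole sequence."""
    fwd = rnn_pass(xs, w_in, w_rec)              # positive time direction
    bwd = rnn_pass(xs[::-1], w_in, w_rec)[::-1]  # negative direction, realigned
    return list(zip(fwd, bwd))

states = brnn_states([1.0, -0.5, 2.0, 0.0])
```

Note how the forward state at the first position depends only on the first input, while the backward state at the last position depends only on the last input; every interior pair sees context from both sides, which is the point of the bidirectional structure.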

7,290 citations


Additional excerpts

  • ...These merging procedures are referred to as linear opinion pooling and logarithmic opinion pooling, respectively [1], [7]....

    [...]
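The two merging rules named in the excerpt can be illustrated with a minimal sketch: linear pooling averages the experts' probabilities, logarithmic pooling takes a renormalised weighted geometric mean. The two expert distributions and the equal weights below are made-up examples.

```python
import math

def linear_pool(dists, weights):
    """Linear opinion pool: weighted arithmetic mean of distributions."""
    return [sum(w * d[i] for w, d in zip(weights, dists))
            for i in range(len(dists[0]))]

def log_pool(dists, weights):
    """Logarithmic opinion pool: weighted geometric mean, renormalised."""
    raw = [math.prod(d[i] ** w for w, d in zip(weights, dists))
           for i in range(len(dists[0]))]
    z = sum(raw)
    return [r / z for r in raw]

experts = [[0.8, 0.2], [0.4, 0.6]]   # two experts, two outcomes
weights = [0.5, 0.5]
lin = linear_pool(experts, weights)
log_ = log_pool(experts, weights)
```

With equal weights, the linear pool here gives [0.6, 0.4], while the logarithmic pool concentrates slightly more mass on the outcome both experts lean toward.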

Journal ArticleDOI
Michael E. Tipping1
TL;DR: It is demonstrated that by exploiting a probabilistic Bayesian learning framework, the 'relevance vector machine' (RVM) can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages.
Abstract: This paper introduces a general Bayesian framework for obtaining sparse solutions to regression and classification tasks utilising models linear in the parameters. Although this framework is fully general, we illustrate our approach with a particular specialisation that we denote the 'relevance vector machine' (RVM), a model of identical functional form to the popular and state-of-the-art 'support vector machine' (SVM). We demonstrate that by exploiting a probabilistic Bayesian learning framework, we can derive accurate prediction models which typically utilise dramatically fewer basis functions than a comparable SVM while offering a number of additional advantages. These include the benefits of probabilistic predictions, automatic estimation of 'nuisance' parameters, and the facility to utilise arbitrary basis functions (e.g. non-'Mercer' kernels). We detail the Bayesian framework and associated learning algorithm for the RVM, and give some illustrative examples of its application along with some comparative benchmarks. We offer some explanation for the exceptional degree of sparsity obtained, and discuss and demonstrate some of the advantageous features, and potential extensions, of Bayesian relevance learning.
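The type-II maximum likelihood idea behind the RVM can be sketched for a single basis function: alternate between computing the posterior over the weight given a prior precision `alpha`, and re-estimating `alpha` from that posterior (the MacKay/Tipping fixed-point update). The data, noise variance `sigma2`, and initial `alpha` below are illustrative assumptions, not values from the paper.

```python
import math

# Toy data: y_i = w * phi_i + small deterministic "noise", one basis function
N, sigma2, true_w = 50, 0.01, 2.0
phi = [1.0] * N
y = [true_w * p + 0.1 * math.sin(i) for i, p in enumerate(phi)]

alpha = 1.0  # prior precision on the single weight w ~ N(0, 1/alpha)
for _ in range(100):
    # Posterior over w given alpha (standard Gaussian linear-model algebra)
    A = alpha + sum(p * p for p in phi) / sigma2              # posterior precision
    mu = (sum(p * t for p, t in zip(phi, y)) / sigma2) / A    # posterior mean
    gamma = 1.0 - alpha / A       # how well-determined the weight is by the data
    alpha = gamma / (mu * mu)     # type-II ML re-estimate of the hyperparameter
```

In the full RVM one such `alpha_i` exists per basis function; for irrelevant functions the update drives `alpha_i` toward infinity, pruning the weight to zero, which is the source of the sparsity discussed above.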

5,116 citations


Cites background or methods from "Statistical Decision Theory and Bay..."

  • ...In related Bayesian models, this quantity is known as the marginal likelihood, and its maximisation known as the type-II maximum likelihood method (Berger, 1985). The marginal likelihood is also referred to as the "evidence for the hyperparameters" by MacKay (1992a), and its maximisation as the "evidence procedure"....

    [...]

  • ...In related Bayesian models, this quantity is known as the marginal likelihood, and its maximisation known as the type-II maximum likelihood method (Berger, 1985)....

    [...]

Posted Content
TL;DR: Elegant connections are demonstrated between the concepts of Informedness, Markedness, Correlation and Significance, as well as their intuitive relationships with Recall and Precision.
Abstract: Commonly used evaluation measures including Recall, Precision, F-Measure and Rand Accuracy are biased and should not be used without clear understanding of the biases, and corresponding identification of chance or base case levels of the statistic. Using these measures a system that performs worse in the objective sense of Informedness can appear to perform better under any of these commonly used measures. We discuss several concepts and measures that reflect the probability that prediction is informed versus chance, in particular Informedness, and introduce Markedness as a dual measure for the probability that prediction is marked versus chance. Finally we demonstrate elegant connections between the concepts of Informedness, Markedness, Correlation and Significance as well as their intuitive relationships with Recall and Precision, and outline the extension from the dichotomous case to the general multi-class case.
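For the dichotomous case these measures come straight from the confusion matrix: Informedness is sensitivity + specificity − 1 (Youden's J) and Markedness is PPV + NPV − 1; the Matthews correlation coefficient is their geometric mean. The confusion-matrix counts below are a made-up example.

```python
import math

def informedness_markedness(tp, fp, fn, tn):
    recall = tp / (tp + fn)          # sensitivity, true positive rate
    inv_recall = tn / (tn + fp)      # specificity, true negative rate
    precision = tp / (tp + fp)       # positive predictive value
    inv_precision = tn / (tn + fn)   # negative predictive value
    informedness = recall + inv_recall - 1       # Youden's J
    markedness = precision + inv_precision - 1
    return informedness, markedness

tp, fp, fn, tn = 40, 10, 20, 30      # illustrative counts
inf_, mark = informedness_markedness(tp, fp, fn, tn)

# Matthews correlation computed directly from the confusion matrix
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
```

For these counts Informedness is 5/12 and Markedness is 2/5, and the direct MCC equals sqrt(Informedness × Markedness), illustrating the connection the abstract refers to.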

5,092 citations


Cites background from "Statistical Decision Theory and Bay..."

  • ...See Sellke, Bayarri, and Berger (2001) for a comprehensive discussion on issues with significance testing, as well as Monte Carlo simulations....

    [...]

Journal ArticleDOI
TL;DR: In this article, the antecedents and consequences of customer satisfaction were investigated in a survey of 22,300 customers of a variety of major products and services in Sweden in 1989-1990.
Abstract: This research investigates the antecedents and consequences of customer satisfaction. We develop a model to link explicitly the antecedents and consequences of satisfaction in a utility-oriented framework. We estimate and test the model against alternative hypotheses from the satisfaction literature. In the process, a unique database is analyzed: a nationally representative survey of 22,300 customers of a variety of major products and services in Sweden in 1989-1990. Several well-known experimental findings of satisfaction research are tested in a field setting of national scope. For example, we find that satisfaction is best specified as a function of perceived quality and "disconfirmation"-the extent to which perceived quality fails to match prepurchase expectations. Surprisingly, expectations do not directly affect satisfaction, as is often suggested in the satisfaction literature. In addition, we find quality which falls short of expectations has a greater impact on satisfaction and repurchase intentions than quality which exceeds expectations. Moreover, we find that disconfirmation is more likely to occur when quality is easy to evaluate. Finally, in terms of systematic variation across firms, we find the elasticity of repurchase intentions with respect to satisfaction to be lower for firms that provide high satisfaction. This implies a long-run reputation effect insulating firms which consistently provide high satisfaction.

4,606 citations

BookDOI
31 Mar 2010
TL;DR: Semi-supervised learning (SSL), as discussed by the authors, occupies the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given).
Abstract: In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground, between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no label data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research. Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction. Adaptive Computation and Machine Learning series
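A classic SSL baseline in the spirit of the strategies surveyed above is self-training: fit a classifier on the labeled data, then repeatedly label the unlabeled point it is most confident about and refit. The sketch below uses a nearest-centroid classifier on 1-D data; the data and confidence rule are illustrative assumptions, not a method from the book.

```python
def self_train(labeled, unlabeled):
    """labeled: list of (x, y) pairs; unlabeled: list of x values.
    Greedy self-training with a nearest-centroid classifier."""
    labeled, pool = list(labeled), list(unlabeled)
    while pool:
        # class centroids from everything labelled so far
        cents = {}
        for c in {y for _, y in labeled}:
            pts = [x for x, y in labeled if y == c]
            cents[c] = sum(pts) / len(pts)

        # confidence = gap between nearest and second-nearest centroid
        def margin(x):
            d = sorted(abs(x - m) for m in cents.values())
            return d[1] - d[0]

        x = max(pool, key=margin)          # most confident unlabeled point
        pool.remove(x)
        y = min(cents, key=lambda c: abs(x - cents[c]))
        labeled.append((x, y))             # pseudo-label it and refit
    return labeled

result = self_train([(-2.0, 0), (2.0, 1)], [-1.5, -0.5, 0.8, 1.7])
```

With two labeled seeds at -2 and +2, the loop propagates labels outward-in, ending with negative points in class 0 and positive points in class 1; this mirrors the low-density separation assumption on a trivially separable example.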

3,773 citations