Journal ArticleDOI

Pattern Recognition and Machine Learning

01 Aug 2007-Technometrics (Taylor & Francis)-Vol. 49, Iss: 3, pp 366-366
TL;DR: This book covers a broad range of topics in pattern recognition and machine learning from a Bayesian perspective, including linear models, kernel methods, graphical models, and approximate inference, and is an invaluable resource for researchers and graduate students in the field.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.
Citations
Posted Content
TL;DR: Stochastic regeneration is shown to achieve linear runtime scaling in cases where many previous approaches scaled quadratically, and it is shown how to use stochastic regeneration and the SPI to implement general-purpose inference strategies such as Metropolis-Hastings, Gibbs sampling, and blocked proposals based on particle Markov chain Monte Carlo and mean-field variational inference techniques.
Abstract: We describe Venture, an interactive virtual machine for probabilistic programming that aims to be sufficiently expressive, extensible, and efficient for general-purpose use. Like Church, probabilistic models and inference problems in Venture are specified via a Turing-complete, higher-order probabilistic language descended from Lisp. Unlike Church, Venture also provides a compositional language for custom inference strategies built out of scalable exact and approximate techniques. We also describe four key aspects of Venture's implementation that build on ideas from probabilistic graphical models. First, we describe the stochastic procedure interface (SPI) that specifies and encapsulates primitive random variables. The SPI supports custom control flow, higher-order probabilistic procedures, partially exchangeable sequences and "likelihood-free" stochastic simulators. It also supports external models that do inference over latent variables hidden from Venture. Second, we describe probabilistic execution traces (PETs), which represent execution histories of Venture programs. PETs capture conditional dependencies, existential dependencies and exchangeable coupling. Third, we describe partitions of execution histories called scaffolds that factor global inference problems into coherent sub-problems. Finally, we describe a family of stochastic regeneration algorithms for efficiently modifying PET fragments contained within scaffolds. Stochastic regeneration achieves linear runtime scaling in cases where many previous approaches scaled quadratically. We show how to use stochastic regeneration and the SPI to implement general-purpose inference strategies such as Metropolis-Hastings, Gibbs sampling, and blocked proposals based on particle Markov chain Monte Carlo and mean-field variational inference techniques.
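The inference strategies named above are standard MCMC building blocks. As orientation for readers, here is a minimal sketch of a single random-walk Metropolis-Hastings update in Python; it illustrates the kernel in its simplest scalar form and is not Venture's stochastic-regeneration implementation (all names here are illustrative):

```python
import math
import random

def metropolis_hastings_step(x, log_density, proposal_std=0.5):
    """One random-walk Metropolis-Hastings update for a scalar latent x.

    log_density is the unnormalized log target. The Gaussian random-walk
    proposal is symmetric, so the acceptance ratio needs no proposal terms.
    """
    x_prop = x + random.gauss(0.0, proposal_std)
    log_accept = log_density(x_prop) - log_density(x)
    if log_accept >= 0 or random.random() < math.exp(log_accept):
        return x_prop  # accept the proposal
    return x           # reject: keep the current state

# Example target: standard normal, log p(x) = -x^2/2 up to a constant.
x, samples = 0.0, []
for _ in range(10_000):
    x = metropolis_hastings_step(x, lambda v: -0.5 * v * v)
    samples.append(x)
```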

237 citations


Cites methods from "Pattern Recognition and Machine Lea..."

  • ...Probabilistic modeling and approximate Bayesian inference have proven to be powerful tools in multiple fields, from machine learning (Bishop, 2006) and statistics (Green et al....

    [...]

  • ...Probabilistic modeling and approximate Bayesian inference have proven to be powerful tools in multiple fields, from machine learning (Bishop, 2006) and statistics (Green et al., 2003; Gelman et al., 1995) to robotics (Thrun et al., 2005), artificial intelligence (Russell and Norvig, 2002), and…...

    [...]

Proceedings Article
08 Dec 2014
TL;DR: It is shown how an EP-based approach can also be used to train deterministic MNNs, yielding an analytical approximation to the Bayes update of the posterior over the weights, as well as the resulting Bayes estimates of the weights and outputs.
Abstract: Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP-based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge. (2) The weights can be restricted to have discrete values. This is especially useful for implementing trained MNNs in precision-limited hardware chips, thus improving their speed and energy efficiency by several orders of magnitude. We test the EBP algorithm numerically in eight binary text classification tasks. In all tasks, EBP outperforms: (1) standard BP with the optimal constant learning rate, and (2) the previously reported state of the art. Interestingly, EBP-trained MNNs with binary weights usually perform better than MNNs with continuous (real) weights, if we average the MNN output using the inferred posterior.
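The analytical step the abstract refers to can be made concrete for a single neuron: under a factorized Gaussian posterior on the weights, the central limit theorem makes the pre-activation approximately Gaussian, so the probability of a positive binary output has a closed probit form. A minimal sketch of that idea (illustrative only, not the paper's full EBP algorithm):

```python
import numpy as np
from math import erf, sqrt

def prob_output_positive(x, w_mean, w_var):
    """P(sign(w . x) = +1) under a factorized Gaussian posterior on w.

    By the central limit theorem, u = w . x is approximately Gaussian with
    mean m = w_mean . x and variance s2 = sum(w_var * x**2), so
    P(u > 0) = Phi(m / sqrt(s2)).
    """
    m = np.dot(w_mean, x)
    s2 = np.dot(w_var, x ** 2)
    return 0.5 * (1.0 + erf(m / sqrt(2.0 * s2)))

x = np.array([1.0, -0.5, 2.0])
w_mean = np.array([0.3, 0.1, -0.2])
w_var = np.ones(3)  # marginal posterior variances of the weights
print(prob_output_positive(x, w_mean, w_var))
```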

237 citations


Cites methods from "Pattern Recognition and Machine Lea..."

  • ...The factors P̂ (Wij,l|Dn) can be found using a standard variational approach [5, 24]....

    [...]

  • ...Previous EP [24, 22] and message passing [6, 1] (a special case of EP[5]) based methods were derived only for SNNs....

    [...]

Proceedings ArticleDOI
09 Feb 2011
TL;DR: This paper models the grouping of synonym product features in review sentiment analysis as a semi-supervised learning problem, exploits lexical characteristics to automatically identify labeled examples, and proposes a method that outperforms existing state-of-the-art methods.
Abstract: In sentiment analysis of product reviews, one important problem is to produce a summary of opinions based on product features/attributes (also called aspects). However, for the same feature, people can express it with many different words or phrases. To produce a useful summary, these words and phrases, which are domain synonyms, need to be grouped under the same feature group. Although several methods have been proposed to extract product features from reviews, limited work has been done on clustering or grouping of synonym features. This paper focuses on this task. Classic methods for solving this problem are based on unsupervised learning using some forms of distributional similarity. However, we found that these methods do not do well. We then model it as a semi-supervised learning problem. Lexical characteristics of the problem are exploited to automatically identify some labeled examples. Empirical evaluation shows that the proposed method outperforms existing state-of-the-art methods by a large margin.

236 citations


Cites methods from "Pattern Recognition and Machine Lea..."

  • ...Note that, the Kmeans algorithm corresponds to a particular non-probabilistic limit of EM applied to mixtures of Gaussians [4]....

    [...]

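The point quoted above, that K-means is a non-probabilistic limit of EM for mixtures of Gaussians (Bishop, 2006, Section 9.3.2), can be seen by fixing a shared isotropic covariance eps*I and letting eps shrink: the E-step responsibilities harden into nearest-mean indicators and the M-step becomes the K-means centroid update. A minimal sketch (illustrative, not the cited paper's code):

```python
import numpy as np

def em_isotropic(X, means, eps, n_iter=50):
    """EM for a Gaussian mixture with fixed covariance eps*I and equal
    mixing weights. As eps -> 0, responsibilities become one-hot
    nearest-mean indicators and the M-step is the K-means update."""
    for _ in range(n_iter):
        # E-step: responsibilities proportional to exp(-||x - mu_k||^2 / (2 eps))
        d2 = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
        logits = -d2 / (2.0 * eps)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(logits)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means (centroids in the limit)
        means = (r.T @ X) / r.sum(axis=0)[:, None]
    return means

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
print(em_isotropic(X, X[[0, -1]].copy(), eps=1e-3))  # recovers the two blob means
```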

Journal ArticleDOI
TL;DR: The study makes a claim, and offers sound evidence for the observation, that higher fuzziness of a fuzzy classifier may imply better generalization, especially for classification data exhibiting complex boundaries.
Abstract: We investigate essential relationships between generalization capabilities and fuzziness of fuzzy classifiers (viz., classifiers whose outputs are vectors of membership grades of a pattern to the individual classes). The study makes a claim, and offers sound evidence for the observation, that higher fuzziness of a fuzzy classifier may imply better generalization, especially for classification data exhibiting complex boundaries. This observation runs counter to a commonly accepted position in "traditional" pattern recognition. The relationship, which obeys the conditional maximum entropy principle, is experimentally confirmed. Furthermore, it can be explained by the fact that samples located close to classification boundaries are harder to classify correctly than samples positioned far from the boundaries. This relationship is expected to provide some guidelines for improving the generalization of fuzzy classifiers.
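One standard way to quantify the fuzziness of such membership-grade outputs is an entropy-style measure that is zero for crisp outputs and maximal at grades of 0.5; whether this matches the paper's exact definition is an assumption, but it conveys the idea. A minimal sketch:

```python
import numpy as np

def fuzziness(memberships, eps=1e-12):
    """Average fuzziness of membership grades in [0, 1].

    Zero for crisp outputs (all grades 0 or 1) and maximal at 0.5,
    i.e. for samples near the classification boundary.
    """
    u = np.clip(memberships, eps, 1.0 - eps)
    return float(np.mean(-(u * np.log(u) + (1.0 - u) * np.log(1.0 - u))))

print(fuzziness(np.array([0.9, 0.1])))  # near-crisp: low fuzziness
print(fuzziness(np.array([0.5, 0.5])))  # boundary sample: maximal fuzziness
```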

236 citations


Cites background from "Pattern Recognition and Machine Lea..."

  • ...For a more detailed description of SVM, see [40]....

    [...]

  • ...SVMs select a boundary according to the maximization of margin, which is based on the statistical learning theory [40]....

    [...]
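As the excerpts note, an SVM selects the separating boundary that maximizes the margin. A minimal illustration with scikit-learn on toy data (not the cited study's setup):

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data: two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
w = clf.coef_[0]
print("margin width = 2 / ||w|| =", 2.0 / np.linalg.norm(w))
print("number of support vectors:", len(clf.support_vectors_))
```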

Proceedings Article
01 Jan 2013
TL;DR: This paper proposes a technique that leverages topic-based sentiments from Twitter to help predict the stock market: a continuous Dirichlet Process Mixture model learns the daily topic set, a sentiment time series is derived from each topic's opinion-word distribution, and the stock index is regressed against the Twitter sentiment time series to predict the market.
Abstract: This paper proposes a technique to leverage topic-based sentiments from Twitter to help predict the stock market. We first utilize a continuous Dirichlet Process Mixture model to learn the daily topic set. Then, for each topic we derive its sentiment according to its opinion words distribution to build a sentiment time series. We then regress the stock index and the Twitter sentiment time series to predict the market. Experiments on the real-life S&P 100 Index show that our approach is effective and performs better than existing state-of-the-art non-topic-based methods.
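The final regression step could look like the following minimal sketch; the one-day lag, the synthetic data, and all variable names are assumptions for illustration, not the paper's exact setup:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical daily series: per-topic sentiment scores and an index value.
rng = np.random.default_rng(0)
sentiment = rng.normal(size=(100, 5))  # 100 days x 5 topic sentiments
index = 0.8 * sentiment[:, 0] + rng.normal(scale=0.1, size=100)

# Regress today's index on yesterday's topic sentiments (one-day lag).
X, y = sentiment[:-1], index[1:]
model = LinearRegression().fit(X, y)
print("in-sample R^2:", model.score(X, y))
```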

234 citations


Cites methods from "Pattern Recognition and Machine Lea..."

  • ...We use collapsed Gibbs sampling (Bishop, 2006) for model inference....

    [...]
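For reference, collapsed Gibbs sampling integrates out the mixture parameters and resamples each assignment from its conditional given all other assignments. A minimal sketch for a finite mixture of unigram documents with symmetric Dirichlet priors (a toy finite-K analogue, not the paper's continuous Dirichlet Process Mixture):

```python
import numpy as np

def collapsed_gibbs(docs, V, K, alpha=1.0, beta=0.1, n_iter=50, seed=0):
    """Collapsed Gibbs sampler for a finite mixture of unigram documents.

    docs: list of word-id arrays; V: vocabulary size; K: number of topics.
    Topic proportions and word distributions are integrated out; only each
    document's topic assignment z_d is resampled from its full conditional.
    """
    rng = np.random.default_rng(seed)
    z = rng.integers(K, size=len(docs))
    n_k = np.bincount(z, minlength=K).astype(float)  # documents per topic
    n_kw = np.zeros((K, V))                          # word counts per topic
    for d, doc in enumerate(docs):
        np.add.at(n_kw[z[d]], doc, 1.0)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            # Remove document d's counts from its current topic.
            n_k[z[d]] -= 1.0
            np.add.at(n_kw[z[d]], doc, -1.0)
            # log p(z_d = k | rest): prior term plus the Dirichlet-multinomial
            # predictive probability of the document's words.
            logp = np.log(n_k + alpha)
            for k in range(K):
                total, seen = n_kw[k].sum(), {}
                for i, w in enumerate(doc):
                    logp[k] += np.log(n_kw[k, w] + seen.get(w, 0) + beta)
                    logp[k] -= np.log(total + i + beta * V)
                    seen[w] = seen.get(w, 0) + 1
            p = np.exp(logp - logp.max())
            z[d] = rng.choice(K, p=p / p.sum())
            n_k[z[d]] += 1.0
            np.add.at(n_kw[z[d]], doc, 1.0)
    return z

docs = [np.array([0, 1, 1]), np.array([2, 3, 3]), np.array([0, 1, 0])]
print(collapsed_gibbs(docs, V=4, K=2))  # toy corpus, two recoverable topics
```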