Journal Article

Pattern Recognition and Machine Learning

01 Aug 2007-Technometrics (Taylor & Francis)-Vol. 49, Iss: 3, pp 366-366
TL;DR: This book covers a broad range of topics for regular factorial designs, presents all of the material in a very mathematical fashion, and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.
Citations
Journal Article
TL;DR: The integrated biomarker model shows that data associated with both emotional and reward processing are essential for accurate classification of depression, and that integrating data across symptom-related neural processes can provide a highly accurate classification algorithm.
Abstract: Context: Although psychiatric disorders are, to date, diagnosed on the basis of behavioral symptoms and course of illness, the interest in neurobiological markers of psychiatric disorders has grown substantially in recent years. However, current classification approaches are mainly based on data from a single biomarker, making it difficult to predict disorders characterized by complex patterns of symptoms. Objective: To integrate neuroimaging data associated with multiple symptom-related neural processes and demonstrate their utility in the context of depression by deriving a predictive model of brain activation. Design: Two groups of participants underwent functional magnetic resonance imaging during 3 tasks probing neural processes relevant to depression. Setting: Participants were recruited from the local population by use of advertisements; participants with depression were inpatients from the Department of Psychiatry, Psychosomatics, and Psychotherapy at the University of Wuerzburg, Wuerzburg, Germany. Participants: We matched a sample of 30 medicated, unselected patients with depression by age, sex, smoking status, and handedness with 30 healthy volunteers. Main Outcome Measure: Accuracy of single-subject classification based on whole-brain patterns of neural responses from all 3 tasks. Results: Integrating data associated with emotional and affective processing substantially increases classification accuracy compared with single classifiers. The predictive model identifies a combination of neural responses to neutral faces, large rewards, and safety cues as nonredundant predictors of depression. Regions of the brain associated with overall classification comprise a complex pattern of areas involved in emotional processing and the analysis of stimulus features. Conclusions: Our method of integrating neuroimaging data associated with multiple, symptom-related neural processes can provide a highly accurate algorithm for classification. The integrated biomarker model shows that data associated with both emotional and reward processing are essential for a highly accurate classification of depression. In the future, large-scale studies will need to be conducted to determine the practical applicability of our algorithm as a biomarker-based diagnostic aid.
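The integration step can be illustrated with a deliberately generic sketch: train one classifier per task and average cross-validated decision scores. The abstract does not specify the authors' combination rule, so the synthetic data, model choice, and fusion rule below are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict

# Synthetic stand-ins: one pattern matrix per fMRI task
# (rows = subjects, columns = voxel-wise response estimates)
rng = np.random.default_rng(0)
tasks = [rng.standard_normal((60, 500)) for _ in range(3)]
y = np.array([0] * 30 + [1] * 30)  # 30 patients, 30 matched controls

# One linear classifier per task; integrate by averaging the
# cross-validated decision scores across the three tasks
scores = [
    cross_val_predict(SVC(kernel="linear"), X, y, cv=5,
                      method="decision_function")
    for X in tasks
]
combined = (np.mean(scores, axis=0) > 0).astype(int)
print("integrated accuracy:", (combined == y).mean())
```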

148 citations


Cites methods from "Pattern Recognition and Machine Learning"

  • ...Using these regularities, a computer can classify data into different categories.(5) In the context of neuroimaging, brain images are treated as spatial patterns and pattern-recognition approaches are used to identify statistical properties of the data that discriminate between...


Proceedings Article
12 Nov 2012
TL;DR: Nonlinear Kalman filter and Rauch-Tung-Striebel smoother type recursive estimators for nonlinear discrete-time state space models with multivariate Student's t-distributed measurement noise are presented.
Abstract: Nonlinear Kalman filter and Rauch-Tung-Striebel smoother type recursive estimators for nonlinear discrete-time state space models with multivariate Student's t-distributed measurement noise are presented. The methods approximate the posterior state at each time step using the variational Bayes method. The nonlinearities in the dynamic and measurement models are handled using the nonlinear Gaussian filtering and smoothing approach, which encompasses many known nonlinear Kalman-type filters. The method is compared to alternative methods in a computer simulation.
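For intuition, here is a minimal numpy sketch of a single filter step in the linear special case, using the standard device of writing Student's t measurement noise as a Gaussian whose precision is scaled by a Gamma-distributed latent variable; the paper's nonlinear Gaussian-filtering layer is omitted, and the function name and defaults are assumptions.

```python
import numpy as np

def vb_t_kalman_step(m, P, y, A, Q, H, R, nu, n_iter=10):
    """One predict/update step of a variational-Bayes Kalman filter
    with Student's t measurement noise (linear special case)."""
    # Prediction: standard Kalman time update
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q

    d = y.shape[0]
    E_lam = 1.0  # E[lambda], the latent scaling of the noise precision
    for _ in range(n_iter):
        # q(x) update: Kalman measurement update with covariance R / E[lambda]
        S = H @ P_pred @ H.T + R / E_lam
        K = P_pred @ H.T @ np.linalg.inv(S)
        m_upd = m_pred + K @ (y - H @ m_pred)
        P_upd = P_pred - K @ S @ K.T
        # q(lambda) update: Gamma posterior over the precision scale
        e = y - H @ m_upd
        quad = e @ np.linalg.solve(R, e) + np.trace(
            np.linalg.solve(R, H @ P_upd @ H.T))
        E_lam = (nu + d) / (nu + quad)  # mean of Gamma((nu+d)/2, (nu+quad)/2)
    return m_upd, P_upd
```

Iterating the two updates at each time step downweights outlying measurements automatically: a large residual drives E[lambda] below 1, which inflates the effective measurement covariance.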

148 citations

Journal Article
TL;DR: Support vector regression is used as a learning method for anomaly detection from water flow and pressure time series data; the methodology, whose robustness derives from the training error function, is applied to a case study.
Abstract: The sampling frequency and quantity of time series data collected from water distribution systems has been increasing in recent years, giving rise to the potential for improving system knowledge if suitable automated techniques can be applied, in particular, machine learning. Novelty (or anomaly) detection refers to the automatic identification of novel or abnormal patterns embedded in large amounts of “normal” data. When dealing with time series data (transformed into vectors), this means abnormal events embedded amongst many normal time series points. The support vector machine is a data-driven statistical technique that has been developed as a tool for classification and regression. The key features include statistical robustness with respect to non-Gaussian errors and outliers, the selection of the decision boundary in a principled way, and the introduction of nonlinearity in the feature space without explicitly requiring a nonlinear algorithm by means of kernel functions. In this research, support vector regression is used as a learning method for anomaly detection from water flow and pressure time series data. No use is made of past event histories collected through other information sources. The support vector regression methodology, whose robustness derives from the training error function, is applied to a case study.
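A minimal sketch of the idea (not the authors' exact formulation): fit a one-step-ahead support vector regressor on lagged windows of a flow or pressure series and flag points whose residual is an outlier. The window length, kernel settings, and threshold below are assumptions.

```python
import numpy as np
from sklearn.svm import SVR

def svr_anomaly_flags(series, lag=12, threshold=3.0):
    """Flag points of a 1-D series whose one-step-ahead SVR residual
    exceeds `threshold` robust standard deviations."""
    series = np.asarray(series, dtype=float)
    # Lagged windows as features, the next point as the target
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X, y)
    resid = y - model.predict(X)
    # Robust scale via the median absolute deviation, so the threshold
    # itself is not inflated by the anomalies being hunted
    mad = np.median(np.abs(resid - np.median(resid)))
    scale = 1.4826 * mad if mad > 0 else resid.std()
    return np.abs(resid) > threshold * scale
```

In-sample residuals are used here for brevity; a deployed detector would train on historical data and score incoming points.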

148 citations

Journal Article
TL;DR: In this article, the authors develop strategies for mean-field variational Bayes approximate inference for Bayesian hierarchical models containing elaborate distributions, such as Asymmetric Laplace, Skew Normal and Generalized Extreme Value distributions.
Abstract: We develop strategies for mean field variational Bayes approximate inference for Bayesian hierarchical models containing elaborate distributions. We loosely define elaborate distributions to be those having more complicated forms compared with common distributions such as those in the Normal and Gamma families. Examples are Asymmetric Laplace, Skew Normal and Generalized Extreme Value distributions. Such models suffer from the difficulty that the parameter updates do not admit closed form solutions. We circumvent this problem through a combination of (a) specially tailored auxiliary variables, (b) univariate quadrature schemes and (c) finite mixture approximations of troublesome densities.

148 citations


Cites background or methods from "Pattern Recognition and Machine Learning"

  • ...Summaries may be found in, for example, Chapter 10 of Bishop (2006) and Ormerod and Wand (2010)....


  • ..., Bishop 2006, Section 10.2.5), that different product restrictions lead to identical MFVB approximations. In keeping with the notational conventions declared in Section 2.2 we will, from now on, suppress the subscripts on the q density functions. The MFVB solutions can be shown to satisfy q(θ_i) ∝ exp{E_{q(θ_{−i})} log p(θ_i | x, θ_{−i})}, 1 ≤ i ≤ 6, (7) where θ_{−i} denotes the set {θ_1, ..., θ_6} with θ_i excluded. Note that the expectation operator E_{q(θ_{−i})} depends on the particular product density form being assumed. The optimal parameters in these q density functions can be determined by an iterative coordinate ascent scheme induced by (7) aimed at maximizing the lower bound on the marginal log-likelihood: log p(x; q) ≡ E_{q(θ)}{log p(x, θ) − log q(θ)} ≤ log p(x). If it is assumed that each iteration entails unique maximization of log p(x; q) with respect to the current θ_i, and that the search is restricted to a compact set, then convergence to a local maximizer of log p(x; q) is guaranteed (Luenberger and Ye 2008, p. 253). Successive values of log p(x; q) can be used to monitor convergence. At convergence q(θ_i), 1 ≤ i ≤ 6, and log p(x; q) are, respectively, the minimum Kullback-Leibler approximations to the posterior densities p(θ_i | x), 1 ≤ i ≤ 6, and the marginal log-likelihood log p(x). The extension to general Bayesian models with arbitrary parameter vectors and latent variables is straightforward. Summaries may be found in, for example, Chapter 10 of Bishop (2006) and Ormerod and Wand (2010). As described in these references, directed acyclic graph (DAG) representations of Bayesian hierarchical models are very...


  • ...It is also possible, due to the notion of induced factorizations (e.g., Bishop 2006, Section 10.2.5), that different product restrictions lead to identical MFVB approximations....


  • ...This result is very well-known and forms the basis of normal mixture fitting via the Expectation-Maximization algorithm (e.g., Bishop 2006)....

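As a concrete illustration of the coordinate-ascent scheme quoted above, here is a minimal mean-field VB routine for the textbook case of a Normal model with unknown mean and precision, the worked example in Chapter 10 of Bishop (2006); the conjugate priors and defaults are illustrative, not the elaborate-distribution models of the paper.

```python
import numpy as np

def mfvb_normal(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, n_iter=50):
    """Coordinate-ascent MFVB for x_i ~ N(mu, 1/tau) with priors
    mu | tau ~ N(mu0, 1/(lam0*tau)) and tau ~ Gamma(a0, b0), under
    the factorization q(mu, tau) = q(mu) q(tau)."""
    x = np.asarray(x, dtype=float)
    N, xbar = len(x), x.mean()
    E_tau = a0 / b0              # initial E_q[tau]
    aN = a0 + (N + 1) / 2        # Gamma shape: fixed across iterations
    for _ in range(n_iter):
        # q(mu) = Normal(muN, 1/lamN)
        muN = (lam0 * mu0 + N * xbar) / (lam0 + N)
        lamN = (lam0 + N) * E_tau
        # q(tau) = Gamma(aN, bN), using second moments under q(mu)
        E_sq = np.sum((x - muN) ** 2) + N / lamN
        bN = b0 + 0.5 * (E_sq + lam0 * ((muN - mu0) ** 2 + 1 / lamN))
        E_tau = aN / bN
    return muN, lamN, aN, bN
```

Successive values of the lower bound log p(x; q), not computed in this sketch, would be the usual convergence monitor for the loop.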

Dissertation
24 Sep 2009
TL;DR: This thesis describes a method called “affinity propagation” that simultaneously considers all data points as potential exemplars, exchanging real-valued messages between data points until a high-quality set of exemplars and corresponding clusters gradually emerges.
Abstract: Affinity Propagation: Clustering Data by Passing Messages. Delbert Dueck, Doctor of Philosophy, Graduate Department of Electrical & Computer Engineering, University of Toronto, 2009. Clustering data by identifying a subset of representative examples is important for detecting patterns in data and in processing sensory signals. Such “exemplars” can be found by randomly choosing an initial subset of data points as exemplars and then iteratively refining it, but this works well only if that initial choice is close to a good solution. This thesis describes a method called “affinity propagation” that simultaneously considers all data points as potential exemplars, exchanging real-valued messages between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. Affinity propagation takes as input a set of pairwise similarities between data points and finds clusters on the basis of maximizing the total similarity between data points and their exemplars. Similarity can be simply defined as negative squared Euclidean distance for compatibility with other algorithms, or it can incorporate richer domain-specific models (e.g., translation-invariant distances for comparing images). Affinity propagation’s computational and memory requirements scale linearly with the number of similarities input; for non-sparse problems where all possible similarities are computed, these requirements scale quadratically with the number of data points. Affinity propagation is demonstrated on several applications from areas such as computer vision and bioinformatics, and it typically finds better clustering solutions than other methods in less time.
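As a usage-level sketch, scikit-learn ships an implementation of the algorithm; the toy data and settings below are arbitrary.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy 2-D data: three loose blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(30, 2)) for c in (0.0, 3.0, 6.0)])

# scikit-learn's default similarity is negative squared Euclidean
# distance, the baseline choice described in the thesis
ap = AffinityPropagation(random_state=0).fit(X)
print("exemplar indices:", ap.cluster_centers_indices_)
print("first ten labels:", ap.labels_[:10])
```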

148 citations