Journal ArticleDOI

Support Vector Data Description

01 Jan 2004-Machine Learning (Kluwer Academic Publishers)-Vol. 54, Iss: 1, pp 45-66
TL;DR: The Support Vector Data Description (SVDD) is presented, which obtains a spherically shaped boundary around a dataset and, analogous to the Support Vector Classifier, can be made flexible by using other kernel functions.
Abstract: Data domain description concerns the characterization of a data set. A good description covers all target data but includes no superfluous space. The boundary of a dataset can be used to detect novel data or outliers. We will present the Support Vector Data Description (SVDD), which is inspired by the Support Vector Classifier. It obtains a spherically shaped boundary around a dataset and, analogous to the Support Vector Classifier, it can be made flexible by using other kernel functions. The method is made robust against outliers in the training set and is capable of tightening the description by using negative examples. We show characteristics of the Support Vector Data Descriptions using artificial and real data.
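For reference, the spherical description summarized in the abstract can be written down directly; the following is the standard SVDD optimization problem (a LaTeX sketch: a and R are the sphere's center and radius, the ξ_i are slack variables, and C trades off volume against errors):

    % SVDD primal: the smallest sphere (center a, radius R) around the data,
    % with slack variables xi_i that allow some outliers at cost C.
    \min_{R,\,a,\,\xi}\; R^2 + C \sum_i \xi_i
    \quad \text{s.t.} \quad \|x_i - a\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0

    % Kernelized dual: replacing inner products by a kernel K(x_i, x_j)
    % makes the boundary flexible, analogous to the Support Vector Classifier.
    \max_{\alpha}\; \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)
    \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_i \alpha_i = 1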

Citations
Journal ArticleDOI
TL;DR: This survey tries to provide a structured and comprehensive overview of the research on anomaly detection by grouping existing techniques into different categories based on the underlying approach adopted by each technique.
Abstract: Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and more succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the different directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.

9,627 citations

Journal ArticleDOI
25 Jan 2010-Analyst
TL;DR: The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described, including the application of Support Vector Regression to multivariate calibration and why it is useful when there are outliers and non-linearities.
Abstract: The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described. Methods are illustrated using simulated case studies, and 4 experimental case studies, namely mass spectrometry for studying pollution, near infrared analysis of food, thermal analysis of polymers and UV/visible spectroscopy of polyaromatic hydrocarbons. The basis of SVMs as two-class classifiers is shown with extensive visualisation, including learning machines, kernels and penalty functions. The influence of the penalty error and radial basis function radius on the model is illustrated. Multiclass implementations including one vs. all, one vs. one, fuzzy rules and Directed Acyclic Graph (DAG) trees are described. One-class Support Vector Domain Description (SVDD) is described and contrasted to conventional two- or multi-class classifiers. The use of Support Vector Regression (SVR) is illustrated including its application to multivariate calibration, and why it is useful when there are outliers and non-linearities.
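The multiclass schemes named in the abstract (one vs. all, one vs. one) map directly onto standard library wrappers. Below is a minimal Python sketch using scikit-learn; the dataset and parameter values are illustrative assumptions, not the paper's case studies:

    # Minimal sketch: one-vs-one vs. one-vs-rest multiclass SVMs.
    from sklearn.datasets import load_iris
    from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # One-vs-one trains a binary SVM per pair of classes;
    # one-vs-rest trains one SVM per class against all the others.
    ovo = OneVsOneClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)
    ovr = OneVsRestClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)

    print(ovo.predict(X[:3]), ovr.predict(X[:3]))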

1,899 citations

Journal ArticleDOI
TL;DR: This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation, without employing any distance or density measure, making it fundamentally different from all existing methods.
Abstract: Anomalies are data points that are few and different. As a result of these properties, we show that anomalies are susceptible to a mechanism called isolation. This article proposes a method called Isolation Forest (iForest), which detects anomalies purely based on the concept of isolation without employing any distance or density measure, fundamentally different from all existing methods. As a result, iForest is able to exploit subsampling (i) to achieve a low linear time-complexity and a small memory requirement and (ii) to deal with the effects of swamping and masking effectively. Our empirical evaluation shows that iForest outperforms ORCA, one-class SVM, LOF and Random Forests in terms of AUC and processing time, and that it is robust against masking and swamping effects. iForest also works well in high-dimensional problems containing a large number of irrelevant attributes, and when anomalies are not available in the training sample.
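As a concrete illustration of the isolation idea, here is a hedged Python sketch using scikit-learn's IsolationForest; the data and parameter values (100 trees, subsample size 256) are assumptions for illustration, not the paper's experimental setup:

    # Isolation-based anomaly detection: no distance or density computation.
    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(0)
    X_train = rng.normal(size=(500, 2))        # mostly normal points
    X_test = rng.uniform(-6, 6, size=(10, 2))  # candidate anomalies

    # Small per-tree subsamples keep time and memory low.
    forest = IsolationForest(n_estimators=100, max_samples=256,
                             random_state=0).fit(X_train)

    labels = forest.predict(X_test)        # +1 = inlier, -1 = anomaly
    scores = forest.score_samples(X_test)  # lower = more anomalous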

1,266 citations

Proceedings ArticleDOI
24 Aug 2008
TL;DR: This paper shows that models trained using the new methods perform better than the current state-of-the-art biased SVM method for learning from positive and unlabeled examples, and applies them to solve a real-world problem: identifying protein records that should be included in an incomplete specialized molecular biology database.
Abstract: The input to an algorithm that learns a binary classifier normally consists of two sets of examples, where one set consists of positive examples of the concept to be learned, and the other set consists of negative examples. However, it is often the case that the available training data are an incomplete set of positive examples and a set of unlabeled examples, some of which are positive and some of which are negative. The problem solved in this paper is how to learn a standard binary classifier given a nontraditional training set of this nature. Under the assumption that the labeled examples are selected randomly from the positive examples, we show that a classifier trained on positive and unlabeled examples predicts probabilities that differ by only a constant factor from the true conditional probabilities of being positive. We show how to use this result in two different ways to learn a classifier from a nontraditional training set. We then apply these two new methods to solve a real-world problem: identifying protein records that should be included in an incomplete specialized molecular biology database. Our experiments in this domain show that models trained using the new methods perform better than the current state-of-the-art biased SVM method for learning from positive and unlabeled examples.
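The constant-factor result lends itself to a short sketch: train any probabilistic classifier to separate labeled positives from unlabeled examples, estimate the constant on held-out labeled positives, and divide. The Python below is a hedged illustration; the choice of logistic regression and the split sizes are assumptions, not the paper's setup:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    def fit_pu(X, s):
        """X: features; s[i] = 1 if example i is a labeled positive, else 0."""
        X_tr, X_va, s_tr, s_va = train_test_split(X, s, test_size=0.2,
                                                  random_state=0)
        g = LogisticRegression(max_iter=1000).fit(X_tr, s_tr)
        # c = p(labeled | positive), estimated as the mean predicted
        # probability over held-out labeled positives.
        c = g.predict_proba(X_va[s_va == 1])[:, 1].mean()
        return g, c

    def predict_positive_proba(g, c, X):
        # The true conditional probability differs by the constant factor c:
        # p(y=1 | x) = p(s=1 | x) / c, clipped into [0, 1].
        return np.clip(g.predict_proba(X)[:, 1] / c, 0.0, 1.0)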

1,007 citations

Journal ArticleDOI
TL;DR: A hybrid model in which an unsupervised DBN is trained to extract generic underlying features and a one-class SVM is trained on the features learned by the DBN; it delivers accuracy comparable to a deep autoencoder while being scalable and computationally efficient.
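The hybrid pipeline is straightforward to sketch: unsupervised feature learning followed by a one-class SVM on the learned features. In the hedged Python below, scikit-learn's BernoulliRBM stands in for a single DBN layer; that substitution and all parameter values are assumptions, not the paper's architecture:

    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.svm import OneClassSVM

    X = np.random.RandomState(0).rand(1000, 64)   # placeholder training data

    scaler = MinMaxScaler().fit(X)                # RBM expects values in [0, 1]
    rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20,
                       random_state=0).fit(scaler.transform(X))

    features = rbm.transform(scaler.transform(X))  # unsupervised features
    ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(features)

    # Score new data: +1 = inside the description, -1 = anomaly.
    is_normal = ocsvm.predict(rbm.transform(scaler.transform(X[:5])))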

876 citations


Cites methods from "Support Vector Data Description"

  • ...For the explanation below, two of the most common 1SVM algorithms are chosen, a hypersphere-based 1SVM (known as Support Vector Data Description (SVDD)) by Tax and Duin [8], and a plane-based 1SVM (PSVM) by Schölkopf et al. [31], see Fig....

  • ...The parameters of SVM-based methods are selected via a grid-search, with ν ∈ (0, 1) and σ ∈ (1, 1) for SVDD [8], and γ ∈ (2^(−15); 2^(−13); ....

  • ...This is extended from the hypersphere-based one-class SVM approach proposed by Tax and Duin [8]....

  • ...Further, Tax and Duin have shown that the hyperplane-based one-class SVM becomes a special case of the (equivalent) hypersphere-based scheme when used with a radial basis kernel....

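A short derivation makes the equivalence noted in the last excerpt concrete (a sketch using the SVDD dual given earlier; s is the Gaussian kernel width):

    % For the Gaussian kernel the self-similarity is constant:
    K(x, x) = \exp\left(-\|x - x\|^2 / s^2\right) = 1
    \;\Rightarrow\;
    \sum_i \alpha_i K(x_i, x_i) = \sum_i \alpha_i = 1 .

    % The linear term of the SVDD dual is then a constant, so maximizing
    % the dual is equivalent to
    \min_{\alpha}\; \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)
    \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_i \alpha_i = 1 ,
    % which is the dual of the plane-based one-class SVM.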

References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory?
Abstract: Setting of the learning problem, consistency of learning processes, bounds on the rate of convergence of learning processes, controlling the generalization ability of learning processes, constructing learning algorithms, and what is important in learning theory?

40,147 citations


"Support Vector Data Description" refers methods in this paper

  • ...This is identical to the approach which is used in Schölkopf, Burges, and Vapnik (1995) to estimate the VC-dimension of a classifier (which is bounded by the diameter of the smallest sphere enclosing the data)....

Book
Vladimir Vapnik
01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of the learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

26,531 citations


"Support Vector Data Description" refers background or methods in this paper

  • ...Several kernel functions have been proposed for the Support Vector Classifier (Vapnik, 1998; Smola, Schölkopf, & Müller, 1998)....

  • ...In contrast to the Support Vector Classifier, the Support Vector Data Description using a polynomial kernel suffers from the large influence of the norms of the object vectors, but it shows promising results for the Gaussian kernel....

  • ...For that the notion of essential support vectors has to be introduced (Vapnik, 1998)....

  • ...Vapnik argued that in order to solve a problem, one should not try to solve a more general problem as an intermediate step (Vapnik, 1998)....

  • ...The classifiers are a Gaussian-density based linear classifier (called Bayes), a Parzen classifier and the Support Vector Classifier with polynomial kernel, degree 3....

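The norm remark in these excerpts follows directly from the kernel definitions themselves (a sketch; d is the polynomial degree and s the Gaussian width):

    % Polynomial kernel: the self-similarity grows with the object's norm,
    % so objects with large norms dominate the description.
    K_{\mathrm{poly}}(x, y) = (x \cdot y)^d, \qquad
    K_{\mathrm{poly}}(x, x) = \|x\|^{2d} .

    % Gaussian kernel: the self-similarity is constant, independent of norms.
    K_{\mathrm{gauss}}(x, y) = \exp\left(-\|x - y\|^2 / s^2\right), \qquad
    K_{\mathrm{gauss}}(x, x) = 1 .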

Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Book ChapterDOI
TL;DR: The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue.
Abstract: Publisher Summary: This chapter provides an account of different neural network architectures for pattern recognition. A neural network consists of several simple processing elements called neurons. Each neuron is connected to some other neurons and possibly to the input nodes. Neural networks provide a simple computing paradigm to perform complex recognition tasks in real time. The chapter categorizes neural networks into three types: single-layer networks, multilayer feedforward networks, and feedback networks. It discusses the gradient descent and the relaxation method as the two underlying mathematical themes for deriving learning algorithms. A lot of research activity is centered on learning algorithms because of their fundamental importance in neural networks. The chapter discusses two important directions of research to improve learning algorithms: the dynamic node generation, which is used by the cascade correlation algorithm; and designing learning algorithms where the choice of parameters is not an issue. It closes with a discussion of performance and implementation issues.

13,033 citations


"Support Vector Data Description" refers background or methods in this paper

  • ...Neural networks, for instance, can be trained to estimate posterior probabilities (Richard & Lippmann, 1991; Bishop, 1995; Ripley, 1996) and tend to give high confidence outputs for objects which are remote from the training set....

  • ...The third method is a Mixture of Gaussians, optimized using EM (Bishop, 1995)....

  • ...By applying Leave-One-Out estimation (Vapnik, 1998; Bishop, 1995), it can be shown that the number of support vectors is an indication of the expected error made on the target set....

  • ...In classification or regression problems a more advanced Bayesian approach can be used for detecting outliers (Bishop, 1995; MacKay, 1992; Roberts & Penny, 1996)....

  • ...Keywords: outlier detection, novelty detection, one-class classification, support vector classifier, support vector data description...

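The leave-one-out estimate mentioned in the excerpts rests on a standard observation (a sketch): removing an object that is not a support vector leaves the description unchanged, so only support vectors can contribute leave-one-out errors. With #SV support vectors among N training objects:

    % Leave-one-out bound on the expected error on the target class:
    \mathbb{E}\left[P(\text{error})\right] \;\le\; \frac{\#\mathrm{SV}}{N} .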