scispace - formally typeset
Search or ask a question
Journal ArticleDOI

LIBSVM: A library for support vector machines

TL;DR: Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.
Abstract: LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users to easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems theoretical convergence multiclass classification probability estimates and parameter selection are discussed in detail.

Content maybe subject to copyright    Report

Citations
More filters
Patent
31 Dec 2010
TL;DR: In this article, a method for detecting covert malware in a computing environment is provided, the method comprising: generating simulated user activity outside of the computing environment; conveying the simulated user activities to an application inside the environment; and determining whether a decoy corresponding to the simulated users' activity has been accessed by an unauthorized entity.
Abstract: Methods, systems, and media for detecting covert malware are provided. In accordance with some embodiments, a method for detecting covert malware in a computing environment is provided, the method comprising: generating simulated user activity outside of the computing environment; conveying the simulated user activity to an application inside the computing environment; and determining whether a decoy corresponding to the simulated user activity has been accessed by an unauthorized entity.

284 citations

Journal ArticleDOI
TL;DR: This work proposes a novel neighborhood similarity measure, which explores the local sample and label distributions and shows that the neighborhood similarity between two samples simultaneously takes into account three characteristics: their distance; the distribution difference of the surrounding samples; and the distribution different of surrounding labels.
Abstract: In the past few years, video annotation has benefited a lot from the progress of machine learning techniques. Recently, graph-based semi-supervised learning has gained much attention in this domain. However, as a crucial factor of these algorithms, the estimation of pairwise similarity has not been sufficiently studied. Generally, the similarity of two samples is estimated based on the Euclidean distance between them. But we will show that the similarity between two samples is not merely related to their distance but also related to the distribution of surrounding samples and labels. It is shown that the traditional distance-based similarity measure may lead to high classification error rates even on several simple datasets. To address this issue, we propose a novel neighborhood similarity measure, which explores the local sample and label distributions. We show that the neighborhood similarity between two samples simultaneously takes into account three characteristics: 1) their distance; 2) the distribution difference of the surrounding samples; and 3) the distribution difference of surrounding labels. Extensive experiments have demonstrated the superiority of neighborhood similarity over the existing distance-based similarity.

284 citations

Journal ArticleDOI
TL;DR: The present study investigates the neural code of facial identity perception with the aim of ascertaining its distributed nature and informational basis, and uses a sequence of multivariate pattern analyses applied to functional magnetic resonance imaging (fMRI) data to map out and characterize a cortical system responsible for individuation.
Abstract: Face individuation is one of the most impressive achievements of our visual system, and yet uncovering the neural mechanisms subserving this feat appears to elude traditional approaches to functional brain data analysis. The present study investigates the neural code of facial identity perception with the aim of ascertaining its distributed nature and informational basis. To this end, we use a sequence of multivariate pattern analyses applied to functional magnetic resonance imaging (fMRI) data. First, we combine information-based brain mapping and dynamic discrimination analysis to locate spatiotemporal patterns that support face classification at the individual level. This analysis reveals a network of fusiform and anterior temporal areas that carry information about facial identity and provides evidence that the fusiform face area responds with distinct patterns of activation to different face identities. Second, we assess the information structure of the network using recursive feature elimination. We find that diagnostic information is distributed evenly among anterior regions of the mapped network and that a right anterior region of the fusiform gyrus plays a central role within the information network mediating face individuation. These findings serve to map out and characterize a cortical system responsible for individuation. More generally, in the context of functionally defined networks, they provide an account of distributed processing grounded in information-based architectures.

284 citations

Journal ArticleDOI
TL;DR: In this article, a new automated approach for recognition of physical progress based on two emerging sources of information: (1) unordered daily construction photo collections, which are currently collected at almost no cost on all construction sites; and (2) building information models (BIMs), which are increasingly turning into binding components of architecture/engineering/construction contracts.
Abstract: Accurate and efficient tracking, analysis and visualization of as-built (actual) status of buildings under construction are critical components of a successful project monitoring. Such information directly supports control decision-making and if automated, can significantly impact management of a project. This paper presents a new automated approach for recognition of physical progress based on two emerging sources of information: (1) unordered daily construction photo collections, which are currently collected at almost no cost on all construction sites; and (2) building information models (BIMs), which are increasingly turning into binding components of architecture/engineering/construction contracts. First, given a set of unordered and uncalibrated site photographs, an approach based on structure-from-motion, multiview stereo, and voxel coloring and labeling algorithms is presented that calibrates cameras, photorealistically reconstructs a dense as-built point cloud model in four dimensions (th...

283 citations

Journal ArticleDOI
TL;DR: Two machine learning approaches for the automatic classification of breast cancer histology images into benign and malignant and into benignand malignant sub-classes are compared.
Abstract: In recent years, the classification of breast cancer has been the topic of interest in the field of Healthcare informatics, because it is the second main cause of cancer-related deaths in women. Breast cancer can be identified using a biopsy where tissue is removed and studied under microscope. The diagnosis is based on the qualification of the histopathologist, who will look for abnormal cells. However, if the histopathologist is not well-trained, this may lead to wrong diagnosis. With the recent advances in image processing and machine learning, there is an interest in attempting to develop a reliable pattern recognition based systems to improve the quality of diagnosis. In this paper, we compare two machine learning approaches for the automatic classification of breast cancer histology images into benign and malignant and into benign and malignant sub-classes. The first approach is based on the extraction of a set of handcrafted features encoded by two coding models (bag of words and locality constrained linear coding) and trained by support vector machines, while the second approach is based on the design of convolutional neural networks. We have also experimentally tested dataset augmentation techniques to enhance the accuracy of the convolutional neural network as well as “handcrafted features + convolutional neural network” and “ convolutional neural network features + classifier” configurations. The results show convolutional neural networks outperformed the handcrafted feature based classifier, where we achieved accuracy between 96.15% and 98.33% for the binary classification and 83.31% and 88.23% for the multi-class classification.

282 citations


Cites methods from "LIBSVM: A library for support vecto..."

  • ...Libsvm [46] library is used to train and build the SVM....

    [...]

References
More filters
Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support- vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensures high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...{1,-1}, C-SVC [Boser et al. 1992; Cortes and Vapnik 1995] solves 4LIBSVM Tools: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools. the following primal optimization problem: l t min 1 w T w +C .i (1) w,b,. 2 i=1 subject to yi(w T f(xi) +b) =1 -.i, .i =0,i =1,...,l, where f(xi)maps xi into a…...

    [...]

01 Jan 1998
TL;DR: Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.
Abstract: A comprehensive look at learning and generalization theory. The statistical theory of learning and generalization concerns the problem of choosing desired functions on the basis of empirical data. Highly applicable to a variety of computer science and robotics fields, this book offers lucid coverage of the theory as a whole. Presenting a method for determining the necessary and sufficient conditions for consistency of learning process, the author covers function estimates from small data pools, applying these estimations to real-life problems, and much more.

26,531 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...Under given parameters C > 0and E> 0, the standard form of support vector regression [Vapnik 1998] is ll tt 1 T min w w + C .i + C .i * w,b,.,. * 2 i=1 i=1 subject to w T f(xi) + b- zi = E + .i, zi - w T f(xi) - b = E + .i * , * .i,.i = 0,i = 1,...,l....

    [...]

  • ...It can be clearly seen that C-SVC and one-class SVM are already in the form of problem (11)....

    [...]

  • ..., l, in two classes, and a vector y ∈ Rl such that yi ∈ {1,−1}, C-SVC (Cortes and Vapnik, 1995; Vapnik, 1998) solves the following primal problem:...

    [...]

  • ...Then, according to the SVM formulation, svm train one calls a corresponding subroutine such as solve c svc for C-SVC and solve nu svc for ....

    [...]

  • ...Note that b of C-SVC and E-SVR plays the same role as -. in one-class SVM, so we de.ne ....

    [...]

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of the classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations


"LIBSVM: A library for support vecto..." refers background in this paper

  • ...It can be clearly seen that C-SVC and one-class SVM are already in the form of problem (11)....

    [...]

  • ...Then, according to the SVM formulation, svm train one calls a corresponding subroutine such as solve c svc for C-SVC and solve nu svc for ....

    [...]

  • ...Note that b of C-SVC and E-SVR plays the same role as -. in one-class SVM, so we de.ne ....

    [...]

  • ...In Section 2, we describe SVM formulations sup­ported in LIBSVM: C-Support Vector Classi.cation (C-SVC), ....

    [...]

  • ...{1,-1}, C-SVC [Boser et al. 1992; Cortes and Vapnik 1995] solves 4LIBSVM Tools: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools. the following primal optimization problem: l t min 1 w T w +C .i (1) w,b,. 2 i=1 subject to yi(w T f(xi) +b) =1 -.i, .i =0,i =1,...,l, where f(xi)maps xi into a higher-dimensional space and C > 0 is the regularization parameter....

    [...]

01 Jan 2008
TL;DR: A simple procedure is proposed, which usually gives reasonable results and is suitable for beginners who are not familiar with SVM.
Abstract: Support vector machine (SVM) is a popular technique for classication. However, beginners who are not familiar with SVM often get unsatisfactory results since they miss some easy but signicant steps. In this guide, we propose a simple procedure, which usually gives reasonable results.

7,069 citations


"LIBSVM: A library for support vecto..." refers methods in this paper

  • ...A Simple Example of Running LIBSVM While detailed instructions of using LIBSVM are available in the README file of the package and the practical guide by Hsu et al. [2003], here we give a simple example....

    [...]

  • ...For instructions of using LIBSVM, see the README file included in the package, the LIBSVM FAQ,3 and the practical guide by Hsu et al. [2003]. LIBSVM supports the following learning tasks....

    [...]

Journal ArticleDOI
TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given and it is shown that for large problems methods by considering all data at once in general need fewer support vectors.
Abstract: Support vector machines (SVMs) were originally designed for binary classification. How to effectively extend it for multiclass classification is still an ongoing research issue. Several methods have been proposed where typically we construct a multiclass classifier by combining several binary classifiers. Some authors also proposed methods that consider all classes at once. As it is computationally more expensive to solve multiclass problems, comparisons of these methods using large-scale problems have not been seriously conducted. Especially for methods solving multiclass SVM in one step, a much larger optimization problem is required so up to now experiments are limited to small data sets. In this paper we give decomposition implementations for two such "all-together" methods. We then compare their performance with three methods based on binary classifications: "one-against-all," "one-against-one," and directed acyclic graph SVM (DAGSVM). Our experiments indicate that the "one-against-one" and DAG methods are more suitable for practical use than the other methods. Results also show that for large problems methods by considering all data at once in general need fewer support vectors.

6,562 citations