Showing papers on "Support vector machine" published in 2007


Proceedings Article
Rajat Raina, Alexis Battle, Honglak Lee, Benjamin Packer, Andrew Y. Ng
20 Jun 2007
TL;DR: An approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data to form a succinct input representation and significantly improve classification performance.
Abstract: We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.

1,731 citations


Journal Article
TL;DR: A binary SVM classifier that determines two nonparallel planes by solving two related SVM-type problems, each of which is smaller than in a conventional SVM, and that shows good generalization on several benchmark data sets.
Abstract: We propose twin SVM, a binary SVM classifier that determines two nonparallel planes by solving two related SVM-type problems, each of which is smaller than in a conventional SVM. The twin SVM formulation is in the spirit of proximal SVMs via generalized eigenvalues. On several benchmark data sets, twin SVM is not only fast, but shows good generalization. Twin SVM is also useful for automatically discovering two-dimensional projections of the data.

1,501 citations


01 Jan 2007
TL;DR: An attempt has been made to review the existing theory, methods, recent developments and scopes of Support Vector Regression.
Abstract: Instead of minimizing the observed training error, Support Vector Regression (SVR) attempts to minimize the generalization error bound so as to achieve generalized performance. The idea of SVR is based on the computation of a linear regression function in a high dimensional feature space where the input data are mapped via a nonlinear function. SVR has been applied in various fields - time series and financial (noisy and risky) prediction, approximation of complex engineering analyses, convex quadratic programming and choices of loss functions, etc. In this paper, an attempt has been made to review the existing theory, methods, recent developments and scopes of SVR.

1,467 citations
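
The SVR review above mentions "choices of loss functions" without naming one. A common choice in the SVR literature is the ε-insensitive loss, which charges nothing for residuals that fall inside a tube of half-width ε. The sketch below is only an illustration of that loss; the function name and the default `epsilon` are assumptions, not taken from the paper.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss (illustrative helper, not from the paper).

    Residuals with |y_true - y_pred| <= epsilon cost nothing; larger
    residuals are charged linearly for the part that exceeds epsilon.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for target, prediction in zip(y_true, y_pred):
        total += max(0.0, abs(target - prediction) - epsilon)
    return total / len(y_true)
```

Widening `epsilon` makes the loss ignore more of the noise at the price of a coarser fit, which is one of the trade-offs such a review would be expected to discuss.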


Proceedings Article
26 Dec 2007
TL;DR: It is shown that selecting the ROI adds about 5% to the performance and, together with the other improvements, the result is about a 10% improvement over the state of the art for Caltech-256.
Abstract: We explore the problem of classifying images by the object categories they contain in the case of a large number of object categories. To this end we combine three ingredients: (i) shape and appearance representations that support spatial pyramid matching over a region of interest. This generalizes the representation of Lazebnik et al., (2006) from an image to a region of interest (ROI), and from appearance (visual words) alone to appearance and local shape (edge distributions); (ii) automatic selection of the regions of interest in training. This provides a method of inhibiting background clutter and adding invariance to the object instance 's position; and (iii) the use of random forests (and random ferns) as a multi-way classifier. The advantage of such classifiers (over multi-way SVM for example) is the ease of training and testing. Results are reported for classification of the Caltech-101 and Caltech-256 data sets. We compare the performance of the random forest/ferns classifier with a benchmark multi-way SVM classifier. It is shown that selecting the ROI adds about 5% to the performance and, together with the other improvements, the result is about a 10% improvement over the state of the art for Caltech-256.

1,401 citations


Journal Article
TL;DR: This paper presents a survey of machine condition monitoring and fault diagnosis using support vector machine (SVM), and attempts to summarize and review the recent research and developments of SVM in machine condition monitoring and diagnosis.

1,228 citations


Proceedings Article
20 Jun 2007
TL;DR: A simple and effective iterative algorithm for solving the optimization problem cast by Support Vector Machines that alternates between stochastic gradient descent steps and projection steps that can seamlessly be adapted to employ non-linear kernels while working solely on the primal objective function.
Abstract: We describe and analyze a simple and effective iterative algorithm for solving the optimization problem cast by Support Vector Machines (SVM). Our method alternates between stochastic gradient descent steps and projection steps. We prove that the number of iterations required to obtain a solution of accuracy ε is O(1/ε). In contrast, previous analyses of stochastic gradient descent methods require Ω(1/ε²) iterations. As in previously devised SVM solvers, the number of iterations also scales linearly with 1/λ, where λ is the regularization parameter of SVM. For a linear kernel, the total run-time of our method is O(d/(λε)), where d is a bound on the number of non-zero features in each example. Since the run-time does not depend directly on the size of the training set, the resulting algorithm is especially suited for learning from large datasets. Our approach can seamlessly be adapted to employ non-linear kernels while working solely on the primal objective function. We demonstrate the efficiency and applicability of our approach by conducting experiments on large text classification problems, comparing our solver to existing state-of-the-art SVM solvers. For example, it takes less than 5 seconds for our solver to converge when solving a text classification problem from Reuters Corpus Volume 1 (RCV1) with 800,000 training examples.

985 citations
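
The alternation between stochastic gradient steps and projection steps described in the abstract above can be sketched as follows. This is a minimal single-example variant for a linear kernel without a bias term; the step size 1/(λt), the projection radius 1/√λ, the function name, and the defaults are assumptions made for illustration and may differ from the authors' implementation.

```python
import math
import random


def sgd_projection_svm(X, y, lam=0.1, n_iters=2000, seed=0):
    """Linear SVM via stochastic sub-gradient steps plus projection (sketch).

    X is a list of feature lists, y a list of labels in {-1, +1}.
    Returns the weight vector w (no bias term in this simplified version).
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    radius = 1.0 / math.sqrt(lam)
    for t in range(1, n_iters + 1):
        i = rng.randrange(len(X))          # pick one training example at random
        eta = 1.0 / (lam * t)              # decaying step size
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Sub-gradient step on the regularizer: shrink w.
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Sub-gradient step on the hinge loss, only if the margin is violated.
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
        # Projection step: keep w inside the ball of radius 1/sqrt(lam).
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm > radius:
            w = [wj * radius / norm for wj in w]
    return w
```

Each iteration touches a single example, so the cost per step depends on the number of non-zero features rather than on the size of the training set, which matches the run-time argument made in the abstract.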


Book Chapter
17 Sep 2007
TL;DR: This paper proposes an ensemble method for multilabel classification that aims to take into account label correlations using single-label classifiers that are applied on subtasks with manageable number of labels and adequate number of examples per label.
Abstract: This paper proposes an ensemble method for multilabel classification. The RAndom k-labELsets (RAKEL) algorithm constructs each member of the ensemble by considering a small random subset of labels and learning a single-label classifier for the prediction of each element in the powerset of this subset. In this way, the proposed algorithm aims to take into account label correlations using single-label classifiers that are applied on subtasks with manageable number of labels and adequate number of examples per label. Experimental results on common multilabel domains involving protein, document and scene classification show that better performance can be achieved compared to popular multilabel classification approaches.

876 citations
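
The random k-labelsets construction in the abstract above can be sketched in a few lines. The version below is a simplified illustration: it uses a toy 1-nearest-neighbour learner as the single-label classifier, may draw the same labelset twice, and combines predictions by per-label majority vote with a 0.5 threshold. All function names, the base learner, and the threshold are assumptions, not the authors' code.

```python
import random


def nearest_neighbor_fit(X, classes):
    """Toy single-label learner (1-NN); stands in for any base classifier."""
    data = list(zip(X, classes))

    def predict(x):
        def sq_dist(pair):
            return sum((a - b) ** 2 for a, b in zip(pair[0], x))
        return min(data, key=sq_dist)[1]

    return predict


def labelset_ensemble_fit(X, Y, n_labels, k=2, n_models=3, seed=0,
                          fit=nearest_neighbor_fit):
    """Train one label-powerset classifier per random k-labelset.

    Y is a list of sets of label indices, one set per training example.
    """
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_models):
        labelset = tuple(sorted(rng.sample(range(n_labels), k)))
        # Each distinct subset of the labelset becomes one single-label class.
        classes = [frozenset(labels & set(labelset)) for labels in Y]
        ensemble.append((labelset, fit(X, classes)))
    return ensemble


def labelset_ensemble_predict(ensemble, x, n_labels, threshold=0.5):
    """Average the per-label votes of all models whose labelset covers the label."""
    votes = [0] * n_labels
    counts = [0] * n_labels
    for labelset, predict in ensemble:
        predicted = predict(x)
        for label in labelset:
            counts[label] += 1
            if label in predicted:
                votes[label] += 1
    return {label for label in range(n_labels)
            if counts[label] and votes[label] / counts[label] > threshold}
```

Because each member only sees k labels, its powerset has at most 2^k classes, which is what keeps the subtasks manageable.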


Journal Article
TL;DR: It is pointed out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility.
Abstract: Most literature on support vector machines (SVMs) concentrates on the dual optimization problem. In this letter, we point out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view, new families of algorithms for large-scale SVM training can be investigated.

837 citations


Journal Article
TL;DR: A simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and produce sparse predictive models for text data; applied to document classification problems, it yields compact models at least as effective as support vector machine classifiers.
Abstract: Logistic regression analysis of high-dimensional data, such as natural language text, poses computational and statistical challenges. Maximum likelihood estimation often fails in these applications. We present a simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and produces sparse predictive models for text data. We apply this approach to a range of document classification problems and show that it produces compact predictive models at least as effective as those produced by support vector machine classifiers or ridge logistic regression combined with feature selection. We describe our model fitting algorithm, our open source implementations (BBR and BMR), and experimental results.

829 citations


Journal Article
TL;DR: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed and two segmentation methods are considered.
Abstract: In the framework of computer-aided diagnosis of eye diseases, retinal vessel segmentation based on line operators is proposed. A line detector, previously used in mammography, is applied to the green channel of the retinal image. It is based on the evaluation of the average grey level along lines of fixed length passing through the target pixel at different orientations. Two segmentation methods are considered. The first uses the basic line detector whose response is thresholded to obtain unsupervised pixel classification. As a further development, we employ two orthogonal line detectors along with the grey level of the target pixel to construct a feature vector for supervised classification using a support vector machine. The effectiveness of both methods is demonstrated through receiver operating characteristic analysis on two publicly available databases of color fundus images.

819 citations


Journal Article
TL;DR: Three strategies are used to construct hybrid SVM-based credit scoring models, and experimental results show that SVM is a promising addition to the existing data mining methods.
Abstract: The credit card industry has been growing rapidly recently, and thus huge numbers of consumers' credit data are collected by the credit department of the bank. The credit scoring manager often evaluates the consumer's credit with intuitive experience. However, with the support of the credit classification model, the manager can accurately evaluate the applicant's credit score. Support Vector Machine (SVM) classification is currently an active research area and successfully solves classification problems in many domains. This study used three strategies to construct the hybrid SVM-based credit scoring models to evaluate the applicant's credit score from the applicant's input features. Two credit datasets in UCI database are selected as the experimental data to demonstrate the accuracy of the SVM classifier. Compared with neural networks, genetic programming, and decision tree classifiers, the SVM classifier achieved an identical classificatory accuracy with relatively few input features. Additionally, combining genetic algorithms with SVM classifier, the proposed hybrid GA-SVM strategy can simultaneously perform feature selection task and model parameters optimization. Experimental results show that SVM is a promising addition to the existing data mining methods.

Proceedings Article
23 Jul 2007
TL;DR: This work presents a general SVM learning algorithm that efficiently finds a globally optimal solution to a straightforward relaxation of MAP, and shows its method to produce statistically significant improvements in MAP scores.
Abstract: Machine learning is commonly used to improve ranked retrieval systems. Due to computational difficulties, few learning techniques have been developed to directly optimize for mean average precision (MAP), despite its widespread use in evaluating such systems. Existing approaches optimizing MAP either do not find a globally optimal solution, or are computationally expensive. In contrast, we present a general SVM learning algorithm that efficiently finds a globally optimal solution to a straightforward relaxation of MAP. We evaluate our approach using the TREC 9 and TREC 10 Web Track corpora (WT10g), comparing against SVMs optimized for accuracy and ROCArea. In most cases we show our method to produce statistically significant improvements in MAP scores.
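
For readers unfamiliar with the metric optimized in the entry above, mean average precision can be computed as in the sketch below. It assumes each query is given as a list of 0/1 relevance flags in ranked order and that every relevant document appears somewhere in that list; the function names are illustrative.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags, best-ranked document first.
    Assumes all relevant documents for the query appear in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    if not queries:
        raise ValueError("need at least one query")
    return sum(average_precision(q) for q in queries) / len(queries)
```

The metric depends on the whole ranking rather than on individual predictions, which is why it does not decompose into a simple per-example loss.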

Proceedings Article
29 Sep 2007
TL;DR: This paper proposes Adaptive Support Vector Machines (A-SVMs) as a general method to adapt one or more existing classifiers of any type to the new dataset and outperforms several baseline and competing methods in terms of classification accuracy and efficiency in cross-domain concept detection in the TRECVID corpus.
Abstract: Many multimedia applications can benefit from techniques for adapting existing classifiers to data with different distributions. One example is cross-domain video concept detection which aims to adapt concept classifiers across various video domains. In this paper, we explore two key problems for classifier adaptation: (1) how to transform existing classifier(s) into an effective classifier for a new dataset that only has a limited number of labeled examples, and (2) how to select the best existing classifier(s) for adaptation. For the first problem, we propose Adaptive Support Vector Machines (A-SVMs) as a general method to adapt one or more existing classifiers of any type to the new dataset. It aims to learn the "delta function" between the original and adapted classifier using an objective function similar to SVMs. For the second problem, we estimate the performance of each existing classifier on the sparsely-labeled new dataset by analyzing its score distribution and other meta features, and select the classifiers with the best estimated performance. The proposed method outperforms several baseline and competing methods in terms of classification accuracy and efficiency in cross-domain concept detection in the TRECVID corpus.

28 Jun 2007
TL;DR: TMVA is a toolkit, integrated into the analysis framework ROOT, that hosts a large variety of multivariate classification algorithms, ranging from rectangular cut optimization using a genetic algorithm and one- and multidimensional likelihood estimators, over linear and nonlinear discriminants and neural networks, to more recent classifiers such as a support vector machine, boosted decision trees and rule ensemble fitting.
Abstract: In high-energy physics, with the search for ever smaller signals in ever larger data sets, it has become essential to extract a maximum of the available information from the data. Multivariate classification methods based on machine learning techniques have become a fundamental ingredient to most analyses. Also the multivariate classifiers themselves have significantly evolved in recent years. Statisticians have found new ways to tune and to combine classifiers to further gain in performance. Integrated into the analysis framework ROOT, TMVA is a toolkit which hosts a large variety of multivariate classification algorithms. They range from rectangular cut optimization using a genetic algorithm and from one- and multidimensional likelihood estimators, over linear and nonlinear discriminants and neural networks, to sophisticated more recent classifiers such as a support vector machine, boosted decision trees and rule ensemble fitting. TMVA manages the simultaneous training, testing, and performance evaluation of all these classifiers with a user-friendly interface, and expedites the application of the trained classifiers to data.

Journal Article
TL;DR: An automatic road-sign detection and recognition system based on support vector machines that is able to detect and recognize circular, rectangular, triangular, and octagonal signs and, hence, covers all existing Spanish traffic-sign shapes.
Abstract: This paper presents an automatic road-sign detection and recognition system based on support vector machines (SVMs). In automatic traffic-sign maintenance and in a visual driver-assistance system, road-sign detection and recognition are two of the most important functions. Our system is able to detect and recognize circular, rectangular, triangular, and octagonal signs and, hence, covers all existing Spanish traffic-sign shapes. Road signs provide drivers important information and help them to drive more safely and more easily by guiding and warning them and thus regulating their actions. The proposed recognition system is based on the generalization properties of SVMs. The system consists of three stages: 1) segmentation according to the color of the pixel; 2) traffic-sign detection by shape classification using linear SVMs; and 3) content recognition based on Gaussian-kernel SVMs. Because the segmentation stage uses red, blue, yellow, white, or combinations of these colors, all traffic signs can be detected, and some of them can be detected by several colors. Results show a high success rate and a very low amount of false positives in the final recognition stage. From these results, we can conclude that the proposed algorithm is invariant to translation, rotation, scale, and, in many situations, even to partial occlusions.

Journal Article
TL;DR: Through both simulated data and real life data, it is shown that this method performs very well in multivariate classification problems, often outperforms the PAM method and can be as competitive as the support vector machines classifiers.
Abstract: In this paper, we introduce a modified version of linear discriminant analysis, called the "shrunken centroids regularized discriminant analysis" (SCRDA). This method generalizes the idea of the "nearest shrunken centroids" (NSC) (Tibshirani and others, 2003) into the classical discriminant analysis. The SCRDA method is specially designed for classification problems in high dimension low sample size situations, for example, microarray data. Through both simulated data and real life data, it is shown that this method performs very well in multivariate classification problems, often outperforms the PAM method (using the NSC algorithm) and can be as competitive as the support vector machines classifiers. It is also suitable for feature elimination purpose and can be used as gene selection method. The open source R package for this method (named "rda") is available on CRAN (http://www.r-project.org) for download and testing.

Journal Article
TL;DR: The introduction of the composite-kernel framework drastically improves results, and the new fast formulation ranks almost linearly in the computational cost, rather than cubic as in the original method, thus allowing the use of this method in remote-sensing applications.
Abstract: This paper presents a semi-supervised graph-based method for the classification of hyperspectral images. The method is designed to handle the special characteristics of hyperspectral images, namely, high-input dimension of pixels, low number of labeled samples, and spatial variability of the spectral signature. To alleviate these problems, the method incorporates three ingredients, respectively. First, being a kernel-based method, it combats the curse of dimensionality efficiently. Second, following a semi-supervised approach, it exploits the wealth of unlabeled samples in the image, and naturally gives relative importance to the labeled ones through a graph-based methodology. Finally, it incorporates contextual information through a full family of composite kernels. Noting that the graph method relies on inverting a huge kernel matrix formed by both labeled and unlabeled samples, we originally introduce the Nyström method in the formulation to speed up the classification process. The presented semi-supervised-graph-based method is compared to state-of-the-art support vector machines in the classification of hyperspectral data. The proposed method produces better classification maps, which capture the intrinsic structure collectively revealed by labeled and unlabeled points. Good and stable accuracy is produced in ill-posed classification problems (high dimensional spaces and low number of labeled samples). In addition, the introduction of the composite-kernel framework drastically improves results, and the new fast formulation ranks almost linearly in the computational cost, rather than cubic as in the original method, thus allowing the use of this method in remote-sensing applications.

Journal Article
TL;DR: A methodological approach to the classification of pigmented skin lesions in dermoscopy images is presented and the issue of class imbalance is addressed using various sampling strategies and the classifier generalization error is estimated using Monte Carlo cross validation.

Proceedings Article
17 Jun 2007
TL;DR: This paper introduces an algorithm for learning shapelet features, a set of mid-level features that are built from low-level gradient information that discriminates between pedestrian and non-pedestrian classes on the INRIA dataset.
Abstract: In this paper, we address the problem of detecting pedestrians in still images. We introduce an algorithm for learning shapelet features, a set of mid-level features. These features are focused on local regions of the image and are built from low-level gradient information that discriminates between pedestrian and non-pedestrian classes. Using AdaBoost, these shapelet features are created as a combination of oriented gradient responses. To train the final classifier, we use AdaBoost for a second time to select a subset of our learned shapelets. By first focusing locally on smaller feature sets, our algorithm attempts to harvest more useful information than by examining all the low-level features together. We present quantitative results demonstrating the effectiveness of our algorithm. In particular, we obtain an error rate 14 percentage points lower (at 10⁻⁶ FPPW) than the previous state of the art detector of Dalal and Triggs on the INRIA dataset.

Proceedings Article
17 Jun 2007
TL;DR: A hierarchical model that can be characterized as a constellation of bags-of-features and that is able to combine both spatial and spatial-temporal features is proposed and shown to improve the classification performance over bag of feature models.
Abstract: We present a novel model for human action categorization. A video sequence is represented as a collection of spatial and spatial-temporal features by extracting static and dynamic interest points. We propose a hierarchical model that can be characterized as a constellation of bags-of-features and that is able to combine both spatial and spatial-temporal features. Given a novel video sequence, the model is able to categorize human actions in a frame-by-frame basis. We test the model on a publicly available human action dataset [2] and show that our new method performs well on the classification task. We also conducted control experiments to show that the use of the proposed mixture of hierarchical models improves the classification performance over bag of feature models. An additional experiment shows that using both dynamic and static features provides a richer representation of human actions when compared to the use of a single feature type, as demonstrated by our evaluation in the classification task.

Proceedings Article
15 Feb 2007
TL;DR: In this article, a support vector machine (SVM) was used to construct a new multi-class JPEG steganalyzer with markedly improved performance by extending the 23 DCT feature set and applying calibration to the Markov features.
Abstract: Blind steganalysis based on classifying feature vectors derived from images is becoming increasingly more powerful. For steganalysis of JPEG images, features derived directly in the embedding domain from DCT coefficients appear to achieve the best performance (e.g., the DCT features [10] and Markov features [21]). The goal of this paper is to construct a new multi-class JPEG steganalyzer with markedly improved performance. We do so first by extending the 23 DCT feature set [10], then applying calibration to the Markov features described in [21] and reducing their dimension. The resulting feature sets are merged, producing a 274-dimensional feature vector. The new feature set is then used to construct a Support Vector Machine multi-classifier capable of assigning stego images to six popular steganographic algorithms: F5 [22], OutGuess [18], Model Based Steganography without [19] and with [20] deblocking, JP Hide&Seek [1], and Steghide [14]. Compared to our previous work on multi-classification [11, 12], the new feature set provides significantly more reliable results.

Journal Article
TL;DR: This paper illustrates the use of a Decision Tree that identifies the best features from a given set of samples for the purpose of classification using Proximal Support Vector Machine (PSVM), which has the capability to efficiently classify the faults using statistical features.

Journal Article
TL;DR: Two hybrid approaches for modeling IDS are presented as a hierarchical hybrid intelligent system model (DT-SVM) and an ensemble approach combining the base classifiers to maximize detection accuracy and minimize computational complexity.

Journal Article
01 Oct 2007
TL;DR: This paper presents a new approach of combination of SVM and DGSOT, which starts with an initial training set and expands it gradually using the clustering structure produced by the DGSOT algorithm, which has proved to overcome the drawbacks of traditional hierarchical clustering algorithms.
Abstract: Whenever an intrusion occurs, the security and value of a computer system is compromised. Network-based attacks make it difficult for legitimate users to access various network services by purposely occupying or sabotaging network resources and services. This can be done by sending large amounts of network traffic, exploiting well-known faults in networking services, and by overloading network hosts. Intrusion Detection attempts to detect computer attacks by examining various data records observed in processes on the network and it is split into two groups, anomaly detection systems and misuse detection systems. Anomaly detection is an attempt to search for malicious behavior that deviates from established normal patterns. Misuse detection is used to identify intrusions that match known attack scenarios. Our interest here is in anomaly detection and our proposed method is a scalable solution for detecting network-based anomalies. We use Support Vector Machines (SVM) for classification. The SVM is one of the most successful classification algorithms in the data mining area, but its long training time limits its use. This paper presents a study for enhancing the training time of SVM, specifically when dealing with large data sets, using hierarchical clustering analysis. We use the Dynamically Growing Self-Organizing Tree (DGSOT) algorithm for clustering because it has proved to overcome the drawbacks of traditional hierarchical clustering algorithms (e.g., hierarchical agglomerative clustering). Clustering analysis helps find the boundary points, which are the most qualified data points to train SVM, between two classes. We present a new approach of combination of SVM and DGSOT, which starts with an initial training set and expands it gradually using the clustering structure produced by the DGSOT algorithm. 
We compare our approach with the Rocchio Bundling technique and random selection in terms of accuracy loss and training time gain using a single benchmark real data set. We show that our proposed variations contribute significantly in improving the training process of SVM with high generalization accuracy and outperform the Rocchio Bundling technique.

Journal Article
TL;DR: A new SVM approach is proposed, named Enhanced SVM, which combines these two methods in order to provide unsupervised learning and low false alarm capability, similar to that of a supervised SVM approach.

Journal Article
TL;DR: The proposed SVM-based fusion approach outperforms all other approaches and significantly improves the results of a single SVM, which is trained on the whole multisensor data set.
Abstract: The classification of multisensor data sets, consisting of multitemporal synthetic aperture radar data and optical imagery, is addressed. The concept is based on the decision fusion of different outputs. Each data source is treated separately and classified by a support vector machine (SVM). Instead of fusing the final classification outputs (i.e., land cover classes), the original outputs of each SVM discriminant function are used in the subsequent fusion process. This fusion is performed by another SVM, which is trained on the a priori outputs. In addition, two voting schemes are applied to create the final classification results. The results are compared with well-known parametric and nonparametric classifier methods, i.e., decision trees, the maximum-likelihood classifier, and classifier ensembles. The proposed SVM-based fusion approach outperforms all other approaches and significantly improves the results of a single SVM, which is trained on the whole multisensor data set.

Proceedings Article
04 Oct 2007
TL;DR: This study compares the predictive accuracy of several machine learning methods including Logistic Regression (LR), Classification and Regression Trees (CART), Bayesian Additive Regression Trees (BART), Support Vector Machines (SVM), Random Forests (RF), and Neural Networks (NNet) for predicting phishing emails.
Abstract: There are many applications available for phishing detection. However, unlike predicting spam, there are only few studies that compare machine learning techniques in predicting phishing. The present study compares the predictive accuracy of several machine learning methods including Logistic Regression (LR), Classification and Regression Trees (CART), Bayesian Additive Regression Trees (BART), Support Vector Machines (SVM), Random Forests (RF), and Neural Networks (NNet) for predicting phishing emails. A data set of 2889 phishing and legitimate emails is used in the comparative study. In addition, 43 features are used to train and test the classifiers.

Proceedings Article
06 Nov 2007
TL;DR: It is demonstrated that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes and an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset.
Abstract: This paper is concerned with the class imbalance problem which has been known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly fewer observations of the target concept. Various real-world classification tasks, such as medical diagnosis, text categorization and fraud detection suffer from this phenomenon. The standard machine learning algorithms yield better prediction performance with balanced datasets. In this paper, we demonstrate that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes. We also propose an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset. The proposed method yields an efficient querying system and allows active learning to be applied to very large datasets. Our experimental results show that with an early stopping criterion, active learning achieves a fast solution with competitive prediction performance in imbalanced data classification.
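
The idea of querying from a small random pool instead of scanning the whole dataset can be sketched as follows. This illustration assumes a linear decision function w·x + b and treats the instance closest to the hyperplane as the most informative; the function name, the `pool_size` default, and the informativeness criterion are assumptions and not necessarily the authors' exact procedure.

```python
import random


def select_query(unlabeled, w, b, pool_size=5, seed=None):
    """Return the index of the instance to label next (illustrative sketch).

    A random pool of at most pool_size candidates is drawn from `unlabeled`,
    and the candidate with the smallest |w.x + b| is chosen, so the cost of
    one query does not grow with the size of the full dataset.
    """
    if not unlabeled:
        raise ValueError("no unlabeled instances left")
    rng = random.Random(seed)
    pool = rng.sample(range(len(unlabeled)), min(pool_size, len(unlabeled)))

    def closeness(index):
        # Unnormalized distance to the hyperplane; sufficient for ranking.
        score = sum(wj * xj for wj, xj in zip(w, unlabeled[index])) + b
        return abs(score)

    return min(pool, key=closeness)
```

Instances near the decision boundary tend to be more evenly split between the classes than the dataset as a whole, which is the intuition behind the claim that active learning gives the learner more balanced classes.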

01 Jan 2007
TL;DR: In this article, the authors develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms, such as locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN).
Abstract: We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed up. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain ‘summation form,’ which allows them to be easily parallelized on multicore computers. We adapt Google's map-reduce [7] paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors.
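
The "summation form" idea in the abstract above can be illustrated with least squares, where the statistics Σ x xᵀ and Σ x y split into independent per-chunk sums. In the sketch below the mappers run sequentially for simplicity; on a multicore machine each chunk could be handled by a separate worker. The function names and the chunking scheme are assumptions made for illustration.

```python
from functools import reduce


def map_partial_sums(chunk):
    """Mapper: per-chunk sums of x*x^T (matrix A) and x*y (vector b)."""
    d = len(chunk[0][0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    for x, y in chunk:
        for i in range(d):
            b[i] += x[i] * y
            for j in range(d):
                A[i][j] += x[i] * x[j]
    return A, b


def reduce_sums(left, right):
    """Reducer: add two partial results element-wise."""
    A1, b1 = left
    A2, b2 = right
    A = [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    b = [p + q for p, q in zip(b1, b2)]
    return A, b


def summation_form(data, n_chunks=2):
    """Split (x, y) pairs into chunks, map each chunk, then reduce the results."""
    size = (len(data) + n_chunks - 1) // n_chunks
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return reduce(reduce_sums, map(map_partial_sums, chunks))
```

Solving A θ = b afterwards gives the least-squares parameters; only the two sums need to be communicated between workers, not the data itself.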

Book Chapter
14 Feb 2007
TL;DR: Support vector machines represent an extension to nonlinear models of the generalized portrait algorithm developed by Vapnik and Lerner, and are a group of supervised learning methods that can be applied to classification or regression.
Abstract: Kernel-based techniques (such as support vector machines, Bayes point machines, kernel principal component analysis, and Gaussian processes) represent a major development in machine learning algorithms. Support vector machines (SVM) are a group of supervised learning methods that can be applied to classification or regression. In a short period of time, SVM found numerous applications in chemistry, such as in drug design (discriminating between ligands and nonligands, inhibitors and noninhibitors, etc.), quantitative structure-activity relationships (QSAR, where SVM regression is used to predict various physical, chemical, or biological properties), chemometrics (optimization of chromatographic separation or compound concentration prediction from spectral data as examples), sensors (for qualitative and quantitative prediction from sensor data), chemical engineering (fault detection and modeling of industrial processes), and text mining (automatic recognition of scientific information). Support vector machines represent an extension to nonlinear models of the generalized portrait algorithm developed by Vapnik and Lerner. The SVM algorithm is based on the statistical learning theory and the Vapnik–Chervonenkis