
Showing papers on "Support vector machine published in 2006"


Journal ArticleDOI
TL;DR: A new learning algorithm called the extreme learning machine (ELM) is proposed for single-hidden-layer feedforward neural networks (SLFNs); it randomly chooses the hidden nodes, analytically determines the output weights, and tends to provide good generalization performance at extremely fast learning speed.

10,217 citations
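
For readers unfamiliar with ELM, a minimal sketch of the idea follows, assuming a tanh activation and NumPy; the function names are illustrative, not the authors' code. The hidden layer is random, and the output weights come from a single least-squares solve.

```python
# Minimal ELM sketch: random hidden layer, output weights solved
# analytically by least squares (via the pseudo-inverse).
import numpy as np

def elm_train(X, y, n_hidden=50, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                       # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y                 # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```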


Journal ArticleDOI
TL;DR: Support vector machines are becoming popular in a wide variety of biological applications, but how do they work and what are their most promising applications in the life sciences?
Abstract: Support vector machines (SVMs) are becoming popular in a wide variety of biological applications. But, what exactly are SVMs and how do they work? And what are their most promising applications in the life sciences?

3,801 citations


Proceedings ArticleDOI
25 Jun 2006
TL;DR: A large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps is presented.
Abstract: A number of supervised learning methods have been introduced in the last decade. Unfortunately, the last comprehensive empirical evaluation of supervised learning was the Statlog Project in the early 90's. We present a large-scale empirical comparison between ten supervised learning methods: SVMs, neural nets, logistic regression, naive bayes, memory-based learning, random forests, decision trees, bagged trees, boosted trees, and boosted stumps. We also examine the effect that calibrating the models via Platt Scaling and Isotonic Regression has on their performance. An important aspect of our study is the use of a variety of performance criteria to evaluate the learning methods.

2,450 citations
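
As a hedged illustration of the calibration step the study examines, scikit-learn exposes both Platt scaling ("sigmoid") and isotonic regression; the dataset and parameters below are placeholders, not the study's setup.

```python
# Calibrating SVM decision scores into probabilities with Platt
# scaling (sigmoid) and isotonic regression.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for method in ("sigmoid", "isotonic"):      # sigmoid == Platt scaling
    clf = CalibratedClassifierCV(LinearSVC(), method=method, cv=3)
    clf.fit(X_tr, y_tr)
    proba = clf.predict_proba(X_te)[:, 1]   # calibrated P(y = 1)
```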


Proceedings ArticleDOI
20 Aug 2006
TL;DR: A cutting-plane algorithm for training linear SVMs is presented that provably has training time O(sn) for classification problems and O(sn log(n)) for ordinal regression problems, making it several orders of magnitude faster than decomposition methods such as SVM-Light on large datasets.
Abstract: Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for high-dimensional sparse data commonly encountered in applications like text classification, word-sense disambiguation, and drug design. These applications involve a large number of examples n as well as a large number of features N, while each example has only s ≪ N non-zero features.

2,173 citations


Journal ArticleDOI
17 Jun 2006
TL;DR: A large-scale evaluation of an approach that represents images as distributions of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance.
Abstract: Recently, methods based on local image features have shown promise for texture and object recognition tasks. This paper presents a large-scale evaluation of an approach that represents images as distributions (signatures or histograms) of features extracted from a sparse set of keypoint locations and learns a Support Vector Machine classifier with kernels based on two effective measures for comparing distributions, the Earth Mover’s Distance and the χ2 distance. We first evaluate the performance of our approach with different keypoint detectors and descriptors, as well as different kernels and classifiers. We then conduct a comparative evaluation with several state-of-the-art recognition methods on 4 texture and 5 object databases. On most of these databases, our implementation exceeds the best reported results and achieves comparable performance on the rest. Finally, we investigate the influence of background correlations on recognition performance.

1,863 citations
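
As a sketch of one of the two comparison measures, the exponential χ2 kernel on histograms can be written as follows; conventions differ by a factor of 1/2 in the distance, and gamma plays the role of the paper's normalization constant.

```python
# Exponential chi-square kernel between nonnegative histograms:
# K(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)).
import numpy as np

def chi2_kernel(X, Y, gamma=1.0, eps=1e-10):
    num = (X[:, None, :] - Y[None, :, :]) ** 2
    den = X[:, None, :] + Y[None, :, :] + eps   # avoid division by zero
    d = (num / den).sum(axis=-1)                # chi-square distances
    return np.exp(-gamma * d)                   # kernel matrix (n, m)
```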


Book
01 Jan 2006
TL;DR: This book discusses Feature Extraction for Classification of Proteomic Mass Spectra, Sequence Motifs: Highly Predictive Features of Protein Function, and Combining a Filter Method with SVMs.
Abstract: An Introduction to Feature Extraction.- An Introduction to Feature Extraction.- Feature Extraction Fundamentals.- Learning Machines.- Assessment Methods.- Filter Methods.- Search Strategies.- Embedded Methods.- Information-Theoretic Methods.- Ensemble Learning.- Fuzzy Neural Networks.- Feature Selection Challenge.- Design and Analysis of the NIPS2003 Challenge.- High Dimensional Classification with Bayesian Neural Networks and Dirichlet Diffusion Trees.- Ensembles of Regularized Least Squares Classifiers for High-Dimensional Problems.- Combining SVMs with Various Feature Selection Strategies.- Feature Selection with Transductive Support Vector Machines.- Variable Selection using Correlation and Single Variable Classifier Methods: Applications.- Tree-Based Ensembles with Dynamic Soft Feature Selection.- Sparse, Flexible and Efficient Modeling using L 1 Regularization.- Margin Based Feature Selection and Infogain with Standard Classifiers.- Bayesian Support Vector Machines for Feature Ranking and Selection.- Nonlinear Feature Selection with the Potential Support Vector Machine.- Combining a Filter Method with SVMs.- Feature Selection via Sensitivity Analysis with Direct Kernel PLS.- Information Gain, Correlation and Support Vector Machines.- Mining for Complex Models Comprising Feature Selection and Classification.- Combining Information-Based Supervised and Unsupervised Feature Selection.- An Enhanced Selective Naive Bayes Method with Optimal Discretization.- An Input Variable Importance Definition based on Empirical Data Probability Distribution.- New Perspectives in Feature Extraction.- Spectral Dimensionality Reduction.- Constructing Orthogonal Latent Features for Arbitrary Loss.- Large Margin Principles for Feature Selection.- Feature Extraction for Classification of Proteomic Mass Spectra: A Comparative Study.- Sequence Motifs: Highly Predictive Features of Protein Function.

1,593 citations


Journal ArticleDOI
TL;DR: This research presents a genetic algorithm approach for feature selection and parameters optimization to solve the problem of optimizing parameters and feature subset without degrading the SVM classification accuracy.
Abstract: Support Vector Machines, one of the new techniques for pattern classification, have been widely used in many application areas. The kernel parameters setting for SVM in a training process impacts on the classification accuracy. Feature selection is another factor that impacts classification accuracy. The objective of this research is to simultaneously optimize the parameters and feature subset without degrading the SVM classification accuracy. We present a genetic algorithm approach for feature selection and parameters optimization to solve this kind of problem. We tried several real-world datasets using the proposed GA-based approach and the Grid algorithm, a traditional method of performing parameters searching. Compared with the Grid algorithm, our proposed GA-based approach significantly improves the classification accuracy and has fewer input features for support vector machines.

1,316 citations
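
The grid-search baseline the paper compares against is easy to reproduce in spirit; the exponential grid below follows common LIBSVM practice and is an assumption, not the paper's exact grid.

```python
# Exhaustive (C, gamma) grid search for an RBF SVM: the baseline
# that the GA-based approach is compared with.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, random_state=0)
param_grid = {"C": [2.0**k for k in range(-5, 16, 2)],
              "gamma": [2.0**k for k in range(-15, 4, 2)]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5).fit(X, y)
# search.best_params_ holds the selected (C, gamma) pair
```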


Journal ArticleDOI
TL;DR: It is shown that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error.
Abstract: Cross-validation (CV) is an effective method for estimating the prediction error of a classifier. Some recent articles have proposed methods for optimizing classifiers by choosing classifier parameter values that minimize the CV error estimate. We have evaluated the validity of using the CV error estimate of the optimized classifier as an estimate of the true error expected on independent data. We used CV to optimize the classification parameters for two kinds of classifiers: Shrunken Centroids and Support Vector Machines (SVM). Random training datasets were created, with no difference in the distribution of the features between the two classes. Using these "null" datasets, we selected classifier parameter values that minimized the CV error estimate. 10-fold CV was used for Shrunken Centroids while Leave-One-Out-CV (LOOCV) was used for the SVM. Independent test data was created to estimate the true error. With "null" and "non null" (with differential expression between the classes) data, we also tested a nested CV procedure, where an inner CV loop is used to perform the tuning of the parameters while an outer CV is used to compute an estimate of the error. The CV error estimate for the classifier with the optimal parameters was found to be a substantially biased estimate of the true error that the classifier would incur on independent data. Even though there is no real difference between the two classes for the "null" datasets, the CV error estimate for the Shrunken Centroid with the optimal parameters was less than 30% on 18.5% of simulated training datasets. For SVM with optimal parameters the estimated error rate was less than 30% on 38% of "null" datasets. Performance of the optimized classifiers on the independent test set was no better than chance. The nested CV procedure reduces the bias considerably and gives an estimate of the error that is very close to that obtained on the independent testing set for both Shrunken Centroids and SVM classifiers for "null" and "non-null" data distributions. We show that using CV to compute an error estimate for a classifier that has itself been tuned using CV gives a significantly biased estimate of the true error. Proper use of CV for estimating true error of a classifier developed using a well-defined algorithm requires that all steps of the algorithm, including classifier parameter tuning, be repeated in each CV loop. A nested CV procedure provides an almost unbiased estimate of the true error.

1,314 citations
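
A hedged scikit-learn sketch of the recommended nested procedure; the paper's own experiments used Shrunken Centroids and LOOCV, so the folds and grid here are placeholders.

```python
# Nested CV: the inner loop tunes C, the outer loop estimates error.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, random_state=0)
inner = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)  # tuning loop
outer_scores = cross_val_score(inner, X, y, cv=10)      # error estimate
# Reporting the inner loop's best CV score instead of
# outer_scores.mean() is exactly the biased shortcut the paper warns about.
```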


Proceedings Article
04 Dec 2006
TL;DR: This work shows that algorithms that fit the Statistical Query model can be written in a certain "summation form," which allows them to be easily parallelized on multicore computers, with experiments showing basically linear speedup as the number of processors increases.
Abstract: We are at the beginning of the multicore era. Computers will have increasingly many cores (processors), but there is still no good programming framework for these architectures, and thus no simple and unified way for machine learning to take advantage of the potential speed up. In this paper, we develop a broadly applicable parallel programming method, one that is easily applied to many different learning algorithms. Our work is in distinct contrast to the tradition in machine learning of designing (often ingenious) ways to speed up a single algorithm at a time. Specifically, we show that algorithms that fit the Statistical Query model [15] can be written in a certain "summation form," which allows them to be easily parallelized on multicore computers. We adapt Google's map-reduce [7] paradigm to demonstrate this parallel speed up technique on a variety of learning algorithms including locally weighted linear regression (LWLR), k-means, logistic regression (LR), naive Bayes (NB), SVM, ICA, PCA, gaussian discriminant analysis (GDA), EM, and backpropagation (NN). Our experimental results show basically linear speedup with an increasing number of processors.

1,310 citations
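
A minimal sketch of the "summation form" idea, using the batch gradient of logistic regression as the statistic and Python's multiprocessing as a stand-in for map-reduce; all names here are illustrative.

```python
# Each mapper sums the gradient over its data shard; the reduce step
# adds the partial sums. Run under `if __name__ == "__main__":` on
# platforms that spawn worker processes.
import numpy as np
from multiprocessing import Pool

def partial_grad(args):
    X, y, w = args                       # one shard of the data
    p = 1.0 / (1.0 + np.exp(-X @ w))     # predicted probabilities
    return X.T @ (p - y)                 # shard's gradient contribution

def mapreduce_grad(shards, w, n_workers=4):
    with Pool(n_workers) as pool:
        parts = pool.map(partial_grad, [(X, y, w) for X, y in shards])
    return sum(parts)                    # the "reduce" step
```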


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This work considers visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories and proposes a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice.
Abstract: We consider visual category recognition in the framework of measuring similarities, or equivalently perceptual distances, to prototype examples of categories. This approach is quite flexible, and permits recognition based on color, texture, and particularly shape, in a homogeneous framework. While nearest neighbor classifiers are natural in this setting, they suffer from the problem of high variance (in bias-variance decomposition) in the case of limited sampling. Alternatively, one could use support vector machines but they involve time-consuming optimization and computation of pairwise distances. We propose a hybrid of these two methods which deals naturally with the multiclass setting, has reasonable computational complexity both in training and at run time, and yields excellent results in practice. The basic idea is to find close neighbors to a query sample and train a local support vector machine that preserves the distance function on the collection of neighbors. Our method can be applied to large, multiclass data sets for which it outperforms nearest neighbor and support vector machines, and remains efficient when the problem becomes intractable for support vector machines. A wide variety of distance functions can be used and our experiments show state-of-the-art performance on a number of benchmark data sets for shape and texture classification (MNIST, USPS, CUReT) and object recognition (Caltech-101). On Caltech-101 we achieved a correct classification rate of 59.05% (±0.56%) at 15 training images per class, and 66.23% (±0.48%) at 30 training images.

1,265 citations
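
The hybrid can be sketched in a few lines: find the query's neighbors, shortcut if they agree, otherwise train a local SVM on them. This is a simplification of the paper's method, which also preserves the chosen distance function on the neighbor set.

```python
# SVM-KNN sketch: a local SVM trained on the query's k neighbors.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

def svm_knn_predict(X_train, y_train, x_query, k=50):
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)  # build once in practice
    idx = nn.kneighbors(x_query.reshape(1, -1), return_distance=False)[0]
    if len(set(y_train[idx])) == 1:      # all neighbors agree: done
        return y_train[idx][0]
    local = SVC(kernel="linear").fit(X_train[idx], y_train[idx])
    return local.predict(x_query.reshape(1, -1))[0]
```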


Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper proposes a novel on-line AdaBoost feature selection method and demonstrates the multifariousness of the method on such diverse tasks as learning complex background models, visual tracking and object detection.
Abstract: Boosting has become very popular in computer vision, showing impressive performance in detection and recognition tasks. Mainly off-line training methods have been used, which implies that all training data has to be a priori given; training and usage of the classifier are separate steps. Training the classifier on-line and incrementally as new data becomes available has several advantages and opens new areas of application for boosting in computer vision. In this paper we propose a novel on-line AdaBoost feature selection method. In conjunction with efficient feature extraction methods, the method is real-time capable. We demonstrate the multifariousness of the method on such diverse tasks as learning complex background models, visual tracking and object detection. All approaches benefit significantly by the on-line training.

Journal ArticleDOI
TL;DR: This work examines the idea of using the GMM supervector in a support vector machine (SVM) classifier and proposes two new SVM kernels based on distance metrics between GMM models that produce excellent classification accuracy in a NIST speaker recognition evaluation task.
Abstract: Gaussian mixture models (GMMs) have proven extremely successful for text-independent speaker recognition. The standard training method for GMM models is to use MAP adaptation of the means of the mixture components based on speech from a target speaker. Recent methods in compensation for speaker and channel variability have proposed the idea of stacking the means of the GMM model to form a GMM mean supervector. We examine the idea of using the GMM supervector in a support vector machine (SVM) classifier. We propose two new SVM kernels based on distance metrics between GMM models. We show that these SVM kernels produce excellent classification accuracy in a NIST speaker recognition evaluation task.
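
A hedged sketch of the supervector construction; for brevity it fits a fresh GMM per utterance with scikit-learn, whereas the paper MAP-adapts the means of a universal background model.

```python
# GMM mean supervector: stack the component means into one vector,
# which then feeds a (typically linear) SVM.
import numpy as np
from sklearn.mixture import GaussianMixture

def supervector(frames, n_components=8):
    gmm = GaussianMixture(n_components, covariance_type="diag",
                          random_state=0).fit(frames)
    return gmm.means_.ravel()   # shape (n_components * n_features,)
```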

Journal ArticleDOI
TL;DR: This framework of composite kernels demonstrates enhanced classification accuracy as compared to traditional approaches that take into account the spectral information only, flexibility to balance between the spatial and spectral information in the classifier, and computational efficiency.
Abstract: This letter presents a framework of composite kernel machines for enhanced classification of hyperspectral images. This novel method exploits the properties of Mercer's kernels to construct a family of composite kernels that easily combine spatial and spectral information. This framework of composite kernels demonstrates: 1) enhanced classification accuracy as compared to traditional approaches that take into account the spectral information only; 2) flexibility to balance between the spatial and spectral information in the classifier; and 3) computational efficiency. In addition, the proposed family of kernel classifiers opens a wide field for future developments in which spatial and spectral information can be easily integrated.
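
One member of the proposed family, the weighted summation kernel, is simple to write down; mu balances spectral against spatial information, and the feature names below are placeholders.

```python
# Composite kernel: convex combination of a spectral and a spatial
# RBF kernel; SVC(kernel="precomputed") consumes the result.
from sklearn.metrics.pairwise import rbf_kernel

def weighted_sum_kernel(X_spectral, X_spatial, mu=0.5, gamma=1.0):
    return (mu * rbf_kernel(X_spectral, gamma=gamma)
            + (1 - mu) * rbf_kernel(X_spatial, gamma=gamma))
```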

Book ChapterDOI
01 Jan 2006
TL;DR: This article investigates the performance of combining support vector machines (SVM) and various feature selection strategies, some are filter-type approaches: general feature selection methods independent of SVM, and some are wrapper-type methods: modifications of S VM which can be used to select features.
Abstract: This article investigates the performance of combining support vector machines (SVM) and various feature selection strategies. Some of them are filter-type approaches: general feature selection methods independent of SVM, and some are wrapper-type methods: modifications of SVM which can be used to select features. We applied these strategies while participating in the NIPS 2003 Feature Selection Challenge, in which we ranked third as a group.
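
A minimal sketch of one filter-type strategy of the kind the chapter studies: rank features by an ANOVA F-score independently of the SVM, keep the top k, then train. The exact filter and value of k are assumptions.

```python
# Filter-type feature selection (F-score) followed by an SVM.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=100,
                           n_informative=10, random_state=0)
pipe = make_pipeline(SelectKBest(f_classif, k=20), SVC()).fit(X, y)
```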

Journal ArticleDOI
TL;DR: An asymmetric bagging and random subspace SVM (ABRS-SVM) is built to solve three problems and further improve the relevance feedback performance.
Abstract: Relevance feedback schemes based on support vector machines (SVM) have been widely used in content-based image retrieval (CBIR). However, the performance of SVM-based relevance feedback is often poor when the number of labeled positive feedback samples is small. This is mainly due to three reasons: 1) an SVM classifier is unstable on a small-sized training set, 2) SVM's optimal hyperplane may be biased when the positive feedback samples are much less than the negative feedback samples, and 3) overfitting happens because the number of feature dimensions is much higher than the size of the training set. In this paper, we develop a mechanism to overcome these problems. To address the first two problems, we propose an asymmetric bagging-based SVM (AB-SVM). For the third problem, we combine the random subspace method and SVM for relevance feedback, which is named random subspace SVM (RS-SVM). Finally, by integrating AB-SVM and RS-SVM, an asymmetric bagging and random subspace SVM (ABRS-SVM) is built to solve these three problems and further improve the relevance feedback performance.
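
A hedged sketch of the combined idea: every weak SVM sees all positive feedback samples, a balanced bootstrap of the negatives, and a random half of the feature dimensions; predictions are aggregated by voting. The hyperparameters are placeholders.

```python
# Asymmetric bagging + random subspaces for small positive sets.
import numpy as np
from sklearn.svm import SVC

def train_abrs(X_pos, X_neg, n_models=11, seed=0):
    rng, d, models = np.random.default_rng(seed), X_pos.shape[1], []
    for _ in range(n_models):
        neg = X_neg[rng.integers(len(X_neg), size=len(X_pos))]  # balance classes
        feats = rng.choice(d, size=d // 2, replace=False)       # random subspace
        X = np.vstack([X_pos, neg])[:, feats]
        y = np.r_[np.ones(len(X_pos)), np.zeros(len(neg))]
        models.append((feats, SVC().fit(X, y)))
    return models

def predict_abrs(models, X):
    votes = np.mean([m.predict(X[:, f]) for f, m in models], axis=0)
    return (votes > 0.5).astype(int)   # majority vote
```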

01 Jan 2006
TL;DR: This research identifies a set of features that are key to the superior performance under the supervised learning setup, and shows that a small subset of features always plays a significant role in the link prediction job.
Abstract: Social network analysis has attracted much attention in recent years. Link prediction is a key research direction within this area. In this research, we study link prediction as a supervised learning task. Along the way, we identify a set of features that are key to the superior performance under the supervised learning setup. The identified features are very easy to compute, and at the same time surprisingly effective in solving the link prediction problem. We also explain the effectiveness of the features from their class density distribution. Then we compare different classes of supervised learning algorithms in terms of their prediction performance using various performance metrics, such as accuracy, precision-recall, F-values, and squared error, with 5-fold cross validation. Our results on two practical social network datasets show that most of the well-known classification algorithms (decision tree, k-NN, multilayer perceptron, SVM, RBF network) can predict links with very good performance, but SVM beats all of them by a narrow margin on all performance measures. Again, ranking of features with popular feature ranking algorithms shows that a small subset of features always plays a significant role in the link prediction task.
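
The abstract does not enumerate the features, so the following sketch uses three standard, easy-to-compute link-prediction features as stand-ins (common neighbors, Jaccard coefficient, preferential attachment), computed with networkx; the paper's exact feature set may differ.

```python
# Assumed example features for a candidate link (u, v).
import networkx as nx

def pair_features(G, u, v):
    nu, nv = set(G[u]), set(G[v])
    common = len(nu & nv)                    # common neighbors
    jaccard = common / max(len(nu | nv), 1)  # Jaccard coefficient
    pref_attach = len(nu) * len(nv)          # preferential attachment
    return [common, jaccard, pref_attach]
```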

Journal ArticleDOI
TL;DR: A learning-based method for recovering 3D human body pose from single images and monocular image sequences, embedded in a novel regressive tracking framework, using dynamics from the previous state estimate together with a learned regression value to disambiguate the pose.
Abstract: We describe a learning-based method for recovering 3D human body pose from single images and monocular image sequences. Our approach requires neither an explicit body model nor prior labeling of body parts in the image. Instead, it recovers pose by direct nonlinear regression against shape descriptor vectors extracted automatically from image silhouettes. For robustness against local silhouette segmentation errors, silhouette shape is encoded by histogram-of-shape-contexts descriptors. We evaluate several different regression methods: ridge regression, relevance vector machine (RVM) regression, and support vector machine (SVM) regression over both linear and kernel bases. The RVMs provide much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. The loss of depth and limb labeling information often makes the recovery of 3D pose from single silhouettes ambiguous. To handle this, the method is embedded in a novel regressive tracking framework, using dynamics from the previous state estimate together with a learned regression value to disambiguate the pose. We show that the resulting system tracks long sequences stably. For realism and good generalization over a wide range of viewpoints, we train the regressors on images resynthesized from real human motion capture data. The method is demonstrated for several representations of full body pose, both quantitatively on independent but similar test data and qualitatively on real image sequences. Mean angular errors of 4-6° are obtained for a variety of walking motions.

Journal ArticleDOI
TL;DR: This work proposes a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple- instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels.
Abstract: Multiple-instance problems arise from the situations where training class labels are attached to sets of samples (named bags), instead of individual samples within each bag (called instances). Most previous multiple-instance learning (MIL) algorithms are developed based on the assumption that a bag is positive if and only if at least one of its instances is positive. Although the assumption works well in a drug activity prediction problem, it is rather restrictive for other applications, especially those in the computer vision area. We propose a learning method, MILES (multiple-instance learning via embedded instance selection), which converts the multiple-instance learning problem to a standard supervised learning problem that does not impose the assumption relating instance labels to bag labels. MILES maps each bag into a feature space defined by the instances in the training bags via an instance similarity measure. This feature mapping often provides a large number of redundant or irrelevant features. Hence, 1-norm SVM is applied to select important features as well as construct classifiers simultaneously. We have performed extensive experiments. In comparison with other methods, MILES demonstrates competitive classification accuracy, high computational efficiency, and robustness to labeling uncertainty.
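
The embedding at the heart of MILES is compact enough to sketch: a bag is represented by its maximum Gaussian similarity to each training instance, after which an L1-regularized linear SVM selects features and classifies. Sigma is a free parameter here.

```python
# MILES-style bag embedding: one coordinate per training instance.
import numpy as np

def miles_embed(bag, instance_pool, sigma=1.0):
    # bag: (m, d) instances of one bag; instance_pool: (K, d)
    d2 = ((bag[:, None, :] - instance_pool[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / sigma**2).max(axis=0)   # shape (K,)

# sklearn's LinearSVC(penalty="l1", dual=False) approximates the
# 1-norm SVM used for simultaneous feature selection.
```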

Journal ArticleDOI
TL;DR: The results indicate that while all methods attained acceptable performance levels, SWLDA and FLD provide the best overall performance and implementation characteristics for practical classification of P300 Speller data.
Abstract: This study assesses the relative performance characteristics of five established classification techniques on data collected using the P300 Speller paradigm, originally described by Farwell and Donchin (1988 Electroenceph. Clin. Neurophysiol. 70 510). Four linear methods: Pearson's correlation method (PCM), Fisher's linear discriminant (FLD), stepwise linear discriminant analysis (SWLDA) and a linear support vector machine (LSVM); and one nonlinear method: Gaussian kernel support vector machine (GSVM), are compared for classifying offline data from eight users. The relative performance of the classifiers is evaluated, along with the practical concerns regarding the implementation of the respective methods. The results indicate that while all methods attained acceptable performance levels, SWLDA and FLD provide the best overall performance and implementation characteristics for practical classification of P300 Speller data.

Journal ArticleDOI
TL;DR: Tests on simple examples as well as on a number of public data sets show the advantages of the proposed approach in both computation time and test set correctness.
Abstract: A new approach to support vector machine (SVM) classification is proposed wherein each of two data sets is proximal to one of two distinct planes that are not parallel to each other. Each plane is generated such that it is closest to one of the two data sets and as far as possible from the other data set. Each of the two nonparallel proximal planes is obtained by a single MATLAB command as the eigenvector corresponding to a smallest eigenvalue of a generalized eigenvalue problem. Classification by proximity to two distinct nonlinear surfaces generated by a nonlinear kernel also leads to two simple generalized eigenvalue problems. The effectiveness of the proposed method is demonstrated by tests on simple examples as well as on a number of public data sets. These examples show the advantages of the proposed approach in both computation time and test set correctness.
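
A sketch of the eigenvalue construction for one of the two planes, assuming a Tikhonov term delta for numerical stability; SciPy's generalized symmetric eigensolver plays the role of the single MATLAB command.

```python
# Plane closest to class A, farthest from class B: smallest eigenpair
# of G z = lambda H z with G = [A -e]'[A -e] and H = [B -e]'[B -e].
import numpy as np
from scipy.linalg import eigh

def proximal_plane(A, B, delta=1e-4):
    G = np.hstack([A, -np.ones((len(A), 1))])
    H = np.hstack([B, -np.ones((len(B), 1))])
    Gm = G.T @ G + delta * np.eye(G.shape[1])
    Hm = H.T @ H + delta * np.eye(H.shape[1])
    vals, vecs = eigh(Gm, Hm)        # ascending generalized eigenvalues
    z = vecs[:, 0]                   # eigenvector of the smallest one
    return z[:-1], z[-1]             # w, gamma for the plane w.x = gamma
```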

Journal ArticleDOI
S. Munder, Dariu M. Gavrila
TL;DR: An in-depth experimental study on pedestrian classification; multiple feature-classifier combinations are examined with respect to their ROC performance and efficiency and show that the novel combination of SVMs with LRF features performs best.
Abstract: Detecting people in images is key for several important application domains in computer vision. This paper presents an in-depth experimental study on pedestrian classification; multiple feature-classifier combinations are examined with respect to their ROC performance and efficiency. We investigate global versus local and adaptive versus nonadaptive features, as exemplified by PCA coefficients, Haar wavelets, and local receptive fields (LRFs). In terms of classifiers, we consider the popular support vector machines (SVMs), feedforward neural networks, and k-nearest neighbor classifier. Experiments are performed on a large data set consisting of 4,000 pedestrian and more than 25,000 nonpedestrian (labeled) images captured in outdoor urban environments. Statistically meaningful results are obtained by analyzing performance variances caused by varying training and test sets. Furthermore, we investigate how classification performance and training sample size are correlated. Sample size is adjusted by increasing the number of manually labeled training data or by employing automatic bootstrapping or cascade techniques. Our experiments show that the novel combination of SVMs with LRF features performs best. A boosted cascade of Haar wavelets can, however, reach quite competitive results, at a fraction of computational cost. The data set used in this paper is made public, establishing a benchmark for this important problem.

Proceedings ArticleDOI
14 May 2006
TL;DR: A support vector machine kernel is constructed using the GMM supervector and similarities based on this kernel between the method of SVM nuisance attribute projection (NAP) and the recent results in latent factor analysis are shown.
Abstract: Gaussian mixture models with universal backgrounds (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent factor analysis of this GMM supervector is an effective method for variability compensation. We consider this GMM supervector in the context of support vector machines. We construct a support vector machine kernel using the GMM supervector. We show similarities based on this kernel between the method of SVM nuisance attribute projection (NAP) and the recent results in latent factor analysis. Experiments on a NIST SRE 2005 corpus demonstrate the effectiveness of the new technique.

Journal ArticleDOI
TL;DR: The purpose of this paper is to present and compare the four R package implementations of support vector machines, which are among the most popular and efficient classification and regression methods currently available.
Abstract: Being among the most popular and efficient classification and regression methods currently available, implementations of support vector machines exist in almost every popular programming language. Currently four R packages contain SVM related software. The purpose of this paper is to present and compare these implementations.

Journal ArticleDOI
TL;DR: A novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed that is able to mitigate the effects of suboptimal model selection and can address multiclass cases.
Abstract: This paper introduces a semisupervised classification method that exploits both labeled and unlabeled samples for addressing ill-posed problems with support vector machines (SVMs). The method is based on recent developments in statistical learning theory concerning transductive inference and in particular transductive SVMs (TSVMs). TSVMs exploit specific iterative algorithms which gradually search a reliable separating hyperplane (in the kernel space) with a transductive process that incorporates both labeled and unlabeled samples in the training phase. Based on an analysis of the properties of the TSVMs presented in the literature, a novel modified TSVM classifier designed for addressing ill-posed remote-sensing problems is proposed. In particular, the proposed technique: 1) is based on a novel transductive procedure that exploits a weighting strategy for unlabeled patterns, based on a time-dependent criterion; 2) is able to mitigate the effects of suboptimal model selection (which is unavoidable in the presence of small-size training sets); and 3) can address multiclass cases. Experimental results confirm the effectiveness of the proposed method on a set of ill-posed remote-sensing classification problems representing different operative conditions.
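
A deliberately simplified sketch of the transductive loop; the paper's actual procedure uses a more elaborate time-dependent weighting of unlabeled patterns and handles multiclass cases.

```python
# Self-training caricature of a TSVM: pseudo-label the unlabeled
# pool, retrain with a weight that grows across iterations.
import numpy as np
from sklearn.svm import SVC

def simple_tsvm(X_lab, y_lab, X_unlab, n_iter=5):
    clf = SVC(kernel="rbf").fit(X_lab, y_lab)
    for it in range(1, n_iter + 1):
        y_pseudo = clf.predict(X_unlab)
        w = np.r_[np.ones(len(X_lab)),
                  np.full(len(X_unlab), it / n_iter)]  # growing weight
        clf = SVC(kernel="rbf").fit(np.vstack([X_lab, X_unlab]),
                                    np.r_[y_lab, y_pseudo], sample_weight=w)
    return clf
```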

Journal ArticleDOI
TL;DR: This work considers the application of SVMs to speaker and language recognition, using a sequence kernel that compares sequences of feature vectors and produces a measure of similarity; the kernel builds upon a simpler mean-squared-error classifier to produce a more accurate system.

Proceedings Article
04 Dec 2006
TL;DR: This work introduces Extremely Randomized Clustering Forests - ensembles of randomly created clustering trees - and shows that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks.
Abstract: Some of the most effective recent methods for content-based image classification work by extracting dense or sparse local image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating histograms of the resulting "visual word" codes over the image, and classifying these with a conventional classifier such as an SVM. Large numbers of descriptors and large codebooks are needed for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests - ensembles of randomly created clustering trees - and show that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks.
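
The conventional pipeline the paper starts from, and then accelerates by replacing k-means with clustering forests, looks roughly like this; descriptor extraction is assumed to have happened upstream.

```python
# Bag-of-visual-words baseline: k-means codebook, per-image
# histograms of word counts, then an SVM classifier.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def bow_histograms(descriptor_sets, n_words=100):
    km = KMeans(n_clusters=n_words, n_init=4,
                random_state=0).fit(np.vstack(descriptor_sets))
    hists = [np.bincount(km.predict(d), minlength=n_words) / len(d)
             for d in descriptor_sets]
    return np.array(hists), km
# SVC().fit(hists, labels) completes the baseline classifier.
```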

Journal ArticleDOI
TL;DR: A Bayesian framework for combining multiple classifiers based on the functional taxonomy constraints is developed using a hierarchy of support vector machine (SVM) classifiers trained on multiple data types to obtain the most probable consistent set of predictions.
Abstract: Motivation: Assigning functions for unknown genes based on diverse large-scale data is a key task in functional genomics. Previous work on gene function prediction has addressed this problem using independent classifiers for each function. However, such an approach ignores the structure of functional class taxonomies, such as the Gene Ontology (GO). Over a hierarchy of functional classes, a group of independent classifiers where each one predicts gene membership to a particular class can produce a hierarchically inconsistent set of predictions, where for a given gene a specific class may be predicted positive while its inclusive parent class is predicted negative. Taking the hierarchical structure into account resolves such inconsistencies and provides an opportunity for leveraging all classifiers in the hierarchy to achieve higher specificity of predictions. Results: We developed a Bayesian framework for combining multiple classifiers based on the functional taxonomy constraints. Using a hierarchy of support vector machine (SVM) classifiers trained on multiple data types, we combined predictions in our Bayesian framework to obtain the most probable consistent set of predictions. Experiments show that over a 105-node subhierarchy of the GO, our Bayesian framework improves predictions for 93 nodes. As an additional benefit, our method also provides implicit calibration of SVM margin outputs to probabilities. Using this method, we make function predictions for multiple proteins, and experimentally confirm predictions for proteins involved in mitosis. Supplementary information: Results for the 105 selected GO classes and predictions for 1059 unknown genes are available at: http://function.princeton.edu/genesite/ Contact: ogt@cs.princeton.edu

Journal ArticleDOI
TL;DR: This paper proposes a novel method using wavelets as input to neural network self-organizing maps and support vector machine for classification of magnetic resonance (MR) images of the human brain and tests the proposed approach using a dataset of 52 MR brain images.

Journal ArticleDOI
TL;DR: In this paper, a support vector machine (SVM) approach is proposed for statistical downscaling of precipitation at monthly time scale, and the effectiveness of this approach is illustrated through its application to meteorological sub-divisions in India.

Journal ArticleDOI
TL;DR: The support vector machine (SVM) is presented as a promising method for hydrological prediction and it is demonstrated that SVM is a very potential candidate for the prediction of long-term discharges.
Abstract: Accurate time- and site-specific forecasts of streamflow and reservoir inflow are important in effective hydropower reservoir management and scheduling. Traditionally, autoregressive moving-average (ARMA) models have been used in modelling water resource time series as a standard representation of stochastic time series. Recently, artificial neural network (ANN) approaches have been proven to be efficient when applied to hydrological prediction. In this paper, the support vector machine (SVM) is presented as a promising method for hydrological prediction. Over-fitting and local optimal solution are unlikely to occur with SVM, which implements the structural risk minimization principle rather than the empirical risk minimization principle. In order to identify appropriate parameters of the SVM prediction model, a shuffled complex evolution algorithm is performed through exponential transformation. The SVM prediction model is tested using the long-term observations of discharges of monthly river flow...
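
A hedged sketch of SVM regression for one-step-ahead monthly discharge forecasting from lagged values; the synthetic seasonal series, lag count, and (C, epsilon) settings are assumptions, and the paper tunes its parameters with a shuffled complex evolution algorithm rather than by hand.

```python
# Epsilon-SVR on lagged monthly values of a synthetic seasonal series.
import numpy as np
from sklearn.svm import SVR

t = np.arange(240)                       # 20 years of monthly data
q = (50 + 20 * np.sin(2 * np.pi * t / 12)
     + np.random.default_rng(0).normal(0, 3, 240))
lags = 3
X = np.column_stack([q[i:len(q) - lags + i] for i in range(lags)])
y = q[lags:]
model = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(X[:-24], y[:-24])
pred = model.predict(X[-24:])            # forecast the final two years
```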