
Showing papers on "Linear classifier published in 2008"


Journal Article
TL;DR: LIBLINEAR is an open source library for large-scale linear classification that supports logistic regression and linear support vector machines and provides easy-to-use command-line tools and library calls for users and developers.
Abstract: LIBLINEAR is an open source library for large-scale linear classification. It supports logistic regression and linear support vector machines. We provide easy-to-use command-line tools and library calls for users and developers. Comprehensive documents are available for both beginners and advanced users. Experiments demonstrate that LIBLINEAR is very efficient on large sparse data sets.
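For readers who want to try this from Python rather than LIBLINEAR's C command-line tools, scikit-learn's liblinear-backed estimators expose the same two model families; a minimal sketch on synthetic data (dataset and settings are illustrative, not from the paper):

```python
# LIBLINEAR via scikit-learn: LinearSVC and LogisticRegression with the
# "liblinear" solver both call into the LIBLINEAR library under the hood.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=5000, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for clf in (LinearSVC(C=1.0), LogisticRegression(solver="liblinear", C=1.0)):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "test accuracy:", clf.score(X_te, y_te))
```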

7,848 citations


Journal ArticleDOI
TL;DR: Support vector machines are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data.
Abstract: The increasing wealth of biological data coming from a large variety of platforms and the continued development of new high-throughput methods for probing biological systems require increasingly more sophisticated computational approaches. Putting all these data in simple-to-use databases is a first step; but realizing the full potential of the data requires algorithms that automatically extract regularities from the data, which can then lead to biological insight. Many of the problems in computational biology are in the form of prediction: starting from prediction of a gene's structure, prediction of its function, interactions, and role in disease. Support vector machines (SVMs) and related kernel methods are extremely good at solving such problems [1]–[3]. SVMs are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data [2], [4]–[6]. The simplest form of a prediction problem is binary classification: trying to discriminate between objects that belong to one of two categories: positive (+1) or negative (−1). SVMs use two key concepts to solve this problem: large margin separation and kernel functions. The idea of large margin separation can be motivated by classification of points in two dimensions (see Figure 1). A simple way to classify the points is to draw a straight line and call points lying on one side positive and on the other side negative. If the two sets are well separated, one would intuitively draw the separating line such that it is as far as possible away from the points in both sets (see Figures 2 and 3). This intuitive choice captures the idea of large margin separation, which is mathematically formulated in the section Classification with Large Margin. Figure 1: A linear classifier separating two classes of points (squares and circles) in two dimensions. The decision boundary divides the space into two sets depending on the sign of f(x) = 〈w,x〉+b. The grayscale level represents the value of the discriminant function f(x): dark for low values and a light shade for high values.
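The discriminant f(x) = 〈w,x〉 + b in the Figure 1 caption is short enough to spell out directly; a minimal sketch with arbitrary illustrative weights:

```python
import numpy as np

# Linear classifier: f(x) = <w, x> + b; the sign of f(x) gives the class.
w = np.array([1.0, -2.0])   # illustrative weight vector
b = 0.5                     # illustrative bias

def predict(X):
    f = X @ w + b
    return np.where(f >= 0, 1, -1), f

X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
labels, scores = predict(X)
print(labels)   # predicted classes (+1 / -1)
print(scores)   # the discriminant values f(x) behind the grayscale levels
```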

660 citations


Proceedings ArticleDOI
05 Jul 2008
TL;DR: Empirical evaluation on a range of NLP tasks shows that the confidence-weighted linear classifiers introduced here improve over other state-of-the-art online and batch methods, learn faster in the online setting, and lend themselves to better classifier combination after parallel training.
Abstract: We introduce confidence-weighted linear classifiers, which add parameter confidence information to linear classifiers. Online learners in this setting update both classifier parameters and the estimate of their confidence. The particular online algorithms we study here maintain a Gaussian distribution over parameter vectors and update the mean and covariance of the distribution with each instance. Empirical evaluation on a range of NLP tasks shows that our algorithm improves over other state-of-the-art online and batch methods, learns faster in the online setting, and lends itself to better classifier combination after parallel training.
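The paper derives a specific closed-form update for the mean and covariance; the sketch below only shows the general shape of such second-order online updates, using a diagonal covariance and the later AROW-style rule rather than the exact confidence-weighted solution (the hinge trigger and the parameter r are illustrative assumptions):

```python
import numpy as np

def second_order_online(X, y, r=1.0):
    """Maintain a Gaussian over weight vectors (diagonal covariance) and
    update both mean and variances per instance. AROW-style simplification,
    not the paper's exact CW update."""
    n, d = X.shape
    mu = np.zeros(d)       # mean of the weight distribution
    sigma = np.ones(d)     # per-weight variances
    for x, label in zip(X, y):
        margin = label * (mu @ x)
        v = np.sum(sigma * x * x)        # variance of the margin
        if margin < 1.0:                 # update on insufficient margin
            beta = 1.0 / (v + r)
            alpha = (1.0 - margin) * beta
            mu += alpha * label * sigma * x   # confident weights move less
            sigma -= beta * (sigma * x) ** 2  # confidence only increases
    return mu, sigma
```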

433 citations


Journal ArticleDOI
TL;DR: This work considers the problem of binary classification where the classifier can, for a particular cost, choose not to classify an observation and proposes a certain convex loss function φ, analogous to the hinge loss used in support vector machines (SVMs).
Abstract: We consider the problem of binary classification where the classifier can, for a particular cost, choose not to classify an observation. Just as in the conventional classification problem, minimization of the sample average of the cost is a difficult optimization problem. As an alternative, we propose the optimization of a certain convex loss function φ, analogous to the hinge loss used in support vector machines (SVMs). Its convexity ensures that the sample average of this surrogate loss can be efficiently minimized. We study its statistical properties. We show that minimizing the expected surrogate loss—the φ-risk—also minimizes the risk. We also study the rate at which the φ-risk approaches its minimum value. We show that fast rates are possible when the conditional probability P(Y=1|X) is unlikely to be close to certain critical values.
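One convex surrogate of this flavor is a "double hinge" that is steeper than the ordinary hinge on the negative side, with slope tied to the rejection cost d, combined with a rule that abstains when |f(x)| is small. The parameterization below is an illustrative assumption, not necessarily the exact φ of the paper:

```python
import numpy as np

def double_hinge(z, d=0.25):
    # phi(z) for z = y * f(x): slope (1 - d) / d for z < 0, the usual hinge
    # slope on [0, 1], and zero beyond the margin.
    a = (1.0 - d) / d
    return np.maximum.reduce([np.zeros_like(z), 1.0 - z, 1.0 - a * z])

def decide(f, delta=0.5):
    # Classify by the sign of f, but abstain (0) inside the band |f| < delta.
    return np.where(np.abs(f) < delta, 0, np.sign(f))

z = np.array([-1.0, 0.0, 0.5, 2.0])
print(double_hinge(z))   # surrogate losses
print(decide(z))         # -1 / 0 (reject) / +1 decisions
```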

409 citations


Book
Jürgen Schürmann
02 May 2008
TL;DR: This book covers classification based on statistical models determined by first- and second-order statistical moments, as well as classification based on mean-square functional approximations.
Abstract: Statistical Decision Theory. Need for Approximations: Fundamental Approaches. Classification Based on Statistical Models Determined by First- and Second-Order Statistical Moments. Classification Based on Mean-Square Functional Approximations. Polynomial Regression. Multilayer Perceptron Regression. Radial Basis Functions. Measurements, Features, and Feature Selection. Reject Criteria and Classifier Performance. Combining Classifiers. Conclusion. STATMOD Program: Description of ftp Package. References. Index.

383 citations


Journal ArticleDOI
TL;DR: It is concluded that SLR provides a robust method for fMRI decoding and can also serve as a stand-alone tool for voxel selection, by exploiting correlated noise among voxels to allow for better pattern separation.

372 citations


Journal ArticleDOI
01 Sep 2008
TL;DR: Experimental results indicate that the classification accuracy rates of the proposed approach exceed those of grid search and other approaches, and the SA-SVM is thus useful for parameter determination and feature selection in the SVM.
Abstract: The support vector machine (SVM) is a novel pattern classification method that is valuable in many applications. Kernel parameter setting in the SVM training process, along with feature selection, significantly affects classification accuracy. The objective of this study is to obtain better parameter values while also finding a subset of features that does not degrade the SVM classification accuracy. This study develops a simulated annealing (SA) approach for parameter determination and feature selection in the SVM, termed SA-SVM. To evaluate the proposed SA-SVM approach, several datasets from the UCI machine learning repository are adopted to calculate the classification accuracy rate. The proposed approach was compared with grid search, a conventional method of parameter setting, and various other methods. Experimental results indicate that the classification accuracy rates of the proposed approach exceed those of grid search and the other approaches. The SA-SVM is thus useful for parameter determination and feature selection in the SVM.
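A compact sketch of the SA loop over the kernel parameters and a feature mask (the neighbourhood moves, cooling schedule, and wine dataset are illustrative choices, not the paper's settings):

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)
d = X.shape[1]

def fitness(logC, logg, mask):
    if not mask.any():
        return 0.0
    clf = SVC(C=10.0 ** logC, gamma=10.0 ** logg)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

state = [0.0, -2.0, rng.random(d) < 0.8]   # log10(C), log10(gamma), mask
score, T = fitness(*state), 1.0
for step in range(200):
    logC = state[0] + rng.normal(scale=0.3)    # perturb kernel parameters
    logg = state[1] + rng.normal(scale=0.3)
    mask = state[2].copy()
    flip = rng.integers(d)                     # flip one feature bit
    mask[flip] = ~mask[flip]
    cand = fitness(logC, logg, mask)
    # Always accept improvements; accept worse states with prob. exp(dE / T).
    if cand > score or rng.random() < np.exp((cand - score) / T):
        state, score = [logC, logg, mask], cand
    T *= 0.98                                  # geometric cooling

print("best CV accuracy:", round(score, 3))
```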

334 citations


Journal ArticleDOI
TL;DR: A multi-purpose image classifier that can be applied to a wide variety of image classification tasks without modifications or fine-tuning, and yet provide classification accuracy comparable to state-of-the-art task-specific image classifiers.

280 citations


Journal ArticleDOI
TL;DR: This research involves the study and implementation of a new pattern recognition technique introduced within the framework of statistical learning theory called Support Vector Machines (SVMs), and its application to remote‐sensing image classification.
Abstract: Land use classification is an important part of many remote sensing applications. A lot of research has gone into the application of statistical and neural network classifiers to remote-sensing images. This research involves the study and implementation of a new pattern recognition technique introduced within the framework of statistical learning theory, called Support Vector Machines (SVMs), and its application to remote-sensing image classification. Standard classifiers such as the Artificial Neural Network (ANN) need a number of training samples that increases exponentially with the dimension of the input feature space. With a limited number of training samples, the classification rate thus decreases as the dimensionality increases. SVMs are independent of the dimensionality of the feature space, as the main idea behind this classification technique is to separate the classes with a surface that maximizes the margin between them, using boundary pixels to create the decision surface. Results from SVMs are compared with the traditional Maximum Likelihood Classification (MLC) and an ANN classifier. The findings suggest that the ANN and SVM classifiers perform better than the traditional MLC. The SVM and the ANN show comparable results. However, accuracy is dependent on factors such as the number of hidden nodes (in the case of ANN) and kernel parameters (in the case of SVM). The training time taken by the SVM is several orders of magnitude less.
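For reference, the MLC baseline mentioned here is simply a per-class Gaussian maximum-likelihood rule; a minimal sketch, assuming rows of X are pixel feature vectors (a small regularization term is added to the covariance for stability):

```python
import numpy as np

def mlc_fit(X, y):
    """Maximum Likelihood Classification: fit one Gaussian per class."""
    stats = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        stats[c] = (mu, np.linalg.inv(cov), np.linalg.slogdet(cov)[1])
    return stats

def mlc_predict(stats, X):
    scores = []
    for mu, prec, logdet in stats.values():
        diff = X - mu
        # Gaussian log-likelihood up to a constant shared by all classes.
        scores.append(-0.5 * (logdet + np.einsum("ij,jk,ik->i", diff, prec, diff)))
    return np.array(list(stats))[np.argmax(scores, axis=0)]
```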

276 citations


Journal Article
TL;DR: A novel coordinate descent algorithm for training linear SVMs with the L2-loss function is proposed that is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.
Abstract: Linear support vector machines (SVM) are useful for classifying large-scale sparse data. Problems with sparse features are common in applications such as document classification and natural language processing. In this paper, we propose a novel coordinate descent algorithm for training linear SVM with the L2-loss function. At each step, the proposed method minimizes a one-variable sub-problem while fixing other variables. The sub-problem is solved by Newton steps with a line search technique. The procedure converges globally at a linear rate. As each sub-problem involves only values of a corresponding feature, the proposed approach is suitable when accessing a feature is more convenient than accessing an instance. Experiments show that our method is more efficient and stable than state-of-the-art methods such as Pegasos and TRON.
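A dense, unoptimized sketch of the primal coordinate descent idea (the paper exploits sparsity and uses a more careful line search; the tolerances here are illustrative):

```python
import numpy as np

def l2svm_coordinate_descent(X, y, C=1.0, sweeps=30):
    """min_w 0.5*||w||^2 + C * sum_i max(0, 1 - y_i w.x_i)^2,
    one Newton step per coordinate with simple backtracking."""
    n, p = X.shape
    w = np.zeros(p)
    b = 1.0 - y * (X @ w)              # b_i = 1 - y_i w.x_i, kept up to date

    def obj(wv, bv):
        return 0.5 * wv @ wv + C * np.sum(np.maximum(0.0, bv) ** 2)

    for _ in range(sweeps):
        for j in range(p):
            act = b > 0                # instances with positive loss
            g = w[j] - 2 * C * np.sum(y[act] * X[act, j] * b[act])
            h = 1.0 + 2 * C * np.sum(X[act, j] ** 2)
            d = -g / h                 # one-variable Newton direction
            f0, lam = obj(w, b), 1.0
            while lam > 1e-4:          # halve the step until f decreases
                w_try = w.copy(); w_try[j] += lam * d
                b_try = b - lam * d * y * X[:, j]
                if obj(w_try, b_try) <= f0:
                    w, b = w_try, b_try
                    break
                lam *= 0.5
    return w
```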

257 citations


Journal ArticleDOI
TL;DR: This article provides a review of several recently developed penalized feature selection and classification techniques--which belong to the family of embedded feature selection methods--for bioinformatics studies with high-dimensional input.
Abstract: In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifier and to provide more insights into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classification techniques—which belong to the family of embedded feature selection methods—for bioinformatics studies with high-dimensional input. Classification objective functions, penalty functions and computational algorithms are discussed. Our goal is to make interested researchers aware of these feature selection and classification methods that are applicable to high-dimensional bioinformatics data.
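As a concrete instance of an embedded method, an l1 penalty on logistic regression drives coefficients exactly to zero, so selection happens during fitting; a minimal sketch with illustrative synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 500 features, of which only 20 are informative.
X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                           random_state=0)
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
selected = np.flatnonzero(clf.coef_[0])   # nonzero coefficients = kept features
print(len(selected), "features kept out of", X.shape[1])
```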

Journal ArticleDOI
TL;DR: This paper presents a survey on the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems, and focuses on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.
Abstract: Several real problems involve the classification of data into categories or classes. Given a data set containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques are originally conceived for the solution of problems with only two classes, also named binary classification problems. However, many problems require the discrimination of examples into more than two categories or classes. This paper presents a survey on the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.
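One-versus-all, the simplest such decomposition, is easy to spell out by hand; a minimal sketch (iris and LinearSVC are illustrative stand-ins):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One binary subtask per class: class c versus the rest.
classes = np.unique(y_tr)
models = [LinearSVC().fit(X_tr, (y_tr == c).astype(int)) for c in classes]

# Combine the binary outputs: pick the class whose model is most confident.
scores = np.column_stack([m.decision_function(X_te) for m in models])
pred = classes[np.argmax(scores, axis=1)]
print("accuracy:", (pred == y_te).mean())
```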

Journal ArticleDOI
TL;DR: It is concluded that the classifier, using resting-state brain function as classification feature, has potential ability to improve current diagnosis and treatment evaluation of ADHD.

Journal ArticleDOI
TL;DR: Biological interpretation of the genes selected by this conceptually simple but computer-intensive approach to pre-selection of informative features for supervised classification showed that several of them are involved in precursors to different types of leukemia and lymphoma, rather than being genes common to several forms of cancer, as is the case for the other methods.
Abstract: Motivation: Pre-selection of informative features for supervised classification is a crucial, albeit delicate, task. It is desirable that feature selection provides the features that contribute most to the classification task per se and which should therefore be used by any classifier later used to produce classification rules. In this article, a conceptually simple but computer-intensive approach to this task is proposed. The reliability of the approach rests on multiple construction of a tree classifier for many training sets randomly chosen from the original sample set, where samples in each training set consist of only a fraction of all of the observed features. Results: The resulting ranking of features may then be used to advantage for classification via a classifier of any type. The approach was validated using Golub et al. leukemia data and the Alizadeh et al. lymphoma data. Not surprisingly, we obtained a significantly different list of genes. Biological interpretation of the genes selected by our method showed that several of them are involved in precursors to different types of leukemia and lymphoma rather than being genes that are common to several forms of cancers, which is the case for the other methods. Availability: Prototype available upon request. Contact: jan.komorowski@lcb.uu.se
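A conceptual sketch of the Monte-Carlo idea: many small trees, each trained on a random sample and a random fraction of features, with credit accumulated for the features each tree actually uses (subset sizes, depth, and scoring are illustrative; the paper's procedure differs in detail):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in; the paper uses microarray gene-expression data.
X, y = make_classification(n_samples=200, n_features=300, n_informative=15,
                           random_state=0)
rng = np.random.default_rng(0)
votes = np.zeros(X.shape[1])

for _ in range(500):
    feats = rng.choice(X.shape[1], size=30, replace=False)      # feature fraction
    rows = rng.choice(len(y), size=len(y) // 2, replace=False)  # training set
    tree = DecisionTreeClassifier(max_depth=3).fit(X[rows][:, feats], y[rows])
    votes[feats] += tree.feature_importances_   # credit the features used

ranking = np.argsort(votes)[::-1]
print("top 10 candidate features:", ranking[:10])
```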


Journal ArticleDOI
TL;DR: In the most challenging RW settings, HCT uses an unconventionally low threshold, which keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance.
Abstract: In important application fields today, genomics and proteomics among them, selecting a small subset of useful features is crucial for the success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, ..., p, let π_i denote the two-sided P-value associated with the ith feature Z-score and π_(i) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p − π_(i)) / sqrt((i/p)(1 − i/p)). We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT.
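The HC objective from the abstract translates directly into code; a minimal sketch that, as is customary for HC, restricts the search to the smaller half of the P-values (the paper couples this threshold with classifier construction):

```python
import numpy as np
from scipy.stats import norm

def hc_threshold(z_scores):
    """Return the |z| cutoff maximizing (i/p - pi_(i)) / sqrt((i/p)(1 - i/p))
    over the sorted two-sided P-values of the feature z-scores."""
    z = np.asarray(z_scores)
    p = len(z)
    pvals = 2 * norm.sf(np.abs(z))       # two-sided P-values
    order = np.argsort(pvals)
    half = p // 2                        # search only i <= p/2
    i = np.arange(1, half + 1)
    frac = i / p
    hc = (frac - pvals[order][:half]) / np.sqrt(frac * (1 - frac))
    k = np.argmax(hc)
    return np.abs(z[order][k])           # keep features with |z| >= threshold
```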

15 Sep 2008
TL;DR: In this article, the authors investigate the relationship between several attribute space reduction techniques and the resulting classification accuracy for two very different application areas: e-mail filtering and drug discovery.
Abstract: Dimensionality reduction and feature subset selection are two techniques for reducing the attribute space of a feature set, which is an important component of both supervised and unsupervised classification or regression problems. While in feature subset selection a subset of the original attributes is extracted, dimensionality reduction in general produces linear combinations of the original attribute set. In this paper we investigate the relationship between several attribute space reduction techniques and the resulting classification accuracy for two very different application areas. On the one hand, we consider e-mail filtering, where the feature space contains various properties of e-mail messages, and on the other hand, we consider drug discovery problems, where quantitative representations of molecular structures are encoded in terms of information-preserving descriptor values. Subsets of the original attributes constructed by filter and wrapper techniques as well as subsets of linear combinations of the original attributes constructed by three different variants of principal component analysis (PCA) are compared in terms of the classification performance achieved with various machine learning algorithms as well as in terms of runtime performance. We successively reduce the size of the attribute sets and investigate the changes in the classification results. Moreover, we explore the relationship between the variance captured in the linear combinations within PCA and the resulting classification accuracy. The results show that the classification accuracy based on PCA is highly sensitive to the type of data and that the variance captured by the principal components is not necessarily a vital indicator of classification performance.
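In miniature, the subset-selection versus linear-combination contrast looks like this (synthetic data, a univariate filter, and logistic regression are illustrative stand-ins for the paper's filter/wrapper variants and PCA flavours):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=100, n_informative=15,
                           random_state=0)
for k in (5, 10, 20, 40):   # successively larger reduced attribute sets
    pca = make_pipeline(PCA(n_components=k), LogisticRegression(max_iter=1000))
    sel = make_pipeline(SelectKBest(f_classif, k=k),
                        LogisticRegression(max_iter=1000))
    print(k, "PCA: %.3f" % cross_val_score(pca, X, y, cv=3).mean(),
          "subset: %.3f" % cross_val_score(sel, X, y, cv=3).mean())
```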

Proceedings ArticleDOI
24 Aug 2008
TL;DR: Two variations of a novel two-step approach to automatic record pair classification are presented: the first is based on a nearest-neighbour classifier, while the second improves an SVM classifier by iteratively adding more examples into the training sets.
Abstract: The task of linking databases is an important step in an increasing number of data mining projects, because linked data can contain information that is not available otherwise, or that would require time-consuming and expensive collection of specific data. The aim of linking is to match and aggregate all records that refer to the same entity. One of the major challenges when linking large databases is the efficient and accurate classification of record pairs into matches and non-matches. While traditionally classification was based on manually-set thresholds or on statistical procedures, many of the more recently developed classification methods are based on supervised learning techniques. They therefore require training data, which is often not available in real world situations or has to be prepared manually, an expensive, cumbersome and time-consuming process. The author has previously presented a novel two-step approach to automatic record pair classification [6, 7]. In the first step of this approach, training examples of high quality are automatically selected from the compared record pairs, and used in the second step to train a support vector machine (SVM) classifier. Initial experiments showed the feasibility of the approach, achieving results that outperformed k-means clustering. In this paper, two variations of this approach are presented. The first is based on a nearest-neighbour classifier, while the second improves an SVM classifier by iteratively adding more examples into the training sets. Experimental results show that this two-step approach can achieve better classification results than other unsupervised approaches.
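A sketch of the two-step idea on record-pair similarity vectors, in the iterative-SVM variant: seed the training set with the clearest matches and non-matches, then repeatedly retrain and absorb the most confident predictions (the seeding heuristic and sizes are illustrative assumptions, not the paper's exact selection rules):

```python
import numpy as np
from sklearn.svm import SVC

def two_step_classify(sim, n_iters=5, seed_frac=0.05):
    """sim: one row per compared record pair; higher values = more similar."""
    total = sim.sum(axis=1)
    n_seed = max(1, int(seed_frac * len(sim)))
    pos = np.argsort(total)[-n_seed:]      # most similar pairs -> matches
    neg = np.argsort(total)[:n_seed]       # least similar -> non-matches
    idx = np.concatenate([pos, neg])
    lab = np.r_[np.ones(n_seed), np.zeros(n_seed)]
    clf = SVC(kernel="linear")
    for _ in range(n_iters):
        clf.fit(sim[idx], lab)
        conf = np.abs(clf.decision_function(sim))
        conf[idx] = -np.inf                # do not re-add selected pairs
        new = np.argsort(conf)[-n_seed:]   # most confident remaining pairs
        idx = np.concatenate([idx, new])
        lab = np.concatenate([lab, clf.predict(sim[new])])
    return clf
```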

Journal ArticleDOI
TL;DR: A comparative analysis of SVC with the Maximum Likelihood Classification (MLC) method, which is the most popular conventional supervised classification technique, illustrated that SVC improved the classification accuracy, was robust and did not suffer from dimensionality issues such as the Hughes Effect.
Abstract: Accurate thematic classification is one of the most commonly desired outputs from remote sensing images. Recent research efforts to improve the reliability and accuracy of image classification have led to the introduction of the Support Vector Classification (SVC) scheme. SVC is a new generation of supervised learning method based on the principle of statistical learning theory, which is designed to decrease uncertainty in the model structure and the fitness of data. We have presented a comparative analysis of SVC with the Maximum Likelihood Classification (MLC) method, which is the most popular conventional supervised classification technique. SVC is an optimization technique in which the classification accuracy heavily relies on identifying the optimal parameters. Using a case study, we verify a method to obtain these optimal parameters such that SVC can be applied efficiently. We use multispectral and hyperspectral images to develop thematic classes of known lithologic units in order to compare the classification accuracy of both the methods. We have varied the training to testing data proportions to assess the relative robustness and the optimal training sample requirement of both the methods to achieve comparable levels of accuracy. The results of our study illustrated that SVC improved the classification accuracy, was robust and did not suffer from dimensionality issues such as the Hughes Effect.
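The robustness experiment, varying the training share and watching accuracy, is easy to reproduce in miniature (digits and a default RBF SVC stand in for the imagery and the tuned classifier):

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
for train_size in (0.5, 0.2, 0.1, 0.05):   # shrink the training proportion
    accs = []
    for seed in range(5):                   # average over random splits
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_size, random_state=seed, stratify=y)
        accs.append(SVC().fit(X_tr, y_tr).score(X_te, y_te))
    print(f"train share {train_size:.2f}: accuracy {np.mean(accs):.3f}")
```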

Book ChapterDOI
01 Jan 2008
TL;DR: The objective of this study was to evaluate SVMs for their effectiveness and prospects for object-based image analysis as a modern computational intelligence method; the SVM methodology seems very promising for Object-Based Image Analysis.
Abstract: The Support Vector Machine is a theoretically superior machine learning methodology with great results in pattern recognition. It is especially suited to supervised classification of high-dimensional datasets and has been found competitive with the best machine learning algorithms. In the past, SVMs were tested and evaluated only as pixel-based image classifiers. During recent years, advances in Remote Sensing occurred in the field of Object-Based Image Analysis (OBIA), which combines low-level and high-level computer vision techniques. Moving from pixel-based techniques towards object-based representation, the dimensionality of the remote sensing imagery feature space increases significantly. This results in increased complexity of the classification process and causes problems for traditional classification schemes. The objective of this study was to evaluate SVMs for their effectiveness and prospects for object-based image analysis as a modern computational intelligence method. Here, an SVM approach for multi-class classification was followed, based on primitive image objects provided by a multi-resolution segmentation algorithm. Then, a feature selection step took place in order to provide the features for classification, which involved spectral, texture and shape information. After the feature selection step, a module that integrated an SVM classifier and the segmentation algorithm was developed in C++. For training the SVM, sample image objects derived from the segmentation procedure were used. The proposed classification procedure was then applied, resulting in the final object classification. The classification results were compared to those of the Nearest Neighbor object-based classifier and were found satisfactory. The SVM methodology seems very promising for Object-Based Image Analysis, and future work will focus on integrating SVM classifiers with rule-based classifiers.
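The object-based step that changes the data representation is the aggregation of pixels into per-object feature vectors; a minimal sketch assuming an `image` band array and a `segments` label map from the segmentation algorithm (spectral mean/std plus size stand in for the fuller spectral/texture/shape feature set):

```python
import numpy as np
from sklearn.svm import SVC

def object_features(image, segments):
    """image: (H, W, B) array of B bands; segments: (H, W) integer object ids.
    Returns one feature vector per image object."""
    ids = np.unique(segments)
    feats = []
    for sid in ids:
        mask = segments == sid
        pix = image[mask]                                 # (n_pixels, B)
        feats.append(np.r_[pix.mean(axis=0), pix.std(axis=0), mask.sum()])
    return ids, np.array(feats)

# Usage sketch (train_idx and labels come from the sample image objects):
# ids, F = object_features(image, segments)
# clf = SVC().fit(F[train_idx], labels[train_idx])
# predicted = clf.predict(F)
```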

Journal ArticleDOI
TL;DR: A relatively new intelligent fault detection and classification method, W-SVM, is established and applied to induction motors for fault classification based on transient current signals; the results show high classification accuracy.
Abstract: This paper presents an intelligent system for fault detection and classification of induction motors using wavelet support vector machines (W-SVM). The support vector machine (SVM) is well known as an intelligent classifier with strong generalization ability. Nonlinear SVMs using kernel functions are widely used for multi-class classification procedures. In this paper, a kernel function built from wavelets is introduced and applied to an SVM multi-class classifier. Moreover, the feature vectors for training the classification routine are obtained from transient current signals preprocessed by the discrete wavelet transform. In this work, principal component analysis (PCA) and kernel PCA are performed to reduce the dimension of the features and to extract useful features for the classification process. Hence, a relatively new intelligent fault detection and classification method called W-SVM is established. This method is applied to induction motors for fault classification based on transient current signals. The results show that the classification achieves high accuracy in experimental work.
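A standard construction from the wavelet-kernel literature builds the kernel as a product of a mother wavelet over dimensions; the sketch below uses the common Morlet-style choice h(t) = cos(1.75t)·exp(−t²/2) (the paper's exact kernel and scale a may differ):

```python
import numpy as np
from sklearn.svm import SVC

def wavelet_kernel(A, B, a=1.0):
    """Gram matrix K(x, z) = prod_i h((x_i - z_i) / a),
    with mother wavelet h(t) = cos(1.75 t) * exp(-t^2 / 2)."""
    t = (A[:, None, :] - B[None, :, :]) / a
    return np.prod(np.cos(1.75 * t) * np.exp(-0.5 * t ** 2), axis=2)

# scikit-learn accepts a callable kernel returning the Gram matrix:
# clf = SVC(kernel=wavelet_kernel).fit(X_train, y_train)
```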

Proceedings ArticleDOI
31 Oct 2008
TL;DR: The proposed wrapper feature selection method GA-SVM can optimize feature subsets and SVM kernel parameters at the same time, and can therefore be applied to feature selection for hyperspectral data.
Abstract: The high-dimensional feature vectors of hyperspectral data often impose a high computational cost as well as the risk of "overfitting" when classification is performed. Therefore it is necessary to reduce the dimensionality through ways like feature selection. Currently, there are two kinds of feature selection methods: filter methods and wrapper methods. The former kind requires no feedback from classifiers and estimates the classification performance indirectly. The latter kind evaluates the "goodness" of the selected feature subset directly, based on the classification accuracy. Many experimental results have proved that the wrapper methods can yield better performance, although they have the disadvantage of high computational cost. In this paper, we present a Genetic Algorithm (GA) based wrapper method for classification of hyperspectral data using the Support Vector Machine (SVM), a state-of-the-art classifier that has found success in a variety of areas. The genetic algorithm, which seeks to solve optimization problems using the methods of evolution, specifically survival of the fittest, was used to optimize both the feature subset, i.e., the band subset, of the hyperspectral data and the SVM kernel parameters simultaneously. A special strategy was adopted to reduce the computation cost caused by the high-dimensional feature vectors of hyperspectral data when the feature-subset part of the chromosome was designed. The GA-SVM method was implemented in the ENVI/IDL language and then tested by applying it to a HYPERION hyperspectral image. Comparison of the optimized and un-optimized results showed that the GA-SVM method could significantly reduce the computation cost while improving the classification accuracy. The number of bands used for classification was reduced from 198 to 13, while the classification accuracy increased from 88.81% to 92.51%. The optimized values of the two SVM kernel parameters were 95.0297 and 0.2021, respectively, which differ from the default values used in the ENVI software. In conclusion, the proposed wrapper feature selection method GA-SVM can optimize feature subsets and SVM kernel parameters at the same time, and can therefore be applied to feature selection for hyperspectral data.
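A compact sketch of the GA-SVM wrapper (the population size, rates, and the wine dataset standing in for hyperspectral bands are all illustrative choices, not the paper's settings):

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = load_wine(return_X_y=True)
n_bands = X.shape[1]

def fitness(chrom):
    mask = chrom[:n_bands].astype(bool)       # band-subset part
    if not mask.any():
        return 0.0
    clf = SVC(C=10.0 ** chrom[-2], gamma=10.0 ** chrom[-1])  # kernel part
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def make_chrom():
    return np.r_[rng.random(n_bands) < 0.5, rng.uniform(-2, 3), rng.uniform(-4, 1)]

pop = [make_chrom() for _ in range(20)]
for gen in range(15):
    fits = np.array([fitness(c) for c in pop])
    children = []
    for _ in range(len(pop)):
        picks = [rng.integers(len(pop), size=2) for _ in range(2)]
        a, b = (pop[i] if fits[i] >= fits[j] else pop[j] for i, j in picks)
        child = np.where(rng.random(a.shape) < 0.5, a, b)   # uniform crossover
        flip = rng.random(n_bands) < 0.05                   # mutate band bits
        child[:n_bands] = np.where(flip, 1 - child[:n_bands], child[:n_bands])
        child[-2:] += rng.normal(scale=0.2, size=2)         # mutate parameters
        children.append(child)
    pop = children

best = max(pop, key=fitness)
print("bands kept:", int(best[:n_bands].sum()), "CV accuracy:", fitness(best))
```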

Journal ArticleDOI
TL;DR: The proposed risk-sensitive loss functions minimize both the approximation and estimation errors; results indicate the superior performance of the neural classifier using the proposed loss functions in terms of both overall and per-class classification accuracy.

Journal ArticleDOI
TL;DR: The results indicate that the class-dependent feature subsets found by the proposed weight method can effectively remove irrelevant or redundant features, while maintaining or improving (sometimes substantially) the classification accuracy, in comparison with other feature selection methods.
Abstract: In this paper, we argue that for a C-class classification problem, C two-class classifiers, each of which discriminates one class from the other classes and has a characteristic input feature subset, should in general outperform, or at least match the performance of, a C-class classifier with one single input feature subset. For each class, we select a desirable feature subset, which leads to the lowest classification error rate for this class using a classifier for a given feature subset search algorithm. To fairly compare all models, we propose a weight method for the class-dependent classifier, i.e., assigning a weight to each model's output before the comparison is carried out. The method's performance is evaluated on two artificial data sets and several real-world benchmark data sets, with the support vector machine (SVM) as the classifier, and with RELIEF, class separability, and minimal-redundancy-maximal-relevancy (mRMR) as attribute importance measures. Our results indicate that the class-dependent feature subsets found by our approach can effectively remove irrelevant or redundant features, while maintaining or improving (sometimes substantially) the classification accuracy, in comparison with other feature selection methods.
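A minimal sketch of the class-dependent idea, with a univariate filter standing in for the paper's RELIEF/class-separability/mRMR measures and uniform weights for the output combination (both are illustrative simplifications):

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
classes = np.unique(y_tr)

# One binary classifier per class, each with its own feature subset.
models = []
for c in classes:
    yc = (y_tr == c).astype(int)
    sel = SelectKBest(f_classif, k=5).fit(X_tr, yc)   # class-specific subset
    clf = SVC(kernel="linear").fit(sel.transform(X_tr), yc)
    models.append((sel, clf))

# Weight each model's output before comparing, then take the argmax.
weights = np.ones(len(classes))                        # illustrative weights
scores = np.column_stack([w * clf.decision_function(sel.transform(X_te))
                          for w, (sel, clf) in zip(weights, models)])
pred = classes[np.argmax(scores, axis=1)]
print("accuracy:", (pred == y_te).mean())
```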

Proceedings ArticleDOI
27 May 2008
TL;DR: An approach that performs EEG feature extraction during imagined right and left hand movements by using power spectral entropy (PSE) achieves good classification results with a time-variable linear classifier and provides a promising method for an on-line BCI system.
Abstract: Brain-Computer Interfaces (BCI) use electroencephalography (EEG) signals recorded from the scalp to create a new communication channel between the brain and an output device, bypassing conventional motor output pathways of nerves and muscles. One of the most important components of a BCI is feature extraction from EEG signals. How to rapidly and reliably extract EEG features that express the brain states of different mental tasks is crucial for accurate classification. This paper presents an approach that performs EEG feature extraction during imagined right and left hand movements by using power spectral entropy (PSE). It achieves good classification results with a time-variable linear classifier; the maximum accuracy reaches 90%. The results show that the PSE is a sensitive parameter for EEG of imagined hand movements. The method is simple and fast, and it provides a promising basis for an on-line BCI system.
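Power spectral entropy itself is a two-line computation: normalize the power spectrum into a probability distribution and take its Shannon entropy. A minimal sketch (windowing and band selection omitted for brevity):

```python
import numpy as np

def power_spectral_entropy(signal):
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    p = spectrum / spectrum.sum()    # normalize the power spectrum
    p = p[p > 0]                     # drop zero bins to avoid log(0)
    return -np.sum(p * np.log(p))    # Shannon entropy of the spectrum

# A single rhythm concentrates power (low PSE); noise spreads it (high PSE).
t = np.linspace(0, 1, 256, endpoint=False)
print(power_spectral_entropy(np.sin(2 * np.pi * 10 * t)))                  # low
print(power_spectral_entropy(np.random.default_rng(0).normal(size=256)))  # high
```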

Journal ArticleDOI
TL;DR: A novel strategy to model multiclass classification problems using subclass information in the ECOC framework is presented and it is shown that the proposed splitting procedure yields a better performance when the class overlap or the distribution of the training objects conceal the decision boundaries for the base classifier.
Abstract: A common way to model multiclass classification problems is by means of Error-Correcting Output Codes (ECOCs). Given a multiclass problem, the ECOC technique designs a code word for each class, where each position of the code identifies the membership of the class for a given binary problem. A classification decision is obtained by assigning the label of the class with the closest code. One of the main requirements of the ECOC design is that the base classifier is capable of splitting each subgroup of classes from each binary problem. However, we cannot guarantee that a linear classifier can model convex regions. Furthermore, nonlinear classifiers also fail to manage some types of surfaces. In this paper, we present a novel strategy to model multiclass classification problems using subclass information in the ECOC framework. Complex problems are solved by splitting the original set of classes into subclasses and embedding the binary problems in a problem-dependent ECOC design. Experimental results show that the proposed splitting procedure yields a better performance when the class overlap or the distribution of the training objects conceal the decision boundaries for the base classifier. The results are even more significant when one has a sufficiently large training size.
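The basic ECOC encode/train/decode cycle that the subclass strategy builds on fits in a few lines; a minimal sketch with a one-versus-all code matrix (real designs, and the paper's subclass extension, use richer problem-dependent codes):

```python
import numpy as np
from sklearn.svm import LinearSVC

def ecoc_fit(X, y, code):
    """Train one binary learner per code-word position.
    code: (n_classes, n_bits) matrix with +1/-1 entries."""
    return [LinearSVC().fit(X, np.where(code[y, b] > 0, 1, -1))
            for b in range(code.shape[1])]

def ecoc_predict(models, code, X):
    # Decode: assign the class whose code word is closest in Hamming distance.
    bits = np.sign(np.column_stack([m.decision_function(X) for m in models]))
    dists = (bits[:, None, :] != code[None, :, :]).sum(axis=2)
    return np.argmin(dists, axis=1)

code = np.array([[ 1, -1, -1],     # illustrative one-vs-all code, 3 classes
                 [-1,  1, -1],
                 [-1, -1,  1]])
```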

Journal ArticleDOI
TL;DR: In this article, a robust feature selection method using the zero-norm l0 in the context of support vector machines (SVMs) is proposed; the resulting DC algorithm has finite convergence and requires solving one linear program at each iteration.
Abstract: Feature selection consists of choosing a subset of available features that capture the relevant properties of the data. In supervised pattern classification, a good choice of features is fundamental for building compact and accurate classifiers. In this paper, we develop an efficient feature selection method using the zero-norm l0 in the context of support vector machines (SVMs). The discontinuity of l0 at the origin makes the corresponding optimization problem difficult to solve. To overcome this drawback, we use a robust DC (difference of convex functions) programming approach, a general framework for non-convex continuous optimisation. We consider an appropriate continuous approximation to l0 such that the resulting problem can be formulated as a DC program. Our DC algorithm (DCA) has finite convergence and requires solving one linear program at each iteration. Computational experiments on standard datasets, including challenging feature-selection problems from the NIPS 2003 feature selection challenge and gene selection for cancer classification, show that the proposed method is promising: while it suppresses, in some cases, more than 99% of the features, it can still provide good classification. Moreover, the comparative results illustrate the superiority of the proposed approach over standard methods such as classical SVMs and feature selection via concave minimization.
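The DC idea can be sketched as iterative reweighting: approximate ||w||_0 by sum_j (1 − exp(−α|w_j|)); each DCA step then minimizes the loss plus a weighted l1 term. The paper solves a linear program per iteration; the loose sketch below instead rescales features and reuses an l1-penalized linear SVM, so treat it as an approximation of the scheme, not the authors' algorithm:

```python
import numpy as np
from sklearn.svm import LinearSVC

def dc_zero_norm_svm(X, y, alpha=5.0, n_iters=5, C=1.0):
    p = X.shape[1]
    c = np.ones(p)                     # per-feature l1 weights
    for _ in range(n_iters):
        # Dividing column j by c_j turns the weighted l1 penalty
        # sum_j c_j |w_j| into a standard l1 penalty on u = c * w.
        Xs = X / c
        clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                        C=C).fit(Xs, y)
        w = clf.coef_.ravel() / c      # map back to the original weights
        c = alpha * np.exp(-alpha * np.abs(w)) + 1e-8   # DCA reweighting
    return w                           # zero entries = suppressed features
```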

Journal ArticleDOI
TL;DR: A linear classifier is developed, based on robust features extracted from normalized power spectra and autocorrelation functions, as well as novel features from the collapsed average, which characterize transient and periodic properties of the signal envelope.
Abstract: This work presents a non-invasive high-throughput system for automatically detecting characteristic behaviours in mice over extended periods of time, useful for phenotyping experiments. The system classifies time intervals on the order of 2 to 4 seconds as corresponding to motions consistent with either active wake or inactivity associated with sleep. A single Polyvinylidene Difluoride (PVDF) sensor on the cage floor generates signals from the pressure produced by motion. This paper develops a linear classifier based on robust features extracted from normalized power spectra and autocorrelation functions, as well as novel features from the collapsed average (the autocorrelation of the complex spectrum), which characterize transient and periodic properties of the signal envelope. Performance is analyzed through an experiment comparing results from direct human observation and classification of the different behaviours with an automatic classifier used in conjunction with this system. Experimental results from over 28.5 hours of data from 4 mice indicate a 94% classification rate relative to the human observations. Examples of sequential classifications (2-second increments) over transition regions between sleep and wake behaviour are also presented, demonstrating robustness to signal variation and explaining performance limitations.

Journal ArticleDOI
TL;DR: A rule-based classifier derived from an improved genetic algorithm approach is proposed to determine knowledge rules for land-cover classification automatically from remote sensing image datasets; preliminary results indicate that the proposed GA rule-based approach for land-cover classification is promising.
Abstract: Classification of land-cover information using remotely-sensed imagery is a challenging topic due to the complexity of landscapes and the spatial and spectral resolution of the images being used. Early studies of land-cover classification used statistical methods such as the maximum likelihood classifier. Recently, however, numerous studies have applied artificial intelligence techniques, for example expert systems, artificial neural networks and support vector machines, as alternatives for remotely-sensed image classification. A major drawback of these models is that the user cannot readily interpret the final rules. In this paper, a rule-based classifier derived from an improved genetic algorithm approach is proposed to determine knowledge rules for land-cover classification automatically from remote sensing image datasets. The proposed algorithm is demonstrated on two image dataset classification problems. Results are compared to other approaches in the literature. The preliminary results indicate that the proposed GA rule-based approach for land-cover classification is promising.

Journal ArticleDOI
TL;DR: A linearization algorithm is proposed that solves a succession of fast linear programs, converging in a few iterations to a local solution that is competitive with considerably more complex integer programming and other formulations.
Abstract: The multiple instance classification problem (Dietterich et al., Artif. Intell. 89:31–71, [1998]; Auer, Proceedings of 14th International Conference on Machine Learning, pp. 21–29, Morgan Kaufmann, San Mateo, [1997]; Long et al., Mach. Learn. 30(1):7–22, [1998]) is formulated using a linear or nonlinear kernel as the minimization of a linear function in a finite-dimensional (noninteger) real space subject to linear and bilinear constraints. A linearization algorithm is proposed that solves a succession of fast linear programs that converges in a few iterations to a local solution. Computational results on a number of datasets indicate that the proposed algorithm is competitive with the considerably more complex integer programming and other formulations. A distinguishing aspect of our linear classifier not shared by other multiple instance classifiers is the sparse number of features it utilizes. In some tasks, the reduction amounts to less than one percent of the original features.