
Showing papers on "Statistical learning theory" published in 2015


Book ChapterDOI
01 Jan 2015
TL;DR: This chapter focuses on SVM for supervised classification tasks only, providing SVM formulations for when the input space is linearly separable or linearly nonseparable and when the data are unbalanced, along with examples.
Abstract: This chapter covers details of the support vector machine (SVM) technique, a sparse kernel decision machine that avoids computing posterior probabilities when building its learning model. SVM offers a principled approach to problems because of its mathematical foundation in statistical learning theory. SVM constructs its solution in terms of a subset of the training input. SVM has been extensively used for classification, regression, novelty detection tasks, and feature reduction. This chapter focuses on SVM for supervised classification tasks only, providing SVM formulations for when the input space is linearly separable or linearly nonseparable and when the data are unbalanced, along with examples. The chapter also presents recent improvements to and extensions of the original SVM formulation. A case study concludes the chapter.

221 citations
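To make the chapter's three scenarios concrete, here is a minimal, hypothetical scikit-learn sketch (the dataset, parameter values, and class weighting are illustrative assumptions, not taken from the chapter): a linear SVM for linearly separable input, an RBF-kernel SVM for nonseparable input, and class weighting for unbalanced data.

```python
# Hypothetical sketch: linearly separable, nonseparable, and unbalanced cases.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, deliberately unbalanced data (90% / 10% class split).
X, y = make_classification(n_samples=500, n_features=10, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)              # separable case
rbf_svm = SVC(kernel="rbf", C=10.0, gamma="scale").fit(X_tr, y_tr)    # nonseparable case
balanced_svm = SVC(kernel="rbf", class_weight="balanced").fit(X_tr, y_tr)  # unbalanced data

for name, clf in [("linear", linear_svm), ("rbf", rbf_svm), ("balanced", balanced_svm)]:
    print(name, clf.score(X_te, y_te), "support vectors per class:", clf.n_support_)
```

In practice the kernel and the penalty parameter C are typically chosen by cross-validation.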


Journal Article
TL;DR: In this article, an extensive analysis of the behavior of majority votes in binary classification is presented, where the authors introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement.
Abstract: We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis intends to be self-contained and can be used as introductory material to PAC-Bayesian statistical learning theory. It starts from a general PAC-Bayesian perspective and ends with uncommon PAC-Bayesian bounds. Some of these bounds contain no Kullback-Leibler divergence and others allow kernel functions to be used as voters (via the sample compression setting). Finally, out of the analysis, we propose the MinCq learning algorithm that basically minimizes the C-bound. MinCq reduces to a simple quadratic program. Aside from being theoretically grounded, MinCq achieves state-of-the-art performance, as shown in our extensive empirical comparison with both AdaBoost and the Support Vector Machine.

107 citations
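For reference, one common statement of the C-bound discussed above is the following. With voters $h$ weighted by a distribution $Q$ and margin $M_Q(x,y) = \mathbf{E}_{h \sim Q}[y\,h(x)]$ for $y, h(x) \in \{-1,+1\}$, if $\mathbf{E}[M_Q] > 0$ then the risk of the majority vote $B_Q$ satisfies

$$ R(B_Q) \;\le\; 1 - \frac{\big(\mathbf{E}[M_Q]\big)^2}{\mathbf{E}[M_Q^2]}, $$

where the first moment of the margin reflects the average quality of the voters (the Gibbs risk) and the second moment reflects their average disagreement.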


Journal ArticleDOI
TL;DR: The focus of this paper is the connection between the regression model associated with the correntropy-induced loss and the least squares regression model, together with its convergence properties.
Abstract: Within the statistical learning framework, this paper studies the regression model associated with the correntropy induced losses. The correntropy, as a similarity measure, has been frequently employed in signal processing and pattern recognition. Motivated by its empirical successes, this paper aims at presenting some theoretical understanding towards the maximum correntropy criterion in regression problems. Our focus in this paper is two-fold: first, we are concerned with the connections between the regression model associated with the correntropy induced loss and the least squares regression model. Second, we study its convergence property. A learning theory analysis which is centered around the above two aspects is conducted. From our analysis, we see that the scale parameter in the loss function balances the convergence rates of the regression model and its robustness. We then make some efforts to sketch a general view on robust loss functions when being applied into the learning for regression problems. Numerical experiments are also implemented to verify the effectiveness of the model.

102 citations
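For context, the correntropy-induced loss referred to above is usually written (up to scaling conventions) as

$$ \ell_\sigma\big(y, f(x)\big) \;=\; \sigma^2\Big(1 - \exp\big(-\tfrac{(y - f(x))^2}{\sigma^2}\big)\Big), $$

which behaves like the least squares loss $(y - f(x))^2$ for small residuals and saturates at $\sigma^2$ for large ones; this is the sense in which the scale parameter $\sigma$ trades convergence rate against robustness.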


Journal ArticleDOI
TL;DR: The central condition enables a direct proof of fast rates, and its equivalence to the Bernstein condition is proved; the latter is itself a generalization of the Tsybakov margin condition, and both have played a central role in obtaining fast rates in statistical learning.
Abstract: The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning -- a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most of these conditions are special cases of a single, unifying condition, that comes in two forms: the central condition for 'proper' learning algorithms that always output a hypothesis in the given model, and stochastic mixability for online algorithms that may make predictions outside of the model. We show that under surprisingly weak assumptions both conditions are, in a certain sense, equivalent. The central condition has a re-interpretation in terms of convexity of a set of pseudoprobabilities, linking it to density estimation under misspecification. For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning. Yet, while the Bernstein condition is two-sided, the central condition is one-sided, making it more suitable to deal with unbounded losses. In its stochastic mixability form, our condition generalizes both a stochastic exp-concavity condition identified by Juditsky, Rigollet and Tsybakov and Vovk's notion of mixability. Our unifying conditions thus provide a substantial step towards a characterization of fast rates in statistical learning, similar to how classical mixability characterizes constant regret in the sequential prediction with expert advice setting.

75 citations
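As a pointer for readers, the two conditions compared in the abstract are commonly stated as follows (notation simplified; see the paper for the precise definitions). The $\eta$-central condition requires a comparator $f^*$ such that, for every $f$ in the model, $\mathbf{E}\big[\exp\big(\eta\,(\ell_{f^*}(Z) - \ell_f(Z))\big)\big] \le 1$; the Bernstein condition with exponent $\beta \in (0,1]$ requires $\mathbf{E}\big[(\ell_f(Z) - \ell_{f^*}(Z))^2\big] \le B\,\big(\mathbf{E}[\ell_f(Z) - \ell_{f^*}(Z)]\big)^{\beta}$ for some constant $B$. The one-sided exponential moment in the first condition is what makes it better suited to unbounded losses.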


Journal ArticleDOI
TL;DR: A comparison among SVR, a neural network, and four well-known empirical correlations demonstrated that the SVR model outperformed the other methods; the model was successfully applied to one of the carbonate reservoir rocks of Iran's gas fields.

72 citations


Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper presents a boundary detection technique for retaining potential support vectors; it improves generalization ability and minimizes both the empirical risk and the confidence interval when the statistical sample size is small.
Abstract: The support vector machine (SVM) is a machine learning method developed in the mid-1990s on the basis of statistical learning theory, and the SVM classifier is currently one of the most popular classifiers. This paper presents a boundary detection technique for retaining potential support vectors. By pursuing structural risk minimization, the technique improves generalization ability, minimizes both the empirical risk and the confidence interval when the statistical sample size is small, and still extracts reliable statistical regularities.

70 citations


Posted Content
TL;DR: An extensive analysis of the behavior of majority votes in binary classification is proposed and a risk bound for majority votes, called the C-bound, is introduced that takes into account the average quality of the voters and their average disagreement.
Abstract: We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis intends to be self-contained and can be used as introductory material to PAC-Bayesian statistical learning theory. It starts from a general PAC-Bayesian perspective and ends with uncommon PAC-Bayesian bounds. Some of these bounds contain no Kullback-Leibler divergence and others allow kernel functions to be used as voters (via the sample compression setting). Finally, out of the analysis, we propose the MinCq learning algorithm that basically minimizes the C-bound. MinCq reduces to a simple quadratic program. Aside from being theoretically grounded, MinCq achieves state-of-the-art performance, as shown in our extensive empirical comparison with both AdaBoost and the Support Vector Machine.

68 citations


Journal ArticleDOI
TL;DR: In this paper, support vector machines (SVMs), based on statistical learning theory (SLT) and the principles of structural risk minimization (SRM) and empirical risk minimization (ERM), are used to provide an analytical approach to classification and regression.

63 citations
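For readers new to SRM, one commonly quoted form of the Vapnik generalization bound that motivates it is: with probability at least $1-\delta$, for every function $f$ in a class of VC dimension $h$ trained on $n$ samples,

$$ R(f) \;\le\; R_{\mathrm{emp}}(f) + \sqrt{\frac{h\big(\ln(2n/h) + 1\big) + \ln(4/\delta)}{n}}. $$

ERM minimizes only the first (empirical) term, while SRM chooses the function class so as to balance both terms.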


Journal Article
Steve Hanneke, Liu Yang
TL;DR: In this paper, the authors established distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models, revealing a number of surprising facts.
Abstract: This work establishes distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models. The results reveal a number of surprising facts. In particular, under the noise model of Tsybakov (2004), the minimax label complexity of active learning with a VC class is always asymptotically smaller than that of passive learning, and is typically significantly smaller than the best previously-published upper bounds in the active learning literature. In high-noise regimes, it turns out that all active learning problems of a given VC dimension have roughly the same minimax label complexity, which contrasts with well-known results for bounded noise. In low-noise regimes, we find that the label complexity is well-characterized by a simple combinatorial complexity measure we call the star number. Interestingly, we find that almost all of the complexity measures previously explored in the active learning literature have worst-case values exactly equal to the star number. We also propose new active learning strategies that nearly achieve these minimax label complexities.

57 citations
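For reference, the star number mentioned above is, as we understand the definition, the largest integer $s$ for which there exist distinct points $x_1, \dots, x_s$ and hypotheses $h_0, h_1, \dots, h_s$ in the class such that each $h_i$ (for $i \ge 1$) disagrees with $h_0$ on $x_i$ and agrees with $h_0$ on every other $x_j$ (with $s = \infty$ if no largest such integer exists).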


Posted Content
TL;DR: In this article, the authors studied the learnability of linear separators in the presence of bounded (a.k.a. Massart) noise and provided the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta$.
Abstract: We study the learnability of linear separators in $\Re^d$ in the presence of bounded (a.k.a. Massart) noise. This is a realistic generalization of the random classification noise model, where the adversary can flip each example $x$ with probability $\eta(x) \leq \eta$. We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit ball in $\Re^d$, for some constant value of $\eta$. While widely studied in the statistical learning theory community in the context of getting faster convergence rates, computationally efficient algorithms in this model had remained elusive. Our work provides the first evidence that one can indeed design algorithms achieving arbitrarily small excess error in polynomial time under this realistic noise model and thus opens up a new and exciting line of research. We additionally provide lower bounds showing that popular algorithms such as hinge loss minimization and averaging cannot lead to arbitrarily small excess error under Massart noise, even under the uniform distribution. Our work, instead, makes use of a margin-based technique developed in the context of active learning. As a result, our algorithm is also an active learning algorithm with label complexity that is only logarithmic in the desired excess error $\epsilon$.

47 citations


Journal ArticleDOI
TL;DR: The work presented here examines the feasibility of applying SVMs to high angle-of-attack unsteady aerodynamic modeling and concludes that the predictions of the least squares SVM models are in good agreement with the test data, which indicates the satisfactory learning and generalization performance of LS-SVMs.

Journal Article
TL;DR: In this article, the authors consider the problem of sequential prediction and provide tools to study the minimax value of the associated game and provide necessary and sufficient conditions for online learnability in the setting of supervised learning.
Abstract: We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game. Classical statistical learning theory provides several useful complexity measures to study learning with i.i.d. data. Our proposed sequential complexities can be seen as extensions of these measures to the sequential setting. The developed theory is shown to yield precise learning guarantees for the problem of sequential prediction. In particular, we show necessary and sufficient conditions for online learnability in the setting of supervised learning. Several examples show the utility of our framework: we can establish learnability without having to exhibit an explicit online learning algorithm.
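The central sequential complexity in this line of work is the sequential Rademacher complexity, which (in one common formulation) replaces the i.i.d. sample of the classical definition with an $\mathcal{X}$-valued binary tree $\mathbf{x}$ of depth $n$:

$$ \mathfrak{R}_n(\mathcal{F}) \;=\; \sup_{\mathbf{x}} \, \mathbf{E}_{\epsilon}\Big[\sup_{f \in \mathcal{F}} \frac{1}{n}\sum_{t=1}^{n} \epsilon_t\, f\big(\mathbf{x}_t(\epsilon_{1:t-1})\big)\Big], $$

where $\epsilon_1, \dots, \epsilon_n$ are independent Rademacher signs and $\mathbf{x}_t$ depends only on the first $t-1$ signs.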

Journal ArticleDOI
TL;DR: The proposed formulation leads to two smaller-sized unconstrained minimization problems whose objective functions are piecewise differentiable; their solution reduces to solving just two systems of linear equations, as opposed to the two quadratic programming problems of TWSVM and TBSVM, which leads to an extremely simple and fast algorithm.
Abstract: In this paper, a new unconstrained minimization problem formulation is proposed for linear programming twin support vector machine (TWSVM) classifiers. The proposed formulation leads to two smaller-sized unconstrained minimization problems whose objective functions are piecewise differentiable. However, since these objective functions contain the non-smooth "plus" function, two smoothing approaches are adopted to solve the proposed formulation, after which the Newton-Armijo algorithm is applied. The idea of our formulation is to reformulate TWSVM as a strongly convex problem by incorporating regularization techniques and then derive a smooth 1-norm linear programming formulation for TWSVM to improve robustness. One significant advantage of our proposed algorithm over TWSVM is that the structural risk minimization principle is implemented in the primal problems, which embodies the essence of statistical learning theory. In addition, the solution of the two modified unconstrained minimization problems reduces to solving just two systems of linear equations, as opposed to the two quadratic programming problems of TWSVM and TBSVM, which leads to an extremely simple and fast algorithm. Our approach has the advantage that a pair of matrix equations whose order equals the number of input examples is solved at each iteration of the algorithm. The algorithm converges from any starting point and can be easily implemented in MATLAB without using any optimization packages. The performance of our proposed method is verified experimentally on several benchmark and synthetic datasets. Experimental results show the effectiveness of our method in both training time and classification accuracy.
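The non-smooth ingredient mentioned above is the plus function $(x)_+ = \max(x, 0)$. One smoothing approximation widely used in the smooth-SVM literature (offered here only as an illustration; the paper's two smoothing approaches may differ) is

$$ p(x, \alpha) \;=\; x + \frac{1}{\alpha}\,\ln\big(1 + e^{-\alpha x}\big), $$

which is infinitely differentiable and converges to $(x)_+$ as the smoothing parameter $\alpha \to \infty$, so that Newton-type methods such as Newton-Armijo can be applied to the resulting objective.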

Journal ArticleDOI
TL;DR: A new Local Rademacher Complexity risk bound on the generalization ability of a model is derived; the bound is able to take advantage of the availability of unlabeled samples and improves on state-of-the-art results even when no unlabeled samples are available.

Proceedings Article
12 Mar 2015
TL;DR: In this article, the authors studied the learnability of linear separators in $\Re^d$ in the presence of bounded (a.k.a. Massart) noise, and provided a polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit sphere in $\Re^d$, for some constant value of $\eta$.
Abstract: We study the learnability of linear separators in $\Re^d$ in the presence of bounded (a.k.a Massart) noise. This is a realistic generalization of the random classification noise model, where the adversary can flip each example $x$ with probability $\eta(x) \leq \eta$. We provide the first polynomial time algorithm that can learn linear separators to arbitrarily small excess error in this noise model under the uniform distribution over the unit sphere in $\Re^d$, for some constant value of $\eta$. While widely studied in the statistical learning theory community in the context of getting faster convergence rates, computationally efficient algorithms in this model had remained elusive. Our work provides the first evidence that one can indeed design algorithms achieving arbitrarily small excess error in polynomial time under this realistic noise model and thus opens up a new and exciting line of research. We additionally provide lower bounds showing that popular algorithms such as hinge loss minimization and averaging cannot lead to arbitrarily small excess error under Massart noise, even under the uniform distribution. Our work, instead, makes use of a margin based technique developed in the context of active learning. As a result, our algorithm is also an active learning algorithm with label complexity that is only logarithmic in the desired excess error $\epsilon$.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed implicit Lagrangian twin support vector machine (TWSVM) classifiers yield significantly better performance than existing methods in both training time and classification accuracy.
Abstract: In this paper, we propose implicit Lagrangian twin support vector machine (TWSVM) classifiers obtained by formulating a pair of unconstrained minimization problems (UMPs) in dual variables whose solutions are found using a finite Newton method. The advantage of the generalized Hessian approach for our modified UMPs is that their solution reduces to solving just two systems of linear equations, as opposed to the two quadratic programming problems of TWSVM and TBSVM, which leads to an extremely simple and fast algorithm. Unlike the classical TWSVM and least squares TWSVM (LSTWSVM), the structural risk minimization principle is implemented by adding a regularization term to the primal problems of our proposed algorithm. This embodies the essence of statistical learning theory. Computational comparisons of our proposed method against GEPSVM, TWSVM, STWSVM and LSTWSVM have been made on both synthetic and well-known real-world benchmark datasets. Experimental results show that our method yields significantly better performance in both training time and classification accuracy.

Journal ArticleDOI
TL;DR: The feasibility of the SVM in analyzing gap acceptance is examined by comparing its results with existing statistical methods and it is found to be comparable with that of the BLM in all cases and better in a few.
Abstract: Gap acceptance predictions provide very important inputs for performance evaluation and safety analysis of uncontrolled intersections and pedestrian midblock crossings. The focus of this paper is on the application of support vector machines (SVMs) in understanding and classifying gaps at these facilities. The SVMs are supervised learning techniques originating from statistical learning theory and are widely used for classification and regression. In this paper, the feasibility of the SVM in analyzing gap acceptance is examined by comparing its results with existing statistical methods. To accomplish that objective, SVM and binary logit models (BLMs) were developed and compared by using data collected at three types of uncontrolled intersections. SVM performance was found to be comparable with that of the BLM in all cases and better in a few. Also, the categorical statistics and skill scores used for validating gap acceptance data revealed that the SVM performed reasonably well. Thus, the SVM technique can be used to classify and predict accepted and rejected gap values according to speed and distance of oncoming vehicles. This technique can be used in advance safety warning systems for vehicles and pedestrians waiting to cross major stream vehicles.
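As an illustration of the kind of comparison described above, here is a hypothetical scikit-learn sketch; the synthetic features (oncoming-vehicle speed and distance), the acceptance rule, and all parameter values are assumptions for demonstration only and are not the paper's field data.

```python
# Hypothetical gap-acceptance classification: SVM vs. binary logit model (BLM).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 400
speed = rng.uniform(20, 60, n)        # oncoming-vehicle speed (km/h), assumed range
distance = rng.uniform(10, 120, n)    # distance of oncoming vehicle (m), assumed range
# Synthetic rule: larger time gaps (distance / speed) are more likely to be accepted.
accepted = ((distance / speed + rng.normal(0, 0.4, n)) > 1.5).astype(int)
X = np.column_stack([speed, distance])

svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
blm = make_pipeline(StandardScaler(), LogisticRegression())  # binary logit model

print("SVM accuracy:", cross_val_score(svm, X, accepted, cv=5).mean())
print("BLM accuracy:", cross_val_score(blm, X, accepted, cv=5).mean())
```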

Proceedings Article
07 Dec 2015
TL;DR: This paper proves that algorithmic stability in the inference process is equivalent to uniform generalization across all parametric loss functions, and establishes a relationship between algorithmic Stability and the size of the observation space, which provides a formal justification for dimensionality reduction methods.
Abstract: One of the central questions in statistical learning theory is to determine the conditions under which agents can learn from experience. This includes the necessary and sufficient conditions for generalization from a given finite training set to new observations. In this paper, we prove that algorithmic stability in the inference process is equivalent to uniform generalization across all parametric loss functions. We provide various interpretations of this result. For instance, a relationship is proved between stability and data processing, which reveals that algorithmic stability can be improved by post-processing the inferred hypothesis or by augmenting training examples with artificial noise prior to learning. In addition, we establish a relationship between algorithmic stability and the size of the observation space, which provides a formal justification for dimensionality reduction methods. Finally, we connect algorithmic stability to the size of the hypothesis space, which recovers the classical PAC result that the size (complexity) of the hypothesis space should be controlled in order to improve algorithmic stability and improve generalization.
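For orientation, the classical notion of uniform algorithmic stability (due to Bousquet and Elisseeff, and possibly stricter than the stability notion used in this paper) requires that for any two training sets $S$ and $S^{i}$ differing in a single example, $\sup_z \big|\ell(A_S, z) - \ell(A_{S^{i}}, z)\big| \le \beta$ for some small $\beta$; results of the kind summarized above relate such stability parameters to generalization guarantees.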

Journal ArticleDOI
TL;DR: This work establishes distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models.
Abstract: This work establishes distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models. The results reveal a number of surprising facts.

Wang, Qing; Qian, Weiqi; He, Kai-Feng
01 Jan 2015
TL;DR: In this paper, the feasibility of applying SVMs to high angle-of-attack unsteady aerodynamic modeling is examined, and several issues associated with the use of SVMs are discussed in detail, such as the selection of input variables, the selection of output variables and the determination of SVM parameters.
Abstract: Accurate aerodynamic models are the basis of flight simulation and control law design. Mathematically modeling unsteady aerodynamics at high angles of attack bears great difficulties in model structure determination and parameter estimation due to little understanding of the flow mechanism. Support vector machines (SVMs) based on statistical learning theory provide a novel tool for nonlinear system modeling. The work presented here examines the feasibility of applying SVMs to high angle-of-attack unsteady aerodynamic modeling. Mainly, after a review of SVMs, several issues associated with unsteady aerodynamic modeling by use of SVMs are discussed in detail, such as the selection of input variables, the selection of output variables and the determination of SVM parameters. The least squares SVM (LS-SVM) models are set up from certain dynamic wind tunnel test data of a delta wing and an aircraft configuration, and then used to predict the aerodynamic responses in other tests. The predictions are in good agreement with the test data, which indicates the satisfactory learning and generalization performance of LS-SVMs.
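To show what "setting up an LS-SVM model" amounts to computationally, here is a small NumPy sketch of LS-SVM regression with an RBF kernel: training reduces to solving a single linear system. The toy one-dimensional data, the kernel width and the regularization constant are stand-ins; the paper's wind-tunnel inputs and outputs are not reproduced here.

```python
# Minimal LS-SVM regression sketch (Suykens-style dual formulation).
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    # Dual system:  [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                      # bias b, dual coefficients alpha

def lssvm_predict(X_new, X, alpha, b, sigma=1.0):
    return rbf_kernel(X_new, X, sigma) @ alpha + b

# Toy 1-D data standing in for wind-tunnel measurements.
X = np.linspace(0, 6, 40).reshape(-1, 1)
y = np.sin(X).ravel() + 0.05 * np.random.default_rng(0).normal(size=40)
b, alpha = lssvm_fit(X, y)
print("max training residual:", np.abs(lssvm_predict(X, X, alpha, b) - y).max())
```

The attraction of the LS-SVM variant is that the quadratic program of the standard SVM is replaced by a single linear solve.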

Journal ArticleDOI
TL;DR: The experimental results show that the proposed PSO-ISVR predictor can improve the computational efficiency and the overall prediction accuracy compared with the results produced by the SVR and other regression methods.
Abstract: A new global nonlinear predictor with a particle swarm-optimized interval support vector regression (PSO-ISVR) is proposed to address three issues (viz., kernel selection, model optimization, kernel method speed) encountered when applying SVR in the presence of large data sets. The novel prediction model can reduce the SVR computing overhead by dividing input space and adaptively selecting the optimized kernel functions to obtain optimal SVR parameter by PSO. To quantify the quality of the predictor, its generalization performance and execution speed are investigated based on statistical learning theory. In addition, experiments using synthetic data as well as the stock volume weighted average price are reported to demonstrate the effectiveness of the developed models. The experimental results show that the proposed PSO-ISVR predictor can improve the computational efficiency and the overall prediction accuracy compared with the results produced by the SVR and other regression methods. The proposed PSO-ISVR provides an important tool for nonlinear regression analysis of big data.

Journal ArticleDOI
TL;DR: Hybrid models of granular computing and support vector machines are a new class of machine learning algorithms based on granular computing and statistical learning theory; they effectively combine the advantages of each approach, so their performance is better than that of a single method.
Abstract: Hybrid models of granular computing and support vector machines are a new class of machine learning algorithms based on granular computing and statistical learning theory. These hybrid models can effectively exploit the advantages of each algorithm, so their performance is better than that of a single method. In view of their excellent learning performance, hybrid models of granular computing and support vector machines have become a focus of research at home and abroad. In this paper, research on these hybrid models is reviewed, including fuzzy support vector machines, rough support vector machines, quotient space support vector machines, rough fuzzy support vector machines and fuzzy rough support vector machines. First, we briefly introduce the typical granular computing models and the basic theory of support vector machines. Second, we describe the latest progress on these hybrid models in recent years. Finally, we point out the research and development prospects of the hybrid algorithms.

Journal ArticleDOI
TL;DR: The simulation results show that the improved CPSO can find the global optimum more easily and reduce the number of iterations, which makes the search for a set of optimal SVM parameters quicker and more efficient.
Abstract: This paper proposes an SVM (Support Vector Machine) parameter selection method based on CPSO (Chaotic Particle Swarm Optimization), in order to determine the optimal parameters of the support vector machine quickly and efficiently. SVMs are relatively new methods based on statistical learning theory. Training an SVM can be formulated as a quadratic programming problem, and the parameter selection of the SVM must be done before solving the QP (Quadratic Programming) problem. The PSO (Particle Swarm Optimization) algorithm is applied in the course of SVM parameter selection. Owing to the sensitivity of chaotic motion to its initial value, a chaotic map is also incorporated into the particle swarm optimization, so as to improve the global search ability of the particles. The simulation results show that the improved CPSO can find the global optimum more easily and reduce the number of iterations, which also makes the search for a set of optimal SVM parameters quicker and more efficient.
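The following sketch illustrates the general idea in Python: a plain PSO loop selects the SVM hyper-parameters (C, gamma) by cross-validation, with the particles initialised through a chaotic logistic map. It is an illustrative simplification, not the paper's exact CPSO variant; the dataset, bounds, inertia and acceleration constants are all assumptions.

```python
# Hedged sketch: PSO over (C, gamma) with chaotic (logistic-map) initialisation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)          # stand-in dataset
rng = np.random.default_rng(0)
n_particles, n_iters = 10, 15
lo, hi = np.array([0.01, 1e-4]), np.array([100.0, 1.0])   # assumed bounds for (C, gamma)

def fitness(p):
    C, gamma = p
    return cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

# Chaotic initialisation: iterate the logistic map z <- 4 z (1 - z) in [0, 1]^2,
# then rescale to the search bounds.
z = rng.uniform(0.1, 0.9, size=(n_particles, 2))
for _ in range(20):
    z = 4.0 * z * (1.0 - z)
pos = lo + z * (hi - lo)
vel = np.zeros_like(pos)

pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmax()]

for _ in range(n_iters):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([fitness(p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()]

print("best (C, gamma):", gbest, "CV accuracy:", pbest_val.max())
```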

Journal ArticleDOI
TL;DR: This work proposes a method to design a Human Activity Recognition algorithm, which takes into account the fact that only limited resources are available for its execution, and restricts the hypothesis space of possible recognition models by applying some advanced concepts from Statistical Learning Theory.

Journal ArticleDOI
TL;DR: Empirical results indicate that the proposed ensemble of SVMs based on the Choquet integral for financial distress prediction has higher average accuracy and stability than single SVM classifiers.
Abstract: Due to the radical change in both the Chinese and the global economic environment, it is essential to develop a practical model to predict financial distress. The support vector machine (SVM), an outstanding learning machine based on statistical learning theory that embodies the structural risk minimization principle rather than the empirical risk minimization principle, is a promising method for such financial distress prediction. However, to some extent, the performance of a single classifier depends on the sample's pattern characteristics, and each single classifier has its own uncertainty. Using ensemble methods to predict financial distress has become a rising trend in this field. This research puts forward an SVM ensemble based on the Choquet integral for financial distress prediction, in which the Bagging algorithm is used to generate new training sets. The proposed ensemble method can be expressed as "Choquet + Bagging + SVMs". With real data from Chinese listed companies, an experiment is carried out to compare the performance of single classifiers with the proposed ensemble method. Empirical results indicate that the proposed ensemble of SVMs based on the Choquet integral for financial distress prediction has higher average accuracy and stability than single SVM classifiers.
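A minimal sketch of the "Bagging + SVMs" part of the proposal is given below; the Choquet-integral fusion step is not reproduced (ordinary voting over the bagged SVMs is used instead), and the synthetic unbalanced dataset merely stands in for the Chinese listed-company data.

```python
# Sketch of a bagged-SVM ensemble vs. a single SVM (Choquet fusion omitted).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# "Distressed" firms as the minority class in a synthetic stand-in dataset.
X, y = make_classification(n_samples=600, n_features=20, weights=[0.8, 0.2],
                           random_state=0)

single_svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
# Note: `estimator=` is named `base_estimator=` in scikit-learn versions before 1.2.
svm_bagging = BaggingClassifier(estimator=make_pipeline(StandardScaler(), SVC(kernel="rbf")),
                                n_estimators=15, max_samples=0.8, random_state=0)

print("single SVM :", cross_val_score(single_svm, X, y, cv=5).mean())
print("bagged SVMs:", cross_val_score(svm_bagging, X, y, cv=5).mean())
```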

Journal ArticleDOI
TL;DR: The derived oscillation bound is consistent with previously reported experimental results on extreme learning machines (ELMs) and predicts that overfitting can be avoided even when the number of hidden nodes approaches infinity.

Journal ArticleDOI
TL;DR: The method of mismatched crowdsourcing is introduced, in which workers transcribe a language they do not understand, and an explicit mathematical model of second-language phoneme perception is used to learn and then compensate their transcription errors.
Abstract: Transcribers make mistakes. Workers recruited in a crowdsourcing marketplace, because of their varying levels of commitment and education, make more mistakes than workers in a controlled laboratory setting. Methods for compensating transcriber mistakes are desirable because, with such methods available, crowdsourcing has the potential to significantly increase the scale of experiments in laboratory phonology. This paper provides a brief tutorial on statistical learning theory, introducing the relationship between dataset size and estimation error, then presents a theoretical description and preliminary results for two new methods that control labeler error in laboratory phonology experiments. First, we discuss the method of crowdsourcing over error-correcting codes. In the error-correcting-code method, each difficult labeling task is first factored, by the experimenter, into the product of several easy labeling tasks (typically binary). Factoring increases the total number of tasks; nevertheless, it results in faster completion and higher accuracy, because workers unable to perform the difficult task may be able to meaningfully contribute to the solution of each easy task. Second, we discuss the use of explicit mathematical models of the errors made by a worker in the crowd. In particular, we introduce the method of mismatched crowdsourcing, in which workers transcribe a language they do not understand, and an explicit mathematical model of second-language phoneme perception is used to learn and then compensate their transcription errors. Though introduced as technologies that increase the scale of phonology experiments, both methods have implications beyond increased scale. The method of easy questions permits us to probe the perception, by untrained listeners, of complicated phonological models; examples are provided from the prosody of English and Hindi. The method of mismatched crowdsourcing permits us to probe, in more detail than ever before, the perception of phonetic categories by listeners with a different phonological system.
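To make the error-correcting-code idea concrete, here is a toy Python sketch: a four-way labeling task is factored into binary questions, one per code bit, each answered by several noisy workers, and the label is recovered by nearest-codeword decoding. The codebook, worker error rate and vote counts are invented for illustration and are not taken from the paper.

```python
# Toy "crowdsourcing over error-correcting codes" sketch.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 6-bit codebook for a 4-way labelling task (one codeword per label).
codebook = np.array([[0, 0, 0, 0, 0, 0],
                     [1, 1, 1, 0, 0, 0],
                     [0, 0, 1, 1, 1, 0],
                     [1, 1, 0, 1, 0, 1]])
true_label = 2          # the label a perfect annotator would assign
worker_error = 0.25     # assumed probability a worker answers a binary question wrongly
workers_per_bit = 5     # assumed number of workers answering each easy (binary) question

def worker_answers(bit, n_workers, p_err):
    """Each worker reports the correct bit with probability 1 - p_err."""
    correct = rng.random(n_workers) > p_err
    return np.where(correct, bit, 1 - bit)

# Majority vote over workers gives one received bit per easy question.
received = np.array([int(worker_answers(b, workers_per_bit, worker_error).mean() > 0.5)
                     for b in codebook[true_label]])

# Nearest-codeword (minimum Hamming distance) decoding recovers the hard label.
hamming = (codebook != received).sum(axis=1)
print("received bits:", received, "decoded label:", int(hamming.argmin()))
```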

Journal Article
TL;DR: In this paper, a new and improved characterization of the label complexity of disagreement-based active learning is introduced, in which the leading quantity is the version space compression set size, defined as the smallest subset of the training data that induces the same version space.
Abstract: We introduce a new and improved characterization of the label complexity of disagreement-based active learning, in which the leading quantity is the version space compression set size. This quantity is defined as the size of the smallest subset of the training data that induces the same version space. We show various applications of the new characterization, including a tight analysis of CAL and refined label complexity bounds for linear separators under mixtures of Gaussians and axis-aligned rectangles under product densities. The version space compression set size, as well as the new characterization of the label complexity, can be naturally extended to agnostic learning problems, for which we show new speedup results for two well known active learning algorithms.
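In symbols, the quantity described above can be written as $\hat{n}(S) = \min\{|S'| : S' \subseteq S,\; V(S') = V(S)\}$, where $V(S)$ denotes the version space, i.e. the set of hypotheses in the class that are consistent with the labeled sample $S$.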

Journal ArticleDOI
TL;DR: An image classification algorithm based on the Support Vector Machine is designed, and the experimental results demonstrate that its classification accuracy exceeds 95%.
Abstract: Image classification is an image processing method that distinguishes between different categories of objects according to the different features of images. It is widely used in pattern recognition and computer vision. The Support Vector Machine (SVM) is a machine learning method based on statistical learning theory; it has a rigorous mathematical foundation and is built on the structural risk minimization criterion. In this paper, we design an image classification algorithm based on SVM: the Gabor wavelet transformation is used to extract image features, and Principal Component Analysis (PCA) is used to reduce the dimension of the feature matrix. We use orange images and the LIBSVM software package in our experiments, and select the RBF kernel as the kernel function. The experimental results demonstrate that the classification accuracy of our algorithm exceeds 95%.
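A rough Python sketch of the pipeline described above is given below, using scikit-image's Gabor filter, PCA, and scikit-learn's SVC (which wraps LIBSVM). The random images, the two classes, the filter frequencies and the number of PCA components are placeholders, not the orange-image setup of the paper.

```python
# Sketch: Gabor features -> PCA -> RBF-kernel SVM on placeholder images.
import numpy as np
from skimage.filters import gabor
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
images = rng.random((60, 32, 32))            # 60 toy grey-level images (stand-ins)
labels = rng.integers(0, 2, 60)              # 2 hypothetical classes

def gabor_features(img, frequencies=(0.1, 0.2, 0.4)):
    """Mean and std of the Gabor response magnitude at a few assumed frequencies."""
    feats = []
    for f in frequencies:
        real, imag = gabor(img, frequency=f)
        mag = np.hypot(real, imag)
        feats.extend([mag.mean(), mag.std()])
    return feats

X = np.array([gabor_features(img) for img in images])
clf = make_pipeline(StandardScaler(), PCA(n_components=4), SVC(kernel="rbf"))
print("CV accuracy:", cross_val_score(clf, X, labels, cv=3).mean())
```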

Journal ArticleDOI
TL;DR: The considered recursive algorithm is robust with respect to uncertainty in the statistical characteristics of the disturbances and has an increased convergence speed in the initial iterations; a theorem on convergence of the estimated parameters with probability one is formulated.