
Showing papers on "Empirical risk minimization published in 2004"


Book ChapterDOI
TL;DR: This tutorial introduces the techniques that are used to obtain results in the form of so-called error bounds in statistical learning theory.
Abstract: The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. In particular, most results take the form of so-called error bounds. This tutorial introduces the techniques that are used to obtain such results.
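To make the form of such error bounds concrete (a standard textbook example, not a result specific to this tutorial): for a finite hypothesis class and a loss bounded in [0, 1], Hoeffding's inequality combined with a union bound gives, with probability at least 1 - delta over an i.i.d. sample of size n,

\[
R(f) \;\le\; \widehat{R}_n(f) + \sqrt{\frac{\log|\mathcal{F}| + \log(1/\delta)}{2n}}
\qquad \text{simultaneously for all } f \in \mathcal{F},
\]

where R(f) is the expected risk and \widehat{R}_n(f) the empirical risk on the n training examples.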

602 citations


BookDOI
01 Jan 2004
TL;DR: Statistical learning theory and stochastic optimization.
Abstract: Statistical learning theory and stochastic optimization.

326 citations


Journal ArticleDOI
25 Mar 2004-Nature
TL;DR: Conditions for generalization in terms of a precise stability property of the learning process are provided: when the training set is perturbed by deleting one example, the learned hypothesis does not change much.
Abstract: Developing theoretical foundations for learning is a key step towards understanding intelligence. 'Learning from examples' is a paradigm in which systems (natural or artificial) learn a functional relationship from a training set of examples. Within this paradigm, a learning algorithm is a map from the space of training sets to the hypothesis space of possible functional solutions. A central question for the theory is to determine conditions under which a learning algorithm will generalize from its finite training set to novel examples. A milestone in learning theory was a characterization of conditions on the hypothesis space that ensure generalization for the natural class of empirical risk minimization (ERM) learning algorithms that are based on minimizing the error on the training set. Here we provide conditions for generalization in terms of a precise stability property of the learning process: when the training set is perturbed by deleting one example, the learned hypothesis does not change much. This stability property stipulates conditions on the learning map rather than on the hypothesis space, subsumes the classical theory for ERM algorithms, and is applicable to more general algorithms. The surprising connection between stability and predictivity has implications for the foundations of learning theory and for the design of novel algorithms, and provides insights into problems as diverse as language learning and inverse problems in physics and engineering.
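The following sketch (our illustration, not code from the paper) makes the deletion-perturbation idea concrete: it fits a regularized least-squares hypothesis, deletes one training example at a time, and measures how much the learned weight vector moves.

```python
# Illustrative sketch (not code from the paper): measure empirically how much a
# regularized least-squares hypothesis changes when one training example is deleted.
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Regularized least squares (ridge); returns the learned weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w_full = fit_ridge(X, y)
changes = []
for i in range(n):                      # delete one example at a time
    mask = np.arange(n) != i
    w_loo = fit_ridge(X[mask], y[mask])
    changes.append(np.linalg.norm(w_full - w_loo))

print("largest leave-one-out change in the hypothesis:", max(changes))
```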

307 citations


Journal Article
Tong Zhang1
TL;DR: It is shown that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem, which can be useful for statistical inferencing tasks beyond classification.
Abstract: The purpose of this paper is to investigate statistical properties of risk minimization based multi-category classification methods. These methods can be considered as natural extensions of binary large margin classification. We establish conditions that guarantee the consistency of classifiers obtained in the risk minimization framework with respect to the classification error. Examples are provided for four specific forms of the general formulation, which extend a number of known methods. Using these examples, we show that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem. Such conditional probability information can be useful for statistical inferencing tasks beyond classification.
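As a concrete binary-classification instance of the probability-estimation property that the paper develops for the multi-category case: the population minimizer of the logistic loss encodes the conditional class probability,

\[
f^*(x) = \log\frac{P(Y=1 \mid X=x)}{P(Y=-1 \mid X=x)},
\qquad\text{so}\qquad
P(Y=1 \mid X=x) = \frac{1}{1 + e^{-f^*(x)}},
\]

so a classifier trained by minimizing this risk can be read back as a conditional probability estimate.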

283 citations


Journal ArticleDOI
TL;DR: The paper argues that robustness is an important aspect and shows that many existing machine learning methods based on the convex risk minimization principle have - besides other good properties - also the advantage of being robust.
Abstract: The paper brings together methods from two disciplines: machine learning theory and robust statistics. We argue that robustness is an important aspect and we show that many existing machine learning methods based on the convex risk minimization principle have - besides other good properties - also the advantage of being robust. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds on the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. Some results on the robustness of such methods are also obtained for the sensitivity curve and the maxbias, which are two other robustness criteria. A sensitivity analysis of the support vector machine is given.
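The robustness notion used here is the classical influence function of robust statistics: for a statistical functional T (here, the map taking a distribution P to the learned classifier) it is defined as

\[
\mathrm{IF}(z; T, P) \;=\; \lim_{\varepsilon \downarrow 0}
\frac{T\big((1-\varepsilon)P + \varepsilon\,\delta_z\big) - T(P)}{\varepsilon},
\]

where \delta_z is the point mass at z; a bounded influence function means that a single contaminating observation can move the learned classifier only by a limited amount.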

108 citations


Journal ArticleDOI
TL;DR: For the setting in which the points in the input space are generated by a purely deterministic algorithm, an extension to the noisy case is provided, which shows that the good properties of deterministic learning are preserved if the level of noise at the output is not high.
Abstract: The general problem of reconstructing an unknown function from a finite collection of samples is considered, in case the position of each input vector in the training set is not fixed beforehand but is part of the learning process. In particular, the consistency of the empirical risk minimization (ERM) principle is analyzed, when the points in the input space are generated by employing a purely deterministic algorithm (deterministic learning). When the output generation is not subject to noise, classical number-theoretic results, involving discrepancy and variation, enable the establishment of a sufficient condition for the consistency of the ERM principle. In addition, the adoption of low-discrepancy sequences enables the achievement of a learning rate of O(1/L), with L being the size of the training set. An extension to the noisy case is provided, which shows that the good properties of deterministic learning are preserved, if the level of noise at the output is not high. Simulation results confirm the validity of the proposed approach.
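A minimal sketch of the deterministic-learning setup (our illustration with an arbitrary target function and model class, not the authors' code): training inputs are taken from a low-discrepancy van der Corput sequence rather than drawn at random, and ERM with squared loss is carried out over a fixed polynomial class.

```python
# Illustrative sketch (not the authors' code): ERM on training inputs taken from a
# deterministic low-discrepancy sequence instead of random samples.
import numpy as np

def van_der_corput(n, base=2):
    """First n points of the base-b van der Corput low-discrepancy sequence in [0, 1)."""
    points = []
    for k in range(1, n + 1):
        q, denom, x = k, 1.0, 0.0
        while q > 0:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        points.append(x)
    return np.array(points)

L = 256                                   # size of the training set
x_train = van_der_corput(L)               # deterministic, well-spread inputs
y_train = np.sin(2 * np.pi * x_train)     # noise-free outputs (hypothetical target)

# ERM with squared loss over a fixed polynomial class (least-squares fit).
coeffs = np.polyfit(x_train, y_train, deg=7)
x_test = np.linspace(0.0, 1.0, 1000)
worst_err = np.max(np.abs(np.polyval(coeffs, x_test) - np.sin(2 * np.pi * x_test)))
print("worst-case error of the ERM solution on a fine grid:", worst_err)
```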

56 citations



Journal ArticleDOI
TL;DR: This paper proposes two new polynomial-time scaling algorithms for the minimization of an M-convex function, both of which run as fast as the previous minimization algorithms.

42 citations


Journal ArticleDOI
01 Feb 2004
TL;DR: A new Empirical Risk Functional is proposed as the cost function for training neuro-fuzzy classifiers; it provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied.
Abstract: The paper proposes a new Empirical Risk Functional as the cost function for training neuro-fuzzy classifiers. This cost function, called the Approximate Differentiable Empirical Risk Functional (ADERF), provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied. A learning algorithm based on the proposed ADERF is also formulated. Experimental results on a number of benchmark classification tasks are provided, and a comparison to alternative approaches is given.
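The sketch below illustrates the underlying idea with a generic sigmoid surrogate; it is not the paper's exact ADERF functional. The non-differentiable 0-1 indicator of a misclassification is replaced by a smooth approximation, so an approximate empirical error rate can be minimized by gradient-based training.

```python
# Illustrative sketch of a differentiable surrogate for the misclassification rate,
# in the spirit of (but not identical to) the paper's ADERF cost function.
import numpy as np

def smooth_error_rate(margins, beta=10.0):
    """Smooth approximation of the 0-1 error rate: sigmoid(-beta * margin) is close
    to 1 for misclassified points (negative margin) and close to 0 for confidently
    correct ones, and is differentiable in the margins (hence in the model weights)."""
    return np.mean(1.0 / (1.0 + np.exp(beta * margins)))

# margins = y_i * f(x_i); a negative margin means example i is misclassified.
margins = np.array([2.1, 0.3, -0.4, 1.5, -1.2])
print("true misclassification rate:", np.mean(margins < 0))
print("smooth approximation:       ", smooth_error_rate(margins))
```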

38 citations


ReportDOI
01 Jan 2004
TL;DR: It is proved that for ERM a certain form of well-posedness is equivalent to consistency, and that for bounded loss classes CVEEE_loo stability is a weak form of stability that represents a sufficient condition for generalization for general learning algorithms while subsuming the classical conditions for consistency of ERM.
Abstract: Solutions of learning problems by Empirical Risk Minimization (ERM) -- and almost-ERM when the minimizer does not exist -- need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of leave-one-out stability, called CVEEE_loo stability. Our main new results are two. We prove that for bounded loss classes CVEEE_loo stability is (a) sufficient for generalization, that is convergence in probability of the empirical error to the expected error, for any algorithm satisfying it, and (b) necessary and sufficient for generalization and consistency of ERM. Thus CVEEE_loo stability is a weak form of stability that represents a sufficient condition for generalization for general learning algorithms while subsuming the classical conditions for consistency of ERM. We discuss alternative forms of stability. In particular, we conclude that for ERM a certain form of well-posedness is equivalent to consistency.
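In the notation commonly used for this line of work, leave-one-out (CV_loo) stability can be stated schematically as follows (our paraphrase; constants and quantifiers are simplified): writing S for the training set, S^i for S with example z_i removed, f_S for the learned hypothesis and V for the loss,

\[
\mathbb{P}_S\Big\{ \big| V(f_S, z_i) - V(f_{S^i}, z_i) \big| \le \beta_n \Big\} \ge 1 - \delta_n,
\qquad \beta_n \to 0,\ \delta_n \to 0,
\]

i.e. with high probability the loss at a point changes only slightly when that point is deleted from the training set.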

34 citations


Book ChapterDOI
09 Jun 2004
TL;DR: A generic fusion problem is studied for multiple sensors whose outputs are probabilistically related to their inputs according to unknown distributions, and an empirical risk minimization method is described for designing fusers with distribution-free performance bounds.
Abstract: A generic fusion problem is studied for multiple sensors whose outputs are probabilistically related to their inputs according to unknown distributions. Sensor measurements are provided as iid input-output samples, and an empirical risk minimization method is described for designing fusers with distribution-free performance bounds. The special cases of isolation and projective fusers for classifiers and function estimators, respectively, are described in terms of performance bounds. The isolation fusers for classifiers are probabilistically guaranteed to perform at least as well as the best classifier. The projective fusers for function estimators are probabilistically guaranteed to perform at least as well as the best subset of estimators.

Proceedings ArticleDOI
18 Jul 2004
TL;DR: In this paper, the authors proposed two different fusion rules for estimating the target location: a maximum likelihood estimate and an empirical risk minimization method, based on binary (detection vs. no detection) information from each sensor and the model of p_d, and compared them in terms of complexity and accuracy.
Abstract: We consider the problem of target location estimation in the context of large scale, dense sensor networks. We model the probability of detection in each sensor, p_d, as a function of the distance between the sensor and the target. Based on binary (detection vs. no detection) information from each sensor and the model of p_d, we propose two different fusion rules for estimating the target location: a maximum likelihood estimate and an empirical risk minimization method. Moreover, we also consider the case where only sensors with a positive detection transmit their reading. This can be helpful to economize the power of sensor units. By employing Gaussian-like p_d models, we develop versions of both methods based on simple initialization procedures and a gradient search. We compare and discuss both algorithms in terms of complexity and accuracy.
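A minimal sketch of the maximum-likelihood variant (our illustration with hypothetical parameter values, not the paper's setup): each sensor reports a binary detection, the detection probability is a Gaussian-like function of sensor-target distance, and the target location is found by maximizing the resulting likelihood over a coarse grid.

```python
# Illustrative sketch (hypothetical parameter values, not the paper's setup):
# maximum-likelihood target localization from binary detections of a dense sensor
# field, with a Gaussian-like detection-probability model p_d(distance).
import numpy as np

rng = np.random.default_rng(1)
sensors = rng.uniform(0.0, 10.0, size=(100, 2))   # sensor positions in a 10 x 10 area
target = np.array([6.0, 3.5])                      # unknown target location

def p_d(dist, sigma=2.0):
    """Gaussian-like probability of detection as a function of sensor-target distance."""
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

d_true = np.linalg.norm(sensors - target, axis=1)
detections = rng.random(sensors.shape[0]) < p_d(d_true)   # binary readings

def neg_log_likelihood(loc):
    d = np.linalg.norm(sensors - loc, axis=1)
    p = np.clip(p_d(d), 1e-9, 1 - 1e-9)
    return -np.sum(detections * np.log(p) + (~detections) * np.log(1.0 - p))

# Coarse grid search (the paper uses an initialization procedure plus a gradient search).
grid = np.linspace(0.0, 10.0, 101)
candidates = np.array([(gx, gy) for gx in grid for gy in grid])
estimate = candidates[np.argmin([neg_log_likelihood(c) for c in candidates])]
print("true target:", target, " ML estimate:", estimate)
```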

Proceedings ArticleDOI
04 Jul 2004
TL;DR: This work proposes a novel algorithm using the framework of empirical risk minimization and marginalized kernels, and analyzes its computational and statistical properties both theoretically and empirically.
Abstract: We consider the problem of decentralized detection under constraints on the number of bits that can be transmitted by each sensor. In contrast to most previous work, in which the joint distribution of sensor observations is assumed to be known, we address the problem when only a set of empirical samples is available. We propose a novel algorithm using the framework of empirical risk minimization and marginalized kernels, and analyze its computational and statistical properties both theoretically and empirically. We provide an efficient implementation of the algorithm, and demonstrate its performance on both simulated and real data sets.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: A novel regression technique, called support vector machines (SVM), based on statistical learning theory is explored in this paper for the prediction of natural gas demands, showing that the prediction accuracy of SVM is better than that of a neural network.
Abstract: Natural gas load forecasting is a key process for the efficient operation of a pipeline network. An accurate forecast is required to guarantee balanced network operation and ensure safe gas supply at minimal cost. Machine learning techniques are finding more and more applications in the field of load forecasting. A novel regression technique, called support vector machines (SVM), based on statistical learning theory is explored in this paper for the prediction of natural gas demands. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization underlying conventional regression techniques. Least squares support vector machines (LS-SVM) is a kind of SVM with a different cost function from standard SVM. The research result shows that the prediction accuracy of SVM is better than that of a neural network. The software package NGPSLF based on LS-SVM prediction has been put into practical business application.
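As a generic illustration of this regression setup (synthetic data and hypothetical hyperparameters, not the paper's configuration), a support vector regressor can be fit to lagged load values to produce one-day-ahead demand forecasts:

```python
# Illustrative sketch (synthetic data, hypothetical hyperparameters): support vector
# regression on lagged load values for one-day-ahead natural gas demand forecasting.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
days = np.arange(730)                                          # two years of daily loads
load = 100 + 20 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 2, size=days.size)

lag = 7                                                         # previous week -> next day
X = np.array([load[i:i + lag] for i in range(len(load) - lag)])
y = load[lag:]

split = len(y) - 31                                             # hold out the last 31 days
model = SVR(kernel="rbf", C=100.0, epsilon=0.5).fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("mean absolute error on the 31-day hold-out:", np.mean(np.abs(pred - y[split:])))
```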

01 Jan 2004
TL;DR: A supervised learning algorithm for automatically acquiring high-level visual event definitions from low-level force-dynamic interpretations of video—relieving the user of the need to hand code definitions, and empirical results compare favorably to pre-existing hand-coded model reconstructors.
Abstract: We study novel learning and inference algorithms for temporal, relational data and their application to trainable video interpretation. With these algorithms we extend an existing visual-event recognition system, Leonard (Siskind 2001), in two directions. First, we develop, analyze, and evaluate a supervised learning algorithm for automatically acquiring high-level visual event definitions from low-level force-dynamic interpretations of video—relieving the user of the need to hand code definitions. We introduce a simple temporal event-description logic called AMA and give algorithms and complexity bounds for the AMA subsumption and generalization problems. A learning method is developed based on these algorithms and applied to the task of learning relational event definitions from video. Experiments show that the learned definitions are competitive with hand-coded ones. Second, we study the problem of relational sequential inference with application to inferring force-dynamic models from video data for use in event learning and recognition. We introduce two frameworks for this problem that provide different approaches to leveraging “nearly sound” logical constraints on a process. We study learning and inference in both frameworks and our empirical results compare favorably to pre-existing hand-coded model reconstructors.

01 Jan 2004
TL;DR: A novel regression technique based on the statistical learning theory, support vector machines (SVM), is investigated in this paper for natural gas short-term load forecasting and results show that SVM provides better prediction accuracy than neural network.
Abstract: Natural gas load forecasting is a key process for the efficient operation of a pipeline network. An accurate forecast is required to guarantee balanced network operation and ensure safe gas supply at minimum cost. Machine learning techniques have been increasingly applied to load forecasting. A novel regression technique based on statistical learning theory, support vector machines (SVM), is investigated in this paper for natural gas short-term load forecasting. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization in conventional regression techniques. Using a data set with two years of load values, we developed an SVM prediction model to obtain 31 days of load predictions. The results on city natural gas short-term load forecasting show that SVM provides better prediction accuracy than a neural network. The software package natural gas pipeline networks simulation and load forecasting (NGPNSLF), based on support vector regression prediction, has been developed and has also been applied in practice.

Journal Article
TL;DR: A novel regression technique, called support vector machine (SVM), based on the statistical learning theory is applied in this paper for the prediction of natural gas demands and shows that the prediction accuracy of SVM is better than that of neural network.
Abstract: Machine learning techniques are finding more and more applications in the field of load forecasting. A novel regression technique, called support vector machine (SVM), based on statistical learning theory is applied in this paper for the prediction of natural gas demands. Least squares support vector machine (LS-SVM) is a kind of SVM with a different cost function from standard SVM. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization supported by conventional regression techniques. The prediction result shows that the prediction accuracy of SVM is better than that of a neural network. Thus, SVM appears to be a very promising prediction tool. The software package NGPSLF based on SVM prediction has been put into practical business application.

Proceedings ArticleDOI
15 Jun 2004
TL;DR: A novel regression technique, called Support Vector Machines (SVM), based on the statistical learning theory is explored and the prediction result shows that prediction accuracy of SVM is better than that of neural network.
Abstract: Machine learning techniques are finding more and more applications in the field of load forecasting. In this paper a novel regression technique, called Support Vector Machines (SVM), based on statistical learning theory is explored. SVM is based on the principle of Structural Risk Minimization, as opposed to the principle of Empirical Risk Minimization supported by conventional regression techniques. The natural gas load data of Xi'an city in 2001 and 2002 are used in this study to demonstrate the forecasting capabilities of SVM. The result is compared with that of a neural network based model for 7-lead-day forecasting. The prediction result shows that the prediction accuracy of SVM is better than that of the neural network. Thus, SVM appears to be a very promising prediction tool. The software package NGPSLF based on support vector regression (SVR) has also been put into practical business application.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: In this paper, the authors discuss bounds on the risk for loss functions over fuzzy examples and then estimate the rate of convergence, within statistical learning theory, a recently developed theory for pattern recognition.
Abstract: Statistical learning theory, a recently developed theory for pattern recognition, is a small-sample statistical theory proposed by Vapnik et al., which deals mainly with statistical principles when the number of samples is limited. Bounds on the rate of convergence play an important role in statistical learning theory. We discuss bounds on the risk for loss functions over fuzzy examples and then estimate the rate of convergence.

Journal ArticleDOI
01 Jun 2004
TL;DR: An unbiased implementation of SVC is proposed by introducing a more appropriate “error counting” term, which means the number of classification errors is truly minimized, while the maximal margin solution is obtained in the separable case.
Abstract: Many learning algorithms have been used for data mining applications, including Support Vector Classifiers (SVC), which have shown improved capabilities with respect to other approaches, since they provide a natural mechanism for implementing Structural Risk Minimization (SRM), obtaining machines with good generalization properties. SVC leads to the optimal hyperplane (maximal margin) criterion for separable datasets but, in the nonseparable case, the SVC minimizes the L1 norm of the training errors plus a regularizing term, to control the machine complexity. The L1 norm is chosen because it allows the minimization to be solved with a Quadratic Programming (QP) scheme, as in the separable case. But the L1 norm is not truly an "error counting" term as the Empirical Risk Minimization (ERM) inductive principle indicates, leading therefore to a biased solution. This effect is especially severe in low-complexity machines, such as linear classifiers or machines with few nodes (neurons, kernels, basis functions). Since one of the main goals in data mining is that of explanation, these reduced architectures are of great interest because they represent the origins of other techniques such as input selection or rule extraction. Training SVMs as accurately as possible in these situations (i.e., without this bias) is, therefore, an interesting goal. We propose here an unbiased implementation of SVC by introducing a more appropriate "error counting" term. This way, the number of classification errors is truly minimized, while the maximal margin solution is obtained in the separable case. QP can no longer be used for solving the new minimization problem, and we apply instead an iterated Weighted Least Squares (WLS) procedure. This modification in the cost function of the Support Vector Machine to solve ERM has not been possible to date given the Quadratic or Linear Programming techniques commonly used, but it is now possible using the iterated WLS formulation. Computer experiments show that the proposed method is superior to the classical approach in the sense that it truly solves the ERM problem.

01 Jan 2004
TL;DR: The algorithm is tested with mammograms of clinical patients, and results show that the SVM method achieves a higher true-positive rate than an artificial neural network based on empirical risk minimization, and is valuable for application in clinical engineering.
Abstract: Support vector machine (SVM) is a new statistical learning method. Compared with classical machine learning methods, the learning discipline of SVM is to minimize the structural risk instead of the empirical risk used by classical methods, and SVM gives better generalization performance. Because the SVM algorithm is a convex quadratic optimization problem, the local optimal solution is certainly the global optimal one. In this paper, the SVM algorithm is applied to detect micro-calcifications in mammograms for the first time. The algorithm is tested with mammograms of clinical patients, and results show that the SVM method achieves a higher true-positive rate in comparison with an artificial neural network (ANN) based on empirical risk minimization, and is valuable for application in clinical engineering.

Journal ArticleDOI
TL;DR: It is shown that the proposed 2nd-order PD-type iterative learning control algorithms for linear continuous-time and linear discrete-time systems are robust in the presence of external disturbances and that the convergence accuracy can be improved.
Abstract: In this paper, 2nd-order PD-type iterative learning control algorithms are proposed for linear continuous-time systems and linear discrete-time systems. In contrast to conventional methods, the proposed learning algorithms are constructed based on both time-domain performance and iteration-domain performance. The convergence of the proposed learning algorithms is proved. Also, it is shown that the proposed method is robust in the presence of external disturbances and that the convergence accuracy can be improved. A numerical example is provided to show the effectiveness of the proposed algorithms.
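A minimal simulation sketch of the simpler first-order PD-type iterative learning control law on a hypothetical scalar discrete-time plant (the paper's algorithms are 2nd-order, i.e. they also reuse information from the two previous iterations):

```python
# Illustrative sketch (hypothetical scalar plant and gains): a first-order PD-type
# iterative learning control update; the paper's algorithms are 2nd-order and also
# reuse information from the two previous iterations.
import numpy as np

A, B, C = 0.9, 1.0, 1.0              # plant: x[t+1] = A x[t] + B u[t], y[t] = C x[t]
T, iterations = 50, 30
y_ref = np.sin(np.linspace(0.0, 2.0 * np.pi, T))
Kp, Kd = 0.5, 0.3                    # proportional and derivative learning gains

u = np.zeros(T)
max_errors = []
for k in range(iterations):
    x, y = 0.0, np.zeros(T)
    for t in range(T):               # run one trial with the current input profile
        y[t] = C * x
        x = A * x + B * u[t]
    e = y_ref - y
    max_errors.append(np.max(np.abs(e)))
    e_next = np.roll(e, -1)
    e_next[-1] = 0.0
    # PD-type update: u_{k+1}(t) = u_k(t) + Kp * e_k(t+1) + Kd * (e_k(t+1) - e_k(t))
    u = u + Kp * e_next + Kd * (e_next - e)

print("max tracking error, first iteration:", max_errors[0])
print("max tracking error, last iteration: ", max_errors[-1])
```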

Journal Article
TL;DR: Cross-validation functionals and their upper bounds are considered and new performance bounds for monotone classifiers are obtained, which are nontrivial for small data sets and do not depend on the family complexity.
Abstract: Cross-validation functionals and their upper bounds are considered that characterize the generalization performance of learning algorithms. The initial data are not assumed to be independent, identically distributed (i.i.d.) or even to be random. The effect of localization of an algorithm family is described, and the concept of a local growth function is introduced. New performance bounds for monotone classifiers are obtained, which are nontrivial for small data sets and do not depend on the family complexity.

Proceedings ArticleDOI
16 Dec 2004
TL;DR: The utility of an empirical risk minimization approach allowing for a straightforward classification treatment of the problem is demonstrated, and the empirical results are competitive with those of the most successful previously published methods.
Abstract: In this paper we study Multiple Instance Learning, a variant of the standard classification problem. We demonstrate the utility of an empirical risk minimization approach allowing for a straightforward classification treatment of the problem. In addition we consider simple data dependent hypothesis classes that allow efficient minimization of the empirical loss function and the development of bounds on the estimation error. Our empirical results are competitive with those of the most successful previously published methods.

Proceedings ArticleDOI
Fang Yong1
15 Jun 2004
TL;DR: A continuous-discrete two-dimensional learning system is established and the proposed error system describes entire dynamics involved in iterative learning including the dynamics between tracking error and variable initial state.
Abstract: In this paper, iterative learning control problem of continuous-time system with variable initial condition is considered. Based on the two-dimensionality of the overall learning system, a continuous-discrete two-dimensional (2D) learning system is established. The proposed error system describes entire dynamics involved in iterative learning including the dynamics between tracking error and variable initial state. According to the 2D system theory, a learning error estimate is given and the design method for learning matrices is also proposed. The simulation results demonstrate the performance of our proposed method.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: This work gives the key theorem for the case when the outputs are corrupted by noise, whereas classical statistical learning theory investigates the conditions for consistency of learning processes based on the empirical risk minimization induction principle only in the noise-free case.
Abstract: Statistical learning theory has investigated the conditions for consistency of the learning processes based on the empirical risk minimization induction principle. However, it deals with the unrealistic noise-free case. We give the key theorem for the case when the outputs are corrupted by noise.

Journal ArticleDOI
TL;DR: The paper studies a method for choosing a projection estimator, based on the principle of penalized empirical risk minimization, for estimating an unknown vector observed in a simple white Gaussian noise model.
Abstract: We consider the problem of estimating an unknown vector observed in a simple white Gaussian noise model. For the estimation, a family of projection estimators is used; the problem is to choose, based on observations, the best estimator within this family. The paper studies a method for choosing a projection estimator based on the principle of penalized empirical risk minimization. For this estimation method, nonasymptotic inequalities controlling its quadratic risk are given.
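A standard instance of the criterion being studied (our illustration of the general form, not the paper's specific penalty choice): among projection estimators \hat{\theta}_m onto subspaces of dimension D_m in the white-noise model y = \theta + \sigma\xi, one selects

\[
\hat{m} \;=\; \arg\min_m \Big\{ \|y - \hat{\theta}_m\|^2 + \mathrm{pen}(m) \Big\},
\qquad \text{e.g. } \mathrm{pen}(m) = 2\sigma^2 D_m \ \text{(a Mallows-type penalty)},
\]

and nonasymptotic inequalities then control the quadratic risk of the selected estimator \hat{\theta}_{\hat{m}}.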

Journal Article
TL;DR: The simulation result shows that the learning evaluation system can give a good evaluation for students' learning, furthermore, it provides teachers a powerful tool for teaching.
Abstract: Based on an analysis of the learning behavior of students, this paper puts forward a learning evaluation system that uses the support vector machine algorithm. The simulation result shows that the system can give a good evaluation of students' learning; furthermore, it provides teachers with a powerful tool for teaching.

Book
17 Jun 2004
TL;DR: This chapter discusses the Budgeted Multi-armed Bandit Problem, Reinforcement Learning for Average Reward Zero-Sum Games, and more.
Abstract: Economics and Game Theory.- Towards a Characterization of Polynomial Preference Elicitation with Value Queries in Combinatorial Auctions.- Graphical Economics.- Deterministic Calibration and Nash Equilibrium.- Reinforcement Learning for Average Reward Zero-Sum Games.- OnLine Learning.- Polynomial Time Prediction Strategy with Almost Optimal Mistake Probability.- Minimizing Regret with Label Efficient Prediction.- Regret Bounds for Hierarchical Classification with Linear-Threshold Functions.- Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary.- Inductive Inference.- Learning Classes of Probabilistic Automata.- On the Learnability of E-pattern Languages over Small Alphabets.- Replacing Limit Learners with Equally Powerful One-Shot Query Learners.- Probabilistic Models.- Concentration Bounds for Unigrams Language Model.- Inferring Mixtures of Markov Chains.- Boolean Function Learning.- PExact = Exact Learning.- Learning a Hidden Graph Using O(log n) Queries Per Edge.- Toward Attribute Efficient Learning of Decision Lists and Parities.- Empirical Processes.- Learning Over Compact Metric Spaces.- A Function Representation for Learning in Banach Spaces.- Local Complexities for Empirical Risk Minimization.- Model Selection by Bootstrap Penalization for Classification.- MDL.- Convergence of Discrete MDL for Sequential Prediction.- On the Convergence of MDL Density Estimation.- Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification.- Generalisation I.- Learning Intersections of Halfspaces with a Margin.- A General Convergence Theorem for the Decomposition Method.- Generalisation II.- Oracle Bounds and Exact Algorithm for Dyadic Classification Trees.- An Improved VC Dimension Bound for Sparse Polynomials.- A New PAC Bound for Intersection-Closed Concept Classes.- Clustering and Distributed Learning.- A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering.- Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers.- Consistency in Models for Communication Constrained Distributed Learning.- On the Convergence of Spectral Clustering on Random Samples: The Normalized Case.- Boosting.- Performance Guarantees for Regularized Maximum Entropy Density Estimation.- Learning Monotonic Linear Functions.- Boosting Based on a Smooth Margin.- Kernels and Probabilities.- Bayesian Networks and Inner Product Spaces.- An Inequality for Nearly Log-Concave Distributions with Applications to Learning.- Bayes and Tukey Meet at the Center Point.- Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results.- Kernels and Kernel Matrices.- A Statistical Mechanics Analysis of Gram Matrix Eigenvalue Spectra.- Statistical Properties of Kernel Principal Component Analysis.- Kernelizing Sorting, Permutation, and Alignment for Minimum Volume PCA.- Regularization and Semi-supervised Learning on Large Graphs.- Open Problems.- Perceptron-Like Performance for Intersections of Halfspaces.- The Optimal PAC Algorithm.- The Budgeted Multi-armed Bandit Problem.

Journal Article
TL;DR: Two new initial-state learning schemes are developed for irregular linear discrete-time systems with non-zero initial error, and simulation results show that the proposed algorithms are very effective.
Abstract: Iterative learning control techniques are widely applied to robotic systems. Most of the proposed algorithms can be only used when the initial learning error are zero. However, in practical engineering, the initial learning states will shift from the desired initial states. As a result, it is necessary to develop some new learning algorithms to deal with such situations. Two new-type initial state learning schemes are developed for irregular linear discrete-time system with non-zero initial error. Based on 2-D system theory, iterative learning control algorithms for arbitrary initial state are analyzed on 2-D notion. The proposed algorithms can be applied to practical situations without assumption of zero initial error for dynamic systems. The simulation results show that the proposed algorithms are very effective.