
Showing papers on "Empirical risk minimization published in 2004"


Book ChapterDOI
TL;DR: This tutorial introduces the techniques that are used to obtain results in the form of so-called error bounds in statistical learning theory.
Abstract: The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. In particular, most results take the form of so-called error bounds. This tutorial introduces the techniques that are used to obtain such results.
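To make the form of such error bounds concrete (a standard textbook example, not a result specific to this tutorial): for a finite hypothesis class and a loss bounded in [0, 1], Hoeffding's inequality combined with a union bound gives, with probability at least 1 - delta over an i.i.d. sample of size n,

\[
R(f) \;\le\; \widehat{R}_n(f) + \sqrt{\frac{\log|\mathcal{F}| + \log(1/\delta)}{2n}}
\qquad \text{simultaneously for all } f \in \mathcal{F},
\]

where R(f) is the expected risk and \widehat{R}_n(f) the empirical risk on the n training examples.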

602 citations


BookDOI
01 Jan 2004
TL;DR: Statistical learning theory and stochastic optimization.
Abstract: Statistical learning theory and stochastic optimization.

326 citations


Journal ArticleDOI
25 Mar 2004-Nature
TL;DR: Conditions for generalization in terms of a precise stability property of the learning process are provided: when the training set is perturbed by deleting one example, the learned hypothesis does not change much.
Abstract: Developing theoretical foundations for learning is a key step towards understanding intelligence. 'Learning from examples' is a paradigm in which systems (natural or artificial) learn a functional relationship from a training set of examples. Within this paradigm, a learning algorithm is a map from the space of training sets to the hypothesis space of possible functional solutions. A central question for the theory is to determine conditions under which a learning algorithm will generalize from its finite training set to novel examples. A milestone in learning theory was a characterization of conditions on the hypothesis space that ensure generalization for the natural class of empirical risk minimization (ERM) learning algorithms that are based on minimizing the error on the training set. Here we provide conditions for generalization in terms of a precise stability property of the learning process: when the training set is perturbed by deleting one example, the learned hypothesis does not change much. This stability property stipulates conditions on the learning map rather than on the hypothesis space, subsumes the classical theory for ERM algorithms, and is applicable to more general algorithms. The surprising connection between stability and predictivity has implications for the foundations of learning theory and for the design of novel algorithms, and provides insights into problems as diverse as language learning and inverse problems in physics and engineering.
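The following sketch (our illustration, not code from the paper) makes the deletion-perturbation idea concrete: it fits a regularized least-squares hypothesis, deletes one training example at a time, and measures how much the learned weight vector moves.

```python
# Illustrative sketch (not code from the paper): measure empirically how much a
# regularized least-squares hypothesis changes when one training example is deleted.
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Regularized least squares (ridge); returns the learned weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

w_full = fit_ridge(X, y)
changes = []
for i in range(n):                      # delete one example at a time
    mask = np.arange(n) != i
    w_loo = fit_ridge(X[mask], y[mask])
    changes.append(np.linalg.norm(w_full - w_loo))

print("largest leave-one-out change in the hypothesis:", max(changes))
```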

307 citations


Journal Article
Tong Zhang1
TL;DR: It is shown that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem, which can be useful for statistical inferencing tasks beyond classification.
Abstract: The purpose of this paper is to investigate statistical properties of risk minimization based multi-category classification methods. These methods can be considered as natural extensions of binary large margin classification. We establish conditions that guarantee the consistency of classifiers obtained in the risk minimization framework with respect to the classification error. Examples are provided for four specific forms of the general formulation, which extend a number of known methods. Using these examples, we show that some risk minimization formulations can also be used to obtain conditional probability estimates for the underlying problem. Such conditional probability information can be useful for statistical inferencing tasks beyond classification.
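As a concrete binary-classification instance of the probability-estimation property that the paper develops for the multi-category case: the population minimizer of the logistic loss encodes the conditional class probability,

\[
f^*(x) = \log\frac{P(Y=1 \mid X=x)}{P(Y=-1 \mid X=x)},
\qquad\text{so}\qquad
P(Y=1 \mid X=x) = \frac{1}{1 + e^{-f^*(x)}},
\]

so a classifier trained by minimizing this risk can be read back as a conditional probability estimate.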

283 citations


Journal ArticleDOI
TL;DR: The paper argues that robustness is an important aspect and shows that many existing machine learning methods based on the convex risk minimization principle have - besides other good properties - also the advantage of being robust.
Abstract: The paper brings together methods from two disciplines: machine learning theory and robust statistics. We argue that robustness is an important aspect and we show that many existing machine learning methods based on the convex risk minimization principle have - besides other good properties - also the advantage of being robust. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds on the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. Some results on the robustness of such methods are also obtained for the sensitivity curve and the maxbias, which are two other robustness criteria. A sensitivity analysis of the support vector machine is given.
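The robustness notion used here is the classical influence function of robust statistics: for a statistical functional T (here, the map taking a distribution P to the learned classifier) it is defined as

\[
\mathrm{IF}(z; T, P) \;=\; \lim_{\varepsilon \downarrow 0}
\frac{T\big((1-\varepsilon)P + \varepsilon\,\delta_z\big) - T(P)}{\varepsilon},
\]

where \delta_z is the point mass at z; a bounded influence function means that a single contaminating observation can move the learned classifier only by a limited amount.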

108 citations


Journal ArticleDOI
TL;DR: For the setting in which the points in the input space are generated by a purely deterministic algorithm, an extension to the noisy case is provided, which shows that the good properties of deterministic learning are preserved if the level of noise at the output is not high.
Abstract: The general problem of reconstructing an unknown function from a finite collection of samples is considered, in case the position of each input vector in the training set is not fixed beforehand but is part of the learning process. In particular, the consistency of the empirical risk minimization (ERM) principle is analyzed, when the points in the input space are generated by employing a purely deterministic algorithm (deterministic learning). When the output generation is not subject to noise, classical number-theoretic results, involving discrepancy and variation, enable the establishment of a sufficient condition for the consistency of the ERM principle. In addition, the adoption of low-discrepancy sequences enables the achievement of a learning rate of O(1/L), with L being the size of the training set. An extension to the noisy case is provided, which shows that the good properties of deterministic learning are preserved, if the level of noise at the output is not high. Simulation results confirm the validity of the proposed approach.
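A minimal sketch of the deterministic-learning setup (our illustration with an arbitrary target function and model class, not the authors' code): training inputs are taken from a low-discrepancy van der Corput sequence rather than drawn at random, and ERM with squared loss is carried out over a fixed polynomial class.

```python
# Illustrative sketch (not the authors' code): ERM on training inputs taken from a
# deterministic low-discrepancy sequence instead of random samples.
import numpy as np

def van_der_corput(n, base=2):
    """First n points of the base-b van der Corput low-discrepancy sequence in [0, 1)."""
    points = []
    for k in range(1, n + 1):
        q, denom, x = k, 1.0, 0.0
        while q > 0:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        points.append(x)
    return np.array(points)

L = 256                                   # size of the training set
x_train = van_der_corput(L)               # deterministic, well-spread inputs
y_train = np.sin(2 * np.pi * x_train)     # noise-free outputs (hypothetical target)

# ERM with squared loss over a fixed polynomial class (least-squares fit).
coeffs = np.polyfit(x_train, y_train, deg=7)
x_test = np.linspace(0.0, 1.0, 1000)
worst_err = np.max(np.abs(np.polyval(coeffs, x_test) - np.sin(2 * np.pi * x_test)))
print("worst-case error of the ERM solution on a fine grid:", worst_err)
```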

56 citations



Journal ArticleDOI
TL;DR: This paper proposes two new polynomial-time scaling algorithms for the minimization of an M-convex function, both of which run as fast as the previous minimization algorithms.

42 citations


Journal ArticleDOI
01 Feb 2004
TL;DR: A new Empirical Risk Functional is proposed as the cost function for training neuro-fuzzy classifiers; it provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied.
Abstract: The paper proposes a new Empirical Risk Functional as the cost function for training neuro-fuzzy classifiers. This cost function, called the Approximate Differentiable Empirical Risk Functional (ADERF), provides a differentiable approximation of the misclassification rate so that the Empirical Risk Minimization Principle formulated in Vapnik's Statistical Learning Theory can be applied. A learning algorithm based on the proposed ADERF is also formulated. Experimental results on a number of benchmark classification tasks are provided, and a comparison to alternative approaches is given.
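The sketch below illustrates the underlying idea with a generic sigmoid surrogate; it is not the paper's exact ADERF functional. The non-differentiable 0-1 indicator of a misclassification is replaced by a smooth approximation, so an approximate empirical error rate can be minimized by gradient-based training.

```python
# Illustrative sketch of a differentiable surrogate for the misclassification rate,
# in the spirit of (but not identical to) the paper's ADERF cost function.
import numpy as np

def smooth_error_rate(margins, beta=10.0):
    """Smooth approximation of the 0-1 error rate: sigmoid(-beta * margin) is close
    to 1 for misclassified points (negative margin) and close to 0 for confidently
    correct ones, and is differentiable in the margins (hence in the model weights)."""
    return np.mean(1.0 / (1.0 + np.exp(beta * margins)))

# margins = y_i * f(x_i); a negative margin means example i is misclassified.
margins = np.array([2.1, 0.3, -0.4, 1.5, -1.2])
print("true misclassification rate:", np.mean(margins < 0))
print("smooth approximation:       ", smooth_error_rate(margins))
```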

38 citations


ReportDOI
01 Jan 2004
TL;DR: It is proved that for ERM a certain form of well-posedness is equivalent to consistency, and that for bounded loss classes CVEEE_loo stability is a weak form of stability that represents a sufficient condition for generalization for general learning algorithms while subsuming the classical conditions for consistency of ERM.
Abstract: Solutions of learning problems by Empirical Risk Minimization (ERM) -- and almost-ERM when the minimizer does not exist -- need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of leave-one-out stability, called CVEEE_loo stability. Our main new results are two. We prove that for bounded loss classes CVEEE_loo stability is (a) sufficient for generalization, that is convergence in probability of the empirical error to the expected error, for any algorithm satisfying it, and (b) necessary and sufficient for generalization and consistency of ERM. Thus CVEEE_loo stability is a weak form of stability that represents a sufficient condition for generalization for general learning algorithms while subsuming the classical conditions for consistency of ERM. We discuss alternative forms of stability. In particular, we conclude that for ERM a certain form of well-posedness is equivalent to consistency.
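In the notation commonly used for this line of work, leave-one-out (CV_loo) stability can be stated schematically as follows (our paraphrase; constants and quantifiers are simplified): writing S for the training set, S^i for S with example z_i removed, f_S for the learned hypothesis and V for the loss,

\[
\mathbb{P}_S\Big\{ \big| V(f_S, z_i) - V(f_{S^i}, z_i) \big| \le \beta_n \Big\} \ge 1 - \delta_n,
\qquad \beta_n \to 0,\ \delta_n \to 0,
\]

i.e. with high probability the loss at a point changes only slightly when that point is deleted from the training set.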

34 citations


Book ChapterDOI
09 Jun 2004
TL;DR: A generic fusion problem is studied for multiple sensors whose outputs are probabilistically related to their inputs according to unknown distributions, and an empirical risk minimization method is described for designing fusers with distribution-free performance bounds.
Abstract: A generic fusion problem is studied for multiple sensors whose outputs are probabilistically related to their inputs according to unknown distributions. Sensor measurements are provided as iid input-output samples, and an empirical risk minimization method is described for designing fusers with distribution-free performance bounds. The special cases of isolation and projective fusers for classifiers and function estimators, respectively, are described in terms of performance bounds. The isolation fusers for classifiers are probabilistically guaranteed to perform at least as well as the best classifier. The projective fusers for function estimators are probabilistically guaranteed to perform at least as well as the best subset of estimators.

Proceedings ArticleDOI
18 Jul 2004
TL;DR: In this paper, the authors proposed two different fusion rules for estimating the target location: a maximum likelihood estimate and an empirical risk minimization method, based on binary (detection vs. no detection) information from each sensor and the model of p_d, and compared them in terms of complexity and accuracy.
Abstract: We consider the problem of target location estimation in the context of large scale, dense sensor networks. We model the probability of detection in each sensor, p_d, as a function of the distance between the sensor and the target. Based on binary (detection vs. no detection) information from each sensor and the model of p_d, we propose two different fusion rules for estimating the target location: a maximum likelihood estimate and an empirical risk minimization method. Moreover, we also consider the case where only sensors with a positive detection transmit their reading. This can be helpful to economize the power of sensor units. By employing Gaussian-like p_d models, we develop versions of both methods based on simple initialization procedures and a gradient search. We compare and discuss both algorithms in terms of complexity and accuracy.
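A minimal sketch of the maximum-likelihood variant (our illustration with hypothetical parameter values, not the paper's setup): each sensor reports a binary detection, the detection probability is a Gaussian-like function of sensor-target distance, and the target location is found by maximizing the resulting likelihood over a coarse grid.

```python
# Illustrative sketch (hypothetical parameter values, not the paper's setup):
# maximum-likelihood target localization from binary detections of a dense sensor
# field, with a Gaussian-like detection-probability model p_d(distance).
import numpy as np

rng = np.random.default_rng(1)
sensors = rng.uniform(0.0, 10.0, size=(100, 2))   # sensor positions in a 10 x 10 area
target = np.array([6.0, 3.5])                      # unknown target location

def p_d(dist, sigma=2.0):
    """Gaussian-like probability of detection as a function of sensor-target distance."""
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

d_true = np.linalg.norm(sensors - target, axis=1)
detections = rng.random(sensors.shape[0]) < p_d(d_true)   # binary readings

def neg_log_likelihood(loc):
    d = np.linalg.norm(sensors - loc, axis=1)
    p = np.clip(p_d(d), 1e-9, 1 - 1e-9)
    return -np.sum(detections * np.log(p) + (~detections) * np.log(1.0 - p))

# Coarse grid search (the paper uses an initialization procedure plus a gradient search).
grid = np.linspace(0.0, 10.0, 101)
candidates = np.array([(gx, gy) for gx in grid for gy in grid])
estimate = candidates[np.argmin([neg_log_likelihood(c) for c in candidates])]
print("true target:", target, " ML estimate:", estimate)
```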

Proceedings ArticleDOI
04 Jul 2004
TL;DR: This work proposes a novel algorithm using the framework of empirical risk minimization and marginalized kernels, and analyzes its computational and statistical properties both theoretically and empirically.
Abstract: We consider the problem of decentralized detection under constraints on the number of bits that can be transmitted by each sensor. In contrast to most previous work, in which the joint distribution of sensor observations is assumed to be known, we address the problem when only a set of empirical samples is available. We propose a novel algorithm using the framework of empirical risk minimization and marginalized kernels, and analyze its computational and statistical properties both theoretically and empirically. We provide an efficient implementation of the algorithm, and demonstrate its performance on both simulated and real data sets.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: A novel regression technique, called support vector machines (SVM), based on statistical learning theory is explored in this paper for the prediction of natural gas demands, showing that the prediction accuracy of SVM is better than that of a neural network.
Abstract: Natural gas load forecasting is a key process for the efficient operation of a pipeline network. An accurate forecast is required to guarantee balanced network operation and ensure safe gas supply at minimal cost. Machine learning techniques are finding more and more applications in the field of load forecasting. A novel regression technique, called support vector machines (SVM), based on statistical learning theory is explored in this paper for the prediction of natural gas demands. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization underlying conventional regression techniques. Least squares support vector machines (LS-SVM) is a kind of SVM with a different cost function from standard SVM. The research result shows that the prediction accuracy of SVM is better than that of a neural network. The software package NGPSLF based on LS-SVM prediction has been put into practical business application.
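As a generic illustration of this regression setup (synthetic data and hypothetical hyperparameters, not the paper's configuration), a support vector regressor can be fit to lagged load values to produce one-day-ahead demand forecasts:

```python
# Illustrative sketch (synthetic data, hypothetical hyperparameters): support vector
# regression on lagged load values for one-day-ahead natural gas demand forecasting.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
days = np.arange(730)                                          # two years of daily loads
load = 100 + 20 * np.sin(2 * np.pi * days / 365) + rng.normal(0, 2, size=days.size)

lag = 7                                                         # previous week -> next day
X = np.array([load[i:i + lag] for i in range(len(load) - lag)])
y = load[lag:]

split = len(y) - 31                                             # hold out the last 31 days
model = SVR(kernel="rbf", C=100.0, epsilon=0.5).fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("mean absolute error on the 31-day hold-out:", np.mean(np.abs(pred - y[split:])))
```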

01 Jan 2004
TL;DR: A supervised learning algorithm for automatically acquiring high-level visual event definitions from low-level force-dynamic interpretations of video—relieving the user of the need to hand code definitions, and empirical results compare favorably to pre-existing hand-coded model reconstructors.
Abstract: We study novel learning and inference algorithms for temporal, relational data and their application to trainable video interpretation. With these algorithms we extend an existing visual-event recognition system, Leonard (Siskind 2001), in two directions. First, we develop, analyze, and evaluate a supervised learning algorithm for automatically acquiring high-level visual event definitions from low-level force-dynamic interpretations of video—relieving the user of the need to hand code definitions. We introduce a simple temporal event-description logic called AMA and give algorithms and complexity bounds for the AMA subsumption and generalization problems. A learning method is developed based on these algorithms and applied to the task of learning relational event definitions from video. Experiments show that the learned definitions are competitive with hand-coded ones. Second, we study the problem of relational sequential inference with application to inferring force-dynamic models from video data for use in event learning and recognition. We introduce two frameworks for this problem that provide different approaches to leveraging “nearly sound” logical constraints on a process. We study learning and inference in both frameworks and our empirical results compare favorably to pre-existing hand-coded model reconstructors.

01 Jan 2004
TL;DR: A novel regression technique based on the statistical learning theory, support vector machines (SVM), is investigated in this paper for natural gas short-term load forecasting and results show that SVM provides better prediction accuracy than neural network.
Abstract: Natural gas load forecasting is a key process for the efficient operation of a pipeline network. An accurate forecast is required to guarantee balanced network operation and ensure safe gas supply at minimum cost. Machine learning techniques have been increasingly applied to load forecasting. A novel regression technique based on statistical learning theory, support vector machines (SVM), is investigated in this paper for natural gas short-term load forecasting. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization in conventional regression techniques. Using a data set with two years of load values, we developed an SVM prediction model to obtain 31 days of load predictions. The results on city natural gas short-term load forecasting show that SVM provides better prediction accuracy than a neural network. The software package natural gas pipeline networks simulation and load forecasting (NGPNSLF), based on support vector regression prediction, has been developed and has also been applied in practice.

Journal Article
TL;DR: A novel regression technique, called support vector machine (SVM), based on the statistical learning theory is applied in this paper for the prediction of natural gas demands and shows that the prediction accuracy of SVM is better than that of neural network.
Abstract: Machine learning techniques are finding more and more applications in the field of load forecasting. A novel regression technique, called support vector machine (SVM), based on statistical learning theory is applied in this paper for the prediction of natural gas demands. Least squares support vector machine (LS-SVM) is a kind of SVM with a different cost function from standard SVM. SVM is based on the principle of structural risk minimization, as opposed to the principle of empirical risk minimization supported by conventional regression techniques. The prediction result shows that the prediction accuracy of SVM is better than that of a neural network. Thus, SVM appears to be a very promising prediction tool. The software package NGPSLF based on SVM prediction has been put into practical business application.

Proceedings ArticleDOI
15 Jun 2004
TL;DR: A novel regression technique, called Support Vector Machines (SVM), based on the statistical learning theory is explored and the prediction result shows that prediction accuracy of SVM is better than that of neural network.
Abstract: Machine learning techniques are finding more and more applications in the field of load forecasting. In this paper a novel regression technique, called Support Vector Machines (SVM), based on statistical learning theory is explored. SVM is based on the principle of Structural Risk Minimization, as opposed to the principle of Empirical Risk Minimization supported by conventional regression techniques. The natural gas load data of Xi'an city in 2001 and 2002 are used in this study to demonstrate the forecasting capabilities of SVM. The result is compared with that of a neural network based model for 7-lead-day forecasting. The prediction result shows that the prediction accuracy of SVM is better than that of the neural network. Thus, SVM appears to be a very promising prediction tool. The software package NGPSLF based on support vector regression (SVR) has also been put into practical business application.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: In this paper, the authors discuss bounds on the risk for loss functions over fuzzy examples and then estimate the rate of convergence, within statistical learning theory, a recently developed theory for pattern recognition.
Abstract: Statistical learning theory, a recently developed theory for pattern recognition, is a small-sample statistical theory proposed by Vapnik et al., which deals mainly with statistical principles when the number of samples is limited. Bounds on the rate of convergence play an important role in statistical learning theory. We discuss bounds on the risk for loss functions over fuzzy examples and then estimate the rate of convergence.

Journal ArticleDOI
01 Jun 2004
TL;DR: An unbiased implementation of SVC is proposed by introducing a more appropriate “error counting” term, which means the number of classification errors is truly minimized, while the maximal margin solution is obtained in the separable case.
Abstract: Many learning algorithms have been used for data mining applications, including Support Vector Classifiers (SVC), which have shown improved capabilities with respect to other approaches, since they provide a natural mechanism for implementing Structural Risk Minimization (SRM), obtaining machines with good generalization properties. SVC leads to the optimal hyperplane (maximal margin) criterion for separable datasets but, in the nonseparable case, the SVC minimizes the L1 norm of the training errors plus a regularizing term, to control the machine complexity. The L1 norm is chosen because it allows the minimization to be solved with a Quadratic Programming (QP) scheme, as in the separable case. But the L1 norm is not truly an "error counting" term as the Empirical Risk Minimization (ERM) inductive principle indicates, leading therefore to a biased solution. This effect is especially severe in low-complexity machines, such as linear classifiers or machines with few nodes (neurons, kernels, basis functions). Since one of the main goals in data mining is that of explanation, these reduced architectures are of great interest because they represent the origins of other techniques such as input selection or rule extraction. Training SVMs as accurately as possible in these situations (i.e., without this bias) is, therefore, an interesting goal. We propose here an unbiased implementation of SVC by introducing a more appropriate "error counting" term. This way, the number of classification errors is truly minimized, while the maximal margin solution is obtained in the separable case. QP can no longer be used for solving the new minimization problem, and we apply instead an iterated Weighted Least Squares (WLS) procedure. This modification in the cost function of the Support Vector Machine to solve ERM has not been possible to date given the Quadratic or Linear Programming techniques commonly used, but it is now possible using the iterated WLS formulation. Computer experiments show that the proposed method is superior to the classical approach in the sense that it truly solves the ERM problem.

01 Jan 2004
TL;DR: The algorithm is tested with mammograms of clinical patients, and results show that the SVM method achieves a higher true-positive rate than an artificial neural network based on empirical risk minimization, and is valuable for application in clinical engineering.
Abstract: Support vector machine (SVM) is a new statistical learning method. Compared with classical machine learning methods, the learning discipline of SVM is to minimize the structural risk instead of the empirical risk used by classical methods, and SVM gives better generalization performance. Because the SVM algorithm is a convex quadratic optimization problem, the local optimal solution is certainly the global optimal one. In this paper, the SVM algorithm is applied to detect micro-calcifications in mammograms for the first time. The algorithm is tested with mammograms of clinical patients, and results show that the SVM method achieves a higher true-positive rate in comparison with an artificial neural network (ANN) based on empirical risk minimization, and is valuable for application in clinical engineering.

Journal ArticleDOI
TL;DR: It is shown that the proposed 2nd-order PD-type iterative learning control algorithms for linear continuous-time and linear discrete-time systems are robust in the presence of external disturbances and that the convergence accuracy can be improved.
Abstract: In this paper, 2nd-order PD-type iterative learning control algorithms are proposed for linear continuous-time systems and linear discrete-time systems. In contrast to conventional methods, the proposed learning algorithms are constructed based on both time-domain performance and iteration-domain performance. The convergence of the proposed learning algorithms is proved. Also, it is shown that the proposed method is robust in the presence of external disturbances and that the convergence accuracy can be improved. A numerical example is provided to show the effectiveness of the proposed algorithms.
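A minimal simulation sketch of the simpler first-order PD-type iterative learning control law on a hypothetical scalar discrete-time plant (the paper's algorithms are 2nd-order, i.e. they also reuse information from the two previous iterations):

```python
# Illustrative sketch (hypothetical scalar plant and gains): a first-order PD-type
# iterative learning control update; the paper's algorithms are 2nd-order and also
# reuse information from the two previous iterations.
import numpy as np

A, B, C = 0.9, 1.0, 1.0              # plant: x[t+1] = A x[t] + B u[t], y[t] = C x[t]
T, iterations = 50, 30
y_ref = np.sin(np.linspace(0.0, 2.0 * np.pi, T))
Kp, Kd = 0.5, 0.3                    # proportional and derivative learning gains

u = np.zeros(T)
max_errors = []
for k in range(iterations):
    x, y = 0.0, np.zeros(T)
    for t in range(T):               # run one trial with the current input profile
        y[t] = C * x
        x = A * x + B * u[t]
    e = y_ref - y
    max_errors.append(np.max(np.abs(e)))
    e_next = np.roll(e, -1)
    e_next[-1] = 0.0
    # PD-type update: u_{k+1}(t) = u_k(t) + Kp * e_k(t+1) + Kd * (e_k(t+1) - e_k(t))
    u = u + Kp * e_next + Kd * (e_next - e)

print("max tracking error, first iteration:", max_errors[0])
print("max tracking error, last iteration: ", max_errors[-1])
```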

Journal Article
TL;DR: Cross-validation functionals and their upper bounds are considered and new performance bounds for monotone classifiers are obtained, which are nontrivial for small data sets and do not depend on the family complexity.
Abstract: Cross-validation functionals and their upper bounds are considered that characterize the generalization performance of learning algorithms. The initial data are not assumed to be independent, identically distributed (i.i.d.) or even to be random. The effect of localization of an algorithm family is described, and the concept of a local growth function is introduced. New performance bounds for monotone classifiers are obtained, which are nontrivial for small data sets and do not depend on the family complexity.

Proceedings ArticleDOI
16 Dec 2004
TL;DR: The utility of an empirical risk minimization approach allowing for a straightforward classification treatment of the problem is demonstrated, and the empirical results are competitive with those of the most successful previously published methods.
Abstract: In this paper we study Multiple Instance Learning, a variant of the standard classification problem. We demonstrate the utility of an empirical risk minimization approach allowing for a straightforward classification treatment of the problem. In addition we consider simple data dependent hypothesis classes that allow efficient minimization of the empirical loss function and the development of bounds on the estimation error. Our empirical results are competitive with those of the most successful previously published methods.

Proceedings ArticleDOI
Fang Yong1
15 Jun 2004
TL;DR: A continuous-discrete two-dimensional learning system is established and the proposed error system describes entire dynamics involved in iterative learning including the dynamics between tracking error and variable initial state.
Abstract: In this paper, iterative learning control problem of continuous-time system with variable initial condition is considered. Based on the two-dimensionality of the overall learning system, a continuous-discrete two-dimensional (2D) learning system is established. The proposed error system describes entire dynamics involved in iterative learning including the dynamics between tracking error and variable initial state. According to the 2D system theory, a learning error estimate is given and the design method for learning matrices is also proposed. The simulation results demonstrate the performance of our proposed method.

Proceedings ArticleDOI
26 Aug 2004
TL;DR: This work gives the key theorem for the case when the outputs are corrupted by noise, whereas classical statistical learning theory investigates the conditions for consistency of learning processes based on the empirical risk minimization induction principle only in the noise-free case.
Abstract: Statistical learning theory has investigated the conditions for consistency of the learning processes based on the empirical risk minimization induction principle. However, it deals with the unrealistic noise-free case. We give the key theorem for the case when the outputs are corrupted by noise.

Journal ArticleDOI
TL;DR: The paper studies a method for choosing a projection estimator, based on the principle of penalized empirical risk minimization, for estimating an unknown vector observed in a simple white Gaussian noise model.
Abstract: We consider the problem of estimating an unknown vector observed in a simple white Gaussian noise model. For the estimation, a family of projection estimators is used; the problem is to choose, based on observations, the best estimator within this family. The paper studies a method for choosing a projection estimator based on the principle of penalized empirical risk minimization. For this estimation method, nonasymptotic inequalities controlling its quadratic risk are given.
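A standard instance of the criterion being studied (our illustration of the general form, not the paper's specific penalty choice): among projection estimators \hat{\theta}_m onto subspaces of dimension D_m in the white-noise model y = \theta + \sigma\xi, one selects

\[
\hat{m} \;=\; \arg\min_m \Big\{ \|y - \hat{\theta}_m\|^2 + \mathrm{pen}(m) \Big\},
\qquad \text{e.g. } \mathrm{pen}(m) = 2\sigma^2 D_m \ \text{(a Mallows-type penalty)},
\]

and nonasymptotic inequalities then control the quadratic risk of the selected estimator \hat{\theta}_{\hat{m}}.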

Journal Article
TL;DR: The simulation result shows that the learning evaluation system can give a good evaluation for students' learning, furthermore, it provides teachers a powerful tool for teaching.
Abstract: Based on an analysis of the learning behavior of students, this paper puts forward a learning evaluation system that uses the support vector machine algorithm. The simulation result shows that the system can give a good evaluation of students' learning; furthermore, it provides teachers with a powerful tool for teaching.

Book
17 Jun 2004
TL;DR: This chapter discusses the Budgeted Multi-armed Bandit Problem, Reinforcement Learning for Average Reward Zero-Sum Games, and more.
Abstract: Economics and Game Theory.- Towards a Characterization of Polynomial Preference Elicitation with Value Queries in Combinatorial Auctions.- Graphical Economics.- Deterministic Calibration and Nash Equilibrium.- Reinforcement Learning for Average Reward Zero-Sum Games.- OnLine Learning.- Polynomial Time Prediction Strategy with Almost Optimal Mistake Probability.- Minimizing Regret with Label Efficient Prediction.- Regret Bounds for Hierarchical Classification with Linear-Threshold Functions.- Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary.- Inductive Inference.- Learning Classes of Probabilistic Automata.- On the Learnability of E-pattern Languages over Small Alphabets.- Replacing Limit Learners with Equally Powerful One-Shot Query Learners.- Probabilistic Models.- Concentration Bounds for Unigrams Language Model.- Inferring Mixtures of Markov Chains.- Boolean Function Learning.- PExact = Exact Learning.- Learning a Hidden Graph Using O(log n) Queries Per Edge.- Toward Attribute Efficient Learning of Decision Lists and Parities.- Empirical Processes.- Learning Over Compact Metric Spaces.- A Function Representation for Learning in Banach Spaces.- Local Complexities for Empirical Risk Minimization.- Model Selection by Bootstrap Penalization for Classification.- MDL.- Convergence of Discrete MDL for Sequential Prediction.- On the Convergence of MDL Density Estimation.- Suboptimal Behavior of Bayes and MDL in Classification Under Misspecification.- Generalisation I.- Learning Intersections of Halfspaces with a Margin.- A General Convergence Theorem for the Decomposition Method.- Generalisation II.- Oracle Bounds and Exact Algorithm for Dyadic Classification Trees.- An Improved VC Dimension Bound for Sparse Polynomials.- A New PAC Bound for Intersection-Closed Concept Classes.- Clustering and Distributed Learning.- A Framework for Statistical Clustering with a Constant Time Approximation Algorithms for K-Median Clustering.- Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers.- Consistency in Models for Communication Constrained Distributed Learning.- On the Convergence of Spectral Clustering on Random Samples: The Normalized Case.- Boosting.- Performance Guarantees for Regularized Maximum Entropy Density Estimation.- Learning Monotonic Linear Functions.- Boosting Based on a Smooth Margin.- Kernels and Probabilities.- Bayesian Networks and Inner Product Spaces.- An Inequality for Nearly Log-Concave Distributions with Applications to Learning.- Bayes and Tukey Meet at the Center Point.- Sparseness Versus Estimating Conditional Probabilities: Some Asymptotic Results.- Kernels and Kernel Matrices.- A Statistical Mechanics Analysis of Gram Matrix Eigenvalue Spectra.- Statistical Properties of Kernel Principal Component Analysis.- Kernelizing Sorting, Permutation, and Alignment for Minimum Volume PCA.- Regularization and Semi-supervised Learning on Large Graphs.- Open Problems.- Perceptron-Like Performance for Intersections of Halfspaces.- The Optimal PAC Algorithm.- The Budgeted Multi-armed Bandit Problem.

Journal Article
TL;DR: Two new initial-state learning schemes are developed for irregular linear discrete-time systems with non-zero initial error, and simulation results show that the proposed algorithms are very effective.
Abstract: Iterative learning control techniques are widely applied to robotic systems. Most of the proposed algorithms can be only used when the initial learning error are zero. However, in practical engineering, the initial learning states will shift from the desired initial states. As a result, it is necessary to develop some new learning algorithms to deal with such situations. Two new-type initial state learning schemes are developed for irregular linear discrete-time system with non-zero initial error. Based on 2-D system theory, iterative learning control algorithms for arbitrary initial state are analyzed on 2-D notion. The proposed algorithms can be applied to practical situations without assumption of zero initial error for dynamic systems. The simulation results show that the proposed algorithms are very effective.