
Showing papers on "Function approximation published in 2013"


Posted Content
TL;DR: Generative stochastic networks (GSN) as discussed by the authors learn the transition operator of a Markov chain whose stationary distribution estimates the data distribution, which is an alternative to maximum likelihood.
Abstract: We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood. The proposed Generative Stochastic Networks (GSN) framework is based on learning the transition operator of a Markov chain whose stationary distribution estimates the data distribution. The transition distribution of the Markov chain is conditional on the previous state, generally involving a small move, so this conditional distribution has fewer dominant modes, being unimodal in the limit of small moves. Thus, it is easier to learn because it is easier to approximate its partition function, more like learning to perform supervised function approximation, with gradients that can be obtained by backprop. We provide theorems that generalize recent work on the probabilistic interpretation of denoising autoencoders and obtain along the way an interesting justification for dependency networks and generalized pseudolikelihood, along with a definition of an appropriate joint distribution and sampling mechanism even when the conditionals are not consistent. GSNs can be used with missing inputs and can be used to sample subsets of variables given the rest. We validate these theoretical results with experiments on two image datasets using an architecture that mimics the Deep Boltzmann Machine Gibbs sampler but allows training to proceed with simple backprop, without the need for layerwise pretraining.

323 citations


Journal ArticleDOI
TL;DR: Empirical validation on a number of synthetic and real-life learning problems confirms that the performance of Incremental Sparse Spectrum Gaussian Process Regression is superior with respect to the popular Locally Weighted Projection Regression, while computational requirements are found to be significantly lower.

105 citations


Journal ArticleDOI
TL;DR: This work proves convergence of an approximate dynamic programming algorithm for a class of high-dimensional stochastic control problems linked by a scalar storage device, given a technical condition.
Abstract: We prove convergence of an approximate dynamic programming algorithm for a class of high-dimensional stochastic control problems linked by a scalar storage device, given a technical condition. Our problem is motivated by the problem of optimizing energy flows for a power grid supported by grid-level storage. The problem is formulated as a stochastic, dynamic program, where we estimate the value of resources in storage using a piecewise linear value function approximation. Given the technical condition, we provide a rigorous convergence proof for an approximate dynamic programming algorithm, which can capture the presence of both the amount of energy held in storage as well as other exogenous variables. Our algorithm exploits the natural concavity of the problem to avoid any need for explicit exploration policies.
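To make the piecewise linear value function approximation concrete, the following Python sketch (not the authors' algorithm) represents the value of a scalar storage level by per-unit marginal values, smooths a sampled marginal value into one slope, and restores concavity with a simple leveling step. The capacity, stepsize, leveling rule, and toy marginal-value curve are all illustrative assumptions.

```python
import numpy as np

# Concave piecewise-linear value function over a scalar storage level,
# represented by its marginal values (slopes) per unit stored. The smoothing
# update and the "leveling" concavity repair are illustrative simplifications,
# not the paper's exact algorithm.

R_MAX = 10                           # storage capacity (units), assumed
slopes = np.zeros(R_MAX)             # marginal value of the r-th unit in storage

def value(r, slopes):
    """Piecewise-linear value of holding r units: sum of the first r slopes."""
    return slopes[:r].sum()

def update(slopes, r, observed_marginal, stepsize=0.1):
    """Smooth a sampled marginal value into slope r, then restore concavity
    (non-increasing slopes) by leveling the neighbours pushed out of order."""
    s = slopes.copy()
    s[r] = (1 - stepsize) * s[r] + stepsize * observed_marginal
    s[:r] = np.maximum(s[:r], s[r])          # earlier slopes may not be smaller
    s[r + 1:] = np.minimum(s[r + 1:], s[r])  # later slopes may not be larger
    return s

# Toy usage: noisy observations of a true concave marginal-value curve.
rng = np.random.default_rng(0)
true_marginal = lambda r: 5.0 / (1 + r)
for _ in range(2000):
    r = rng.integers(0, R_MAX)
    slopes = update(slopes, r, true_marginal(r) + rng.normal(0, 0.5))
print(np.round(slopes, 2))               # approximately decreasing in r
print(round(float(value(5, slopes)), 2)) # estimated value of holding 5 units
```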

84 citations


Proceedings ArticleDOI
16 Apr 2013
TL;DR: Two methods of deriving a descriptive final cost function to assist model predictive control (MPC) in selecting a good policy without having to plan as far into the future or having to fine-tune delicate cost functions are explored.
Abstract: Both global methods and on-line trajectory optimization methods are powerful techniques for solving optimal control problems; however, each has limitations. In order to mitigate the undesirable properties of each, we explore the possibility of combining the two. We explore two methods of deriving a descriptive final cost function to assist model predictive control (MPC) in selecting a good policy without having to plan as far into the future or having to fine-tune delicate cost functions. First, we exploit the large amount of data which is generated in MPC simulations (based on the receding horizon iterative LQG method) to learn, off-line, the global optimal value function for use as a final cost. We demonstrate that, while the global function approximation matches the value function well on some problems, there is relatively little improvement to the original MPC. Alternatively, we solve the Bellman equation directly using aggregation methods for linearly-solvable Markov Decision Processes to obtain an approximation to the value function and the optimal policy. Using both pieces of information in the MPC framework, we find controller performance of similar quality to MPC alone with long horizon, but now we may drastically shorten the horizon. Implementation of these methods shows that Bellman equation-based methods and on-line trajectory methods can be combined in real applications to the benefit of both.
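The skeleton below sketches, under strong simplifications, how a terminal value term can shorten an MPC horizon: a 1-D double integrator, random-shooting optimization instead of iterative LQG, and a fixed quadratic stand-in for the learned value function. All dynamics, costs, and parameter choices are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Receding-horizon control with a terminal value term V(x_H) added to a short
# horizon cost. Here the plant is a 1-D double integrator, the per-step
# optimisation is random shooting rather than iterative LQG, and V is a fixed
# quadratic stand-in for a learned value function -- all simplifications.

DT, HORIZON, N_CAND = 0.1, 5, 256
rng = np.random.default_rng(0)

def step(x, u):
    pos, vel = x
    return np.array([pos + DT * vel, vel + DT * u])

def stage_cost(x, u):
    return x[0] ** 2 + 0.1 * x[1] ** 2 + 0.01 * u ** 2

def terminal_value(x):                       # stand-in for the learned V
    return 5.0 * (x[0] ** 2 + x[1] ** 2)

def mpc_action(x):
    """Return the first action of the best sampled control sequence."""
    best_cost, best_u0 = np.inf, 0.0
    for _ in range(N_CAND):
        u_seq = rng.uniform(-2.0, 2.0, HORIZON)
        xi, cost = x.copy(), 0.0
        for u in u_seq:
            cost += stage_cost(xi, u)
            xi = step(xi, u)
        cost += terminal_value(xi)           # short horizon + terminal cost
        if cost < best_cost:
            best_cost, best_u0 = cost, u_seq[0]
    return best_u0

x = np.array([1.0, 0.0])
for _ in range(60):
    x = step(x, mpc_action(x))
print(np.round(x, 3))                        # state should approach the origin
```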

83 citations


Journal ArticleDOI
TL;DR: Two different methods to solve the coupled Klein–Gordon–Zakharov (KGZ) equations are proposed: the Differential Quadrature (DQ) and Global Radial Basis Functions (GRBFs) methods.

74 citations


Journal ArticleDOI
TL;DR: This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches.
Abstract: Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of RL concerns computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally a stochastic gradient descent or a recursive least-squares approach.
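As a minimal instance of the bootstrapping category the survey describes, the sketch below runs semi-gradient TD(0) with a linear parametric value function on a toy random-walk task; the environment, state-aggregation features, and stepsize are illustrative assumptions.

```python
import numpy as np

# Semi-gradient TD(0) with a linear value function V(s) = w . phi(s): a minimal
# instance of the "bootstrapping" family of parametric value function
# approximation methods. The random-walk task and the state-aggregation
# features are toy choices.

N_STATES, GAMMA, ALPHA = 19, 1.0, 0.05
rng = np.random.default_rng(0)

def phi(s):
    """State-aggregation features: one-hot over groups of adjacent states."""
    x = np.zeros(5)
    x[s * 5 // N_STATES] = 1.0
    return x

w = np.zeros(5)
for _ in range(2000):                          # episodes
    s = N_STATES // 2                          # start in the middle
    while True:
        s_next = s + rng.choice([-1, 1])       # unbiased random walk
        terminal = s_next < 0 or s_next >= N_STATES
        r = 0.0 if not terminal else (1.0 if s_next >= N_STATES else -1.0)
        v_next = 0.0 if terminal else w @ phi(s_next)
        delta = r + GAMMA * v_next - w @ phi(s)   # TD error (bootstrapped target)
        w += ALPHA * delta * phi(s)
        if terminal:
            break
        s = s_next

print(np.round([w @ phi(s) for s in range(N_STATES)], 2))  # roughly linear ramp
```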

74 citations


Journal ArticleDOI
TL;DR: The training of feedforward neural networks is formulated as a PID control problem, which greatly facilitates the analysis and design of robust learning algorithms for multiple-input-multiple-output (MIMO) FNNs using robust control methods, and an optimal robust PID-learning algorithm is developed.
Abstract: The training problem of feedforward neural networks (FNNs) is formulated into a proportional integral and derivative (PID) control problem of a linear discrete dynamic system in terms of the estimation error. The robust control approach greatly facilitates the analysis and design of robust learning algorithms for multiple-input-multiple-output (MIMO) FNNs using robust control methods. The drawbacks of some existing learning algorithms can therefore be revealed clearly, and an optimal robust PID-learning algorithm is developed. The optimal learning parameters can be found by utilizing linear matrix inequality optimization techniques. Theoretical analysis and examples including function approximation, system identification, exclusive-or (XOR) and encoder problems are provided to illustrate the results.

72 citations


Proceedings ArticleDOI
17 Jul 2013
TL;DR: This work illustrates how approximate dynamic programming can be utilized to address problems of stochastic reachability in infinite state and control spaces, approximating the value function with a linear combination of radial basis functions.
Abstract: In this work we illustrate how approximate dynamic programming can be utilized to address problems of stochastic reachability in infinite state and control spaces. In particular we focus on the reach-avoid problem and approximate the value function with a linear combination of radial basis functions. In this way we get significant computational advantages with which we obtain tractable solutions to problems that cannot be solved via generic space gridding due to the curse of dimensionality. Numerical simulations indicate that control policies coming as a result of approximating the value function of stochastic reachability problems achieve close to optimal performance.
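The following sketch shows the representational idea only: a value function written as a linear combination of Gaussian radial basis functions and fitted to sampled targets by least squares. The 2-D target, centre grid, and kernel width are placeholders, not the reach-avoid computation itself.

```python
import numpy as np

# A value function represented as a linear combination of Gaussian radial basis
# functions, with the weights fitted by least squares to sampled targets. The
# 2-D target below is a placeholder, not a reach-avoid value function.

rng = np.random.default_rng(1)
centers = np.array([[i, j] for i in np.linspace(0, 1, 6)
                           for j in np.linspace(0, 1, 6)])   # 36 RBF centres
WIDTH = 0.2

def features(X):
    """Gaussian RBF features exp(-||x - c||^2 / (2 WIDTH^2)) for each centre c."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * WIDTH ** 2))

# Sampled states and value targets (e.g. produced by a backward recursion).
X = rng.uniform(0, 1, size=(500, 2))
y = np.exp(-8 * ((X[:, 0] - 0.5) ** 2 + (X[:, 1] - 0.5) ** 2))  # placeholder target

w, *_ = np.linalg.lstsq(features(X), y, rcond=None)   # least-squares weights

X_test = rng.uniform(0, 1, size=(5, 2))
print(np.round(features(X_test) @ w, 3))   # approximated value at test states
```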

67 citations


Journal ArticleDOI
TL;DR: A neural network with a similar yet different activation function, the hyper basis function (HBF), is presented, and a sequential learning algorithm for the HBF neural network is modified to exploit the concept of neuron significance, allowing HBF neurons to be grown and pruned during the learning process.

66 citations


Journal ArticleDOI
TL;DR: In this article, an adaptive fault-tolerant (AFT) control problem is investigated for strict-feedback non-linear systems with unknown time-delayed nonlinear faults, where error surfaces restricted by prescribed performance bounds are employed to guarantee the transient performance at the moment when faults with unknown occurrence time and magnitude occur.
Abstract: An approximation-based adaptive fault-tolerant (AFT) control problem is investigated for strict-feedback non-linear systems with unknown time-delayed non-linear faults. The error surfaces restricted by prescribed performance bounds are employed to guarantee the transient performance at the moment when faults with unknown occurrence time and magnitude occur. Based on the surfaces, we design a memoryless AFT control system where the function approximation technique using neural networks is applied to adaptively approximate unknown non-linear effects and changes in model dynamics because of the time-delayed faults. It is shown from Lyapunov stability theorem that the tracking error of the proposed control system is preserved within the prescribed performance bound and converges to an adjustable neighbourhood of the origin regardless of unknown time-delayed non-linear faults.

58 citations


Journal ArticleDOI
TL;DR: Among the three neural networks tested, Radial Basis Function (RBF) neural network is superior in terms of speed and accuracy for function approximation in comparison with Back Propagation (BP) and Generalized Regression Neural Network (GRNN).

Journal ArticleDOI
TL;DR: Li et al. as mentioned in this paper proposed an ELM with tunable activation function (TAF-ELM) learning algorithm, which determines its activation functions dynamically by means of the differential evolution algorithm based on the input data.
Abstract: In this paper, we propose an extreme learning machine (ELM) with tunable activation function (TAF-ELM) learning algorithm, which determines its activation functions dynamically by means of the differential evolution algorithm based on the input data. The main objective is to overcome the problem of dependence on the fixed slope of the activation function in ELM. We mainly consider benchmark problems in function approximation and pattern classification. Compared with ELM and E-ELM learning algorithms with the same network size or compact network configuration, the proposed algorithm has improved generalization performance with good accuracy. In addition, the proposed algorithm also performs very well among TAF neural network learning algorithms.
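For orientation, here is a plain ELM for regression (random, fixed hidden-layer weights; output weights solved by least squares); the tunable-activation and differential-evolution components that distinguish TAF-ELM are not reproduced, and the data and sizes are illustrative.

```python
import numpy as np

# Plain extreme learning machine for 1-D regression: random hidden-layer weights
# are fixed, and only the output weights are solved in closed form by least
# squares. The tunable-activation / differential-evolution layer of TAF-ELM is
# not reproduced here; data and sizes are toy choices.

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sinc(X[:, 0])                      # toy target function

N_HIDDEN = 50
W = rng.normal(size=(1, N_HIDDEN))        # random input weights (never trained)
b = rng.normal(size=N_HIDDEN)             # random biases (never trained)

def hidden(X):
    return np.tanh(X @ W + b)             # hidden-layer activations

beta, *_ = np.linalg.lstsq(hidden(X), y, rcond=None)   # output weights

X_test = np.linspace(-3, 3, 7).reshape(-1, 1)
print(np.round(hidden(X_test) @ beta, 3))               # ELM predictions
print(np.round(np.sinc(X_test[:, 0]), 3))               # ground truth
```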

Journal ArticleDOI
TL;DR: This paper introduces an error function that contains all PMP conditions, and uses trial solutions for the trajectory function, control function and the Lagrange multipliers to attain the solution of optimal control problems.
Abstract: This paper attempts to propose a new method based on capabilities of artificial neural networks, in function approximation, to attain the solution of optimal control problems. To do so, we try to approximate the solution of Hamiltonian conditions based on the Pontryagin minimum principle (PMP). For this purpose, we introduce an error function that contains all PMP conditions. In the proposed error function, we used trial solutions for the trajectory function, control function and the Lagrange multipliers. These trial solutions are constructed by using neurons. Then, we minimize the error function that contains just the weights of the trial solutions. Substituting the optimal values of the weights in the trial solutions, we obtain the optimal trajectory function, optimal control function and the optimal Lagrange multipliers.

Journal ArticleDOI
01 May 2013
TL;DR: The experimental results show that the proposed self-constructing LW-GRBFNFS method not only creates optimal hidden nodes but also effectively mitigates the noise and outlier problems.
Abstract: This paper proposes a novel self-constructing least-Wilcoxon generalized Radial Basis Function Neural-Fuzzy System (LW-GRBFNFS) and its applications to non-linear function approximation and chaotic time series prediction. In general, the hidden layer parameters of the antecedent part of most traditional RBFNFS are decided in advance and the output weights of the consequent part are evaluated by least square estimation. The hidden layer structure of the RBFNFS lacks flexibility because the structure is fixed and cannot be adjusted effectively according to the dynamic behavior of the system. Furthermore, the resultant performance of using least square estimation for output weights is often weakened by noise and outliers. This paper creates a self-constructing scenario for generating the antecedent part of the RBFNFS with a particle swarm optimizer (PSO). For training the consequent part of the RBFNFS, instead of traditional least square (LS) estimation, the least-Wilcoxon (LW) norm is employed in the proposed approach to do the estimation. As is well known in statistics, the linear function resulting from the rank-based LW norm approximation to linear function problems is usually robust against (or insensitive to) noise and outliers and therefore increases the accuracy of the output weights of the RBFNFS. Several nonlinear function approximation and chaotic time series prediction problems are used to verify the efficiency of the self-constructing LW-GRBFNFS proposed in this paper. The experimental results show that the proposed method not only creates optimal hidden nodes but also effectively mitigates the noise and outlier problems.

Posted Content
TL;DR: A computationally efficient and robust algorithm for generating pseudo-random samples from a broad class of smooth probability distributions in one and two dimensions based on inverse transform sampling with a polynomial approximation scheme using Chebyshev polynomials, Chebyshev grids, and low rank function approximation is developed.
Abstract: We develop a computationally efficient and robust algorithm for generating pseudo-random samples from a broad class of smooth probability distributions in one and two dimensions. The algorithm is based on inverse transform sampling with a polynomial approximation scheme using Chebyshev polynomials, Chebyshev grids, and low rank function approximation. Numerical experiments demonstrate that our algorithm outperforms existing approaches.
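A one-dimensional sketch of the idea, with illustrative choices rather than the paper's exact algorithm: approximate a smooth density by a Chebyshev polynomial, integrate it to get the CDF, and invert the CDF numerically to map uniform samples to target samples.

```python
import numpy as np
from numpy.polynomial.chebyshev import Chebyshev

# Inverse transform sampling in 1-D with a Chebyshev polynomial approximation
# of the density. The density, degree, and interpolation-based inversion below
# are illustrative choices, not the paper's exact algorithm.

def sample(pdf, a, b, n_samples, deg=64, grid=4096, seed=0):
    x = np.linspace(a, b, grid)
    p = Chebyshev.fit(x, pdf(x), deg, domain=[a, b])   # polynomial approximation
    cdf = p.integ()                                    # antiderivative
    F = cdf(x) - cdf(a)
    F = np.maximum.accumulate(F)                       # guard against tiny dips
    F /= F[-1]                                         # normalise to [0, 1]
    u = np.random.default_rng(seed).uniform(size=n_samples)
    return np.interp(u, F, x)                          # inverse transform on the grid

# Example: a smooth bimodal (unnormalised) density on [-4, 4].
pdf = lambda x: np.exp(-0.5 * (x - 1.5) ** 2) + 0.6 * np.exp(-0.5 * (x + 1.5) ** 2)
samples = sample(pdf, -4.0, 4.0, 10_000)
print(round(float(samples.mean()), 3), round(float(samples.std()), 3))  # sanity check
```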


Journal ArticleDOI
TL;DR: The main idea is to optimize, simultaneously, the weights and activation function used in a Multilayer Perceptron (MLP), through an approach that combines the advantages of simulated annealing, tabu search and a local learning algorithm.
Abstract: The use of neural network models for time series forecasting has been motivated by experimental results that indicate high capacity for function approximation with good accuracy. Generally, these models use activation functions with fixed parameters. However, it is known that the choice of activation function strongly influences the complexity and neural network performance and that a limited number of activation functions has been used in general. We describe the use of an asymmetric activation functions family with free parameter for neural networks. We prove that the activation functions family defined satisfies the requirements of the universal approximation theorem. We present a methodology for global optimization of the activation functions family with free parameter and the connections between the processing units of the neural network. The main idea is to optimize, simultaneously, the weights and activation function used in a Multilayer Perceptron (MLP), through an approach that combines the advantages of simulated annealing, tabu search and a local learning algorithm. We have chosen two local learning algorithms: the backpropagation with momentum (BPM) and Levenberg-Marquardt (LM). The overall purpose is to improve performance in time series forecasting.

Journal ArticleDOI
TL;DR: This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations.
Abstract: This paper proposes a least square regularized regression algorithm in sum space of reproducing kernel Hilbert spaces (RKHSs) for nonflat function approximation, and obtains the solution of the algorithm by solving a system of linear equations. This algorithm can approximate the low- and high-frequency component of the target function with large and small scale kernels, respectively. The convergence and learning rate are analyzed. We measure the complexity of the sum space by its covering number and demonstrate that the covering number can be bounded by the product of the covering numbers of basic RKHSs. For sum space of RKHSs with Gaussian kernels, by choosing appropriate parameters, we tradeoff the sample error and regularization error, and obtain a polynomial learning rate, which is better than that in any single RKHS. The utility of this method is illustrated with two simulated data sets and five real-life databases.
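The flavour of the method can be sketched with kernel ridge regression on a sum of two Gaussian kernels of different widths, where the wide kernel captures the low-frequency trend and the narrow one the high-frequency detail; the single shared regularizer and the toy target below are simplifications of the sum-space formulation.

```python
import numpy as np

# Kernel ridge regression on a sum of two Gaussian kernels with different
# widths: the wide kernel captures the low-frequency component, the narrow one
# the high-frequency detail. A single shared regulariser is used, which
# simplifies the paper's sum-space formulation.

def gauss_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = np.sin(2 * np.pi * X[:, 0]) + 0.2 * np.sin(20 * np.pi * X[:, 0])  # nonflat target

SIGMA_WIDE, SIGMA_NARROW, LAM = 0.3, 0.03, 1e-3
K = gauss_kernel(X, X, SIGMA_WIDE) + gauss_kernel(X, X, SIGMA_NARROW)
alpha = np.linalg.solve(K + LAM * np.eye(len(X)), y)        # (K + lam I) a = y

X_test = np.linspace(0, 1, 5).reshape(-1, 1)
K_test = gauss_kernel(X_test, X, SIGMA_WIDE) + gauss_kernel(X_test, X, SIGMA_NARROW)
print(np.round(K_test @ alpha, 3))                          # predictions
```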

Journal ArticleDOI
TL;DR: An online selective kernel-based temporal difference (OSKTD) learning algorithm is proposed to deal with large scale and/or continuous reinforcement learning problems and can reach a competitive ultimate optima compared with the up-to-date algorithms.
Abstract: In this paper, an online selective kernel-based temporal difference (OSKTD) learning algorithm is proposed to deal with large scale and/or continuous reinforcement learning problems. OSKTD includes two online procedures: online sparsification and parameter updating for the selective kernel-based value function. A new sparsification method (i.e., a kernel distance-based online sparsification method) is proposed based on selective ensemble learning, which is computationally less complex compared with other sparsification methods. With the proposed sparsification method, the sparsified dictionary of samples is constructed online by checking if a sample needs to be added to the sparsified dictionary. In addition, based on local validity, a selective kernel-based value function is proposed to select the best samples from the sample dictionary for the selective kernel-based value function approximator. The parameters of the selective kernel-based value function are iteratively updated by using the temporal difference (TD) learning algorithm combined with the gradient descent technique. The complexity of the online sparsification procedure in the OSKTD algorithm is O(n). In addition, two typical experiments (Maze and Mountain Car) are used to compare with both traditional and up-to-date O(n) algorithms (GTD, GTD2, and TDC using the kernel-based value function), and the results demonstrate the effectiveness of our proposed algorithm. In the Maze problem, OSKTD converges to an optimal policy and converges faster than both traditional and up-to-date algorithms. In the Mountain Car problem, OSKTD converges, requires less computation time compared with other sparsification methods, gets a better local optimum than the traditional algorithms, and converges much faster than the up-to-date algorithms. In addition, OSKTD can reach a competitive ultimate optimum compared with the up-to-date algorithms.
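A stripped-down sketch of two of the ingredients follows: an online kernel-distance test that decides whether a visited state enters the dictionary, and a TD(0) update of weights over kernel features of the dictionary. The selective-ensemble and local-validity machinery of OSKTD is omitted, and all thresholds and the toy chain task are assumptions.

```python
import numpy as np

# Two ingredients of kernel-based TD learning, heavily simplified:
# (1) an online kernel-distance test deciding whether a visited state is novel
#     enough to enter the dictionary, and
# (2) a TD(0) update of weights over kernel features of the dictionary.
# OSKTD's selective-ensemble and local-validity machinery is omitted.

SIGMA, MU, GAMMA, ALPHA = 0.5, 0.3, 0.95, 0.1   # assumed kernel width, threshold, etc.

def k(a, b):
    return np.exp(-np.sum((a - b) ** 2) / (2 * SIGMA ** 2))

dictionary, weights = [], np.zeros(0)

def feats(s):
    return np.array([k(s, d) for d in dictionary])

def maybe_add(s):
    """Add s to the dictionary only if its kernel distance to it is large."""
    global weights
    if not dictionary or min(2 - 2 * k(s, d) for d in dictionary) > MU:
        dictionary.append(s)
        weights = np.append(weights, 0.0)

def td_step(s, r, s_next, terminal):
    global weights
    maybe_add(s)
    v = weights @ feats(s)
    v_next = 0.0 if terminal else weights @ feats(s_next)
    delta = r + GAMMA * v_next - v
    weights = weights + ALPHA * delta * feats(s)

# Toy usage on a 1-D chain: move right, reward 1 on reaching the goal at 1.0.
for _ in range(200):
    s = np.array([0.0])
    while True:
        s_next = s + 0.1
        done = s_next[0] >= 1.0
        td_step(s, 1.0 if done else 0.0, s_next, done)
        if done:
            break
        s = s_next
print(len(dictionary), np.round(weights, 2))     # sparse dictionary + weights
```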

Proceedings ArticleDOI
20 Jun 2013
TL;DR: The εDEkr, which is the combination of the ε constrained method and the estimated comparison using kernel regression, is a very efficient constrained optimization algorithm that can find high-quality solutions in a very small number of function evaluations.
Abstract: We have proposed to utilize a rough approximation model, which is an approximation model with low accuracy and without learning process, to reduce the number of function evaluations in unconstrained optimization. Although the approximation errors between the true function values and the approximation values estimated by the rough approximation model are not small, the rough model can estimate the order relation of two points with fair accuracy. In order to use this nature of the rough model, we have proposed estimated comparison which omits the function evaluations when the result of the comparison can be judged by approximation values. In this study, we propose to utilize the estimated comparison in constrained optimization and propose the εDEkr, which is the combination of the ε constrained method and the estimated comparison using kernel regression. The εDEkr is a very efficient constrained optimization algorithm that can find high-quality solutions in a very small number of function evaluations. It is shown that the εDEkr can find near optimal solutions stably in a very small number of function evaluations compared with various other methods on well-known nonlinear constrained problems.
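The sketch below isolates the rough-approximation ingredient: a Nadaraya-Watson kernel regression surrogate built from already-evaluated points, used to decide whether a candidate can be rejected without a true (expensive) evaluation. How εDEkr embeds this inside the ε constrained DE loop is not reproduced, and the objective, bandwidth, and margin are illustrative.

```python
import numpy as np

# The rough-approximation ingredient only: a Nadaraya-Watson kernel regression
# surrogate built from already-evaluated points, used to guess objective values
# of new candidates. The embedding inside the ε constrained DE loop (and the
# fallback to true evaluations) is not reproduced here.

def kernel_regression(x, X_seen, f_seen, h=0.3):
    """Estimate f(x) as a kernel-weighted average of evaluated points."""
    w = np.exp(-np.sum((X_seen - x) ** 2, axis=1) / (2 * h ** 2))
    return np.sum(w * f_seen) / (np.sum(w) + 1e-12)

def true_objective(x):              # expensive function (toy stand-in)
    return np.sum(x ** 2)

rng = np.random.default_rng(0)
X_seen = rng.uniform(-2, 2, size=(60, 2))            # archive of evaluations
f_seen = np.array([true_objective(x) for x in X_seen])

parent, child = rng.uniform(-2, 2, 2), rng.uniform(-2, 2, 2)
est_p = kernel_regression(parent, X_seen, f_seen)
est_c = kernel_regression(child, X_seen, f_seen)

# Estimated comparison: if the surrogate says the child is clearly worse,
# skip its (expensive) true evaluation; otherwise evaluate for real.
MARGIN = 0.5
if est_c > est_p + MARGIN:
    print("child rejected without a true function evaluation")
else:
    print("true evaluation needed:", true_objective(child))
```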

Journal ArticleDOI
TL;DR: A novel regressor is proposed for the simultaneous learning of a function and its derivatives, termed as TSVR of a Function and its Derivatives, which demonstrates its effectiveness over other existing approaches in terms of improving the estimation accuracy and reducing run time complexity.
Abstract: Twin support vector regression (TSVR) determines a pair of ε-insensitive up- and down-bound functions by solving two related support vector machine-type problems, each of which is smaller than that in a classical SVR. On the lines of TSVR, we have proposed a novel regressor for the simultaneous learning of a function and its derivatives, termed as TSVR of a Function and its Derivatives. Results over several functions of more than one variable demonstrate its effectiveness over other existing approaches in terms of improving the estimation accuracy and reducing run time complexity.

Journal Article
TL;DR: In this article, a fuzzy neural network (FNN) based discrete adaptive iterative learning controller (AILC) is proposed for a class of discrete-time uncertain nonlinear plants which can repeat a given task over a finite time sequence.
Abstract: In this paper, a fuzzy neural network (FNN) based discrete adaptive iterative learning controller (AILC) is proposed for a class of discrete-time uncertain nonlinear plants which can repeat a given task over a finite time sequence. Compared with the existing discrete AILC schemes, the proposed strategy can be applied to the discrete-time uncertain nonlinear plants with not only initial resetting errors and iteration-varying desired trajectory, but also random bounded disturbances and unknown non-Lipschitz plant nonlinearities. Two FNNs are used as approximators to compensate for the unknown plant nonlinearities. To overcome the function approximation error and possibly large random bounded disturbance, a time-varying boundary layer is introduced to design an auxiliary error function. The auxiliary error function is then utilized to derive the adaptive laws since the optimal FNN parameters for a good function approximation and the optimal width of the time-varying boundary layer are unavailable. By using a Lyapunov-like analysis, we show that the closed-loop system is stable in the sense that the adjustable parameters and internal signals are bounded for all iterations. Furthermore, learning performance is guaranteed in the sense that the norm of the output tracking error vector will asymptotically converge to a residual set which is bounded by the width of the boundary layer.

Proceedings ArticleDOI
18 Mar 2013
TL;DR: A QR-decomposition hardware implementation that processes complex calculations in the logarithmic number system using nonuniform piecewise and multiplier-less function approximation is proposed and the results are compared to default CORDIC-based architectures.
Abstract: In this paper we propose a QR-decomposition hardware implementation that processes complex calculations in the logarithmic number system. Thus, low complexity numeric format converters are installed, using nonuniform piecewise and multiplier-less function approximation. The proposed algorithm is simulated with several different configurations in a downlink precoding environment for 4×4 and 8×8 multi-antenna wireless communication systems. In addition, the results are compared to default CORDIC-based architectures. In a second step, HDL implementation as well as logical and physical CMOS synthesis are performed. The comparison to actual references highlights our approach as highly efficient in terms of hardware complexity and accuracy.
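As a software illustration of the underlying logarithmic-number-system idea (not the hardware design), the sketch below replaces multiplication by addition in the log domain, with the log2 and 2^x converters approximated by simple piecewise-linear interpolation; the uniform segmentation here stands in for the nonuniform, multiplier-less segmentation of the paper.

```python
import numpy as np

# Multiplication in the logarithmic number system, done in software: multiply
# becomes addition of logs, and the log2 / 2**x converters are replaced by
# cheap piecewise-linear interpolation. The uniform segmentation here stands in
# for the nonuniform, multiplier-less segmentation used in the hardware design.

BREAKS = np.linspace(1.0, 2.0, 9)            # uniform segments over the mantissa
LOG_VALS = np.log2(BREAKS)

def log2_approx(x):
    """Piecewise-linear log2: exact exponent plus interpolated mantissa part."""
    e = np.floor(np.log2(x))                 # stands in for exponent extraction
    m = x / 2.0 ** e                         # mantissa in [1, 2)
    return e + np.interp(m, BREAKS, LOG_VALS)

def exp2_approx(y):
    """Piecewise-linear 2**y, the inverse converter."""
    e = np.floor(y)
    f = y - e                                # fractional part in [0, 1)
    return 2.0 ** e * np.interp(f, LOG_VALS, BREAKS)

def lns_multiply(a, b):
    return exp2_approx(log2_approx(a) + log2_approx(b))   # multiply = add logs

a, b = 3.7, 12.25
print(lns_multiply(a, b), a * b)             # approximate vs exact product
```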

Journal ArticleDOI
01 May 2013
TL;DR: This paper proposes a new optimization algorithm for a single-hidden layer FNN based on the convex combination algorithm for massaging information in the hidden layer, which has the advantage over GA of not requiring a lot of preprocessing work to break the data down into a sequence of binary codes before learning or mutation can be applied.
Abstract: Feedforward neural networks are the most commonly used function approximation techniques in neural networks. By the universal approximation theorem, it is clear that a single-hidden layer feedforward neural network (FNN) is sufficient to approximate the corresponding desired outputs arbitrarily closely. Some researchers use genetic algorithms (GAs) to explore the global optimal solution of the FNN structure. However, it is rather time consuming to use GA for the training of FNN. In this paper, we propose a new optimization algorithm for a single-hidden layer FNN. The method is based on the convex combination algorithm for massaging information in the hidden layer. In fact, this technique explores a continuum idea which combines the classic mutation and crossover strategies in GA together. The proposed method has the advantage over GA, which requires a lot of preprocessing work in breaking down the data into a sequence of binary codes before learning or mutation can be applied. Also, we set up a new error function to measure the performance of the FNN and obtain the optimal choice of the connection weights, and thus the nonlinear optimization problem can be solved directly. Several computational experiments are used to illustrate the proposed algorithm, which has good exploration and exploitation capabilities in search of the optimal weight for single-hidden layer FNNs.

Journal ArticleDOI
TL;DR: The result of this study shows that the function approximation by neural networks can reduce the duration of optimization procedure.
Abstract: An artificial neural network (ANN) is adjusted to make analytical approximation of objective function for a specific structural acoustic application. It is used as the replacement of the main real objective function during the optimization process. The goal of optimization is to find the best geometry modification of the considered model which is supposed to produce lower values of the radiated sound power levels. The result of this study shows that the function approximation by neural networks can reduce the duration of optimization procedure. Furthermore, the tuning of ANN internal parameter settings is a real challenge to be considered.

Journal ArticleDOI
Shuo Ding, Qing Hui Wu
TL;DR: Simulation results indicate that for small- and medium-scale networks, the LM optimization algorithm has the best approximation ability, followed by the Quasi-Newton algorithm, the conjugate gradient method, the resilient BP algorithm, and the adaptive learning rate algorithm.
Abstract: BP neural networks are widely used and their training algorithms are various. This paper studies the advantages and disadvantages of improved algorithms of five typical BP networks, based on artificial neural network theories. First, the learning processes of the improved algorithms of the five typical BP networks are elaborated on mathematically. Then a specific network is designed on the platform of MATLAB 7.0 to conduct an approximation test for a given nonlinear function. At last, a comparison is made between the training speeds and memory consumption of the five BP networks. The simulation results indicate that for small- and medium-scale networks, the LM optimization algorithm has the best approximation ability, followed by the Quasi-Newton algorithm, the conjugate gradient method, the resilient BP algorithm, and the adaptive learning rate algorithm. Keywords: BP neural network; improved algorithm; function approximation; MATLAB

Journal ArticleDOI
TL;DR: The results show that the recursive least-squares adaptive control achieves better robustness as measured by a time-delay margin, while the least-squares gradient adaptive cont...
Abstract: This paper presents a model-reference adaptive control approach for systems with unstructured uncertainty based on two least-squares parameter estimation methods: gradient-based method and recursive least-squares method. The unstructured uncertainty is approximated by Chebyshev orthogonal polynomial basis functions. The use of orthogonal basis functions improves the function approximation significantly and enables better convergence of parameter estimates. The least-squares gradient adaptive control achieves superior parameter convergence as compared to the standard model-reference adaptive control. Flight control simulations were conducted with four adaptive controllers: least-squares gradient adaptive control, recursive least-squares adaptive control, standard model-reference adaptive control, and neural-network adaptive control. The results show that the recursive least-squares adaptive control achieves better robustness as measured by a time-delay margin, while the least-squares gradient adaptive cont...
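The estimation piece can be sketched in isolation: approximate an unknown scalar function with Chebyshev polynomial basis functions and update the coefficients by recursive least squares from streaming samples. The model-reference control loop around the estimator is omitted, and the degree, forgetting factor, and toy uncertainty are assumptions.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebvander

# Estimation piece only: approximate an unknown scalar function of the state
# with Chebyshev polynomial basis functions and update the coefficients by
# recursive least squares from streaming (state, measurement) pairs. The
# model-reference adaptive control loop around this estimator is omitted.

DEG = 6
theta = np.zeros(DEG + 1)                   # Chebyshev coefficients
P = 1e3 * np.eye(DEG + 1)                   # covariance of the RLS estimator

def basis(x):
    """Chebyshev basis T_0..T_DEG evaluated at x, assumed scaled to [-1, 1]."""
    return chebvander(x, DEG).ravel()

def rls_update(theta, P, x, y, lam=1.0):
    """Standard recursive least squares step with forgetting factor lam."""
    phi = basis(x)
    K = P @ phi / (lam + phi @ P @ phi)     # gain
    theta = theta + K * (y - phi @ theta)   # innovation update
    P = (P - np.outer(K, phi @ P)) / lam
    return theta, P

# Toy usage: learn an "uncertainty" f(x) = |x| * x from noisy samples.
rng = np.random.default_rng(0)
f = lambda x: np.abs(x) * x
for _ in range(500):
    x = rng.uniform(-1, 1)
    theta, P = rls_update(theta, P, x, f(x) + rng.normal(0, 0.01))

x_test = np.linspace(-1, 1, 5)
print(np.round(chebvander(x_test, DEG) @ theta, 3))   # estimate
print(np.round(f(x_test), 3))                          # truth
```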

Journal ArticleDOI
TL;DR: Performance comparison with the approaches reported in the literature in approximating the same benchmark piecewise function verified the superiority of the proposed MPSDFCM strategy.
Abstract: Specifying the number and locations of the translation vectors for wavelet neural networks (WNNs) is of paramount significance as the quality of approximation may be drastically reduced if initialization of WNNs parameters was not done judiciously. In this paper, an enhanced fuzzy C-means algorithm, specifically the modified point symmetry–based fuzzy C-means algorithm (MPSDFCM), was proposed, in order to determine the optimal initial locations for the translation vectors. The proposed neural network models were then employed in approximating five different nonlinear continuous functions. Assessment analysis showed that integration of the MPSDFCM in the learning phase of WNNs would lead to a significant improvement in WNNs prediction accuracy. Performance comparison with the approaches reported in the literature in approximating the same benchmark piecewise function verified the superiority of the proposed strategy.

Journal ArticleDOI
TL;DR: A new algorithm is proposed that improves the algorithm accuracy and robustness by employing an M-estimator cost function to decide on the best estimated model from the randomly selected samples and improves the time performance of the algorithm by utilizing a statistical pretest based on Wald's sequential probability ratio test.
Abstract: This paper addresses the problem of fitting a functional model to data corrupted with outliers using a multilayered feed-forward neural network. Although it is of high importance in practical applications, this problem has not received careful attention from the neural network research community. One recent approach to solving this problem is to use a neural network training algorithm based on the random sample consensus (RANSAC) framework. This paper proposes a new algorithm that offers two enhancements over the original RANSAC algorithm. The first one improves the algorithm accuracy and robustness by employing an M-estimator cost function to decide on the best estimated model from the randomly selected samples. The other one improves the time performance of the algorithm by utilizing a statistical pretest based on Wald's sequential probability ratio test. The proposed algorithm is successfully evaluated on synthetic and real data, contaminated with varying degrees of outliers, and compared with existing neural network training algorithms.
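The structure of the approach can be sketched as follows, with a cubic polynomial standing in for the feed-forward network so the example stays short: fit candidate models on random minimal subsets and keep the one with the lowest robust (Huber-type) cost over all data, rather than a plain inlier count. Wald's sequential pretest is omitted, and all parameters are illustrative.

```python
import numpy as np

# RANSAC-style robust fitting with an M-estimator score. The paper fits a
# feed-forward network inside the loop; here a cubic polynomial stands in for
# the network so the sketch stays short, but the structure is the same: fit on
# random minimal subsets and keep the model with the lowest robust (Huber-type)
# cost over all data. Wald's sequential pretest from the paper is omitted.

rng = np.random.default_rng(0)

def huber(r, delta=1.0):
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r ** 2, delta * (a - 0.5 * delta))

def ransac_fit(x, y, n_iter=200, sample_size=8, degree=3):
    best_cost, best_coeffs = np.inf, None
    for _ in range(n_iter):
        idx = rng.choice(len(x), size=sample_size, replace=False)
        coeffs = np.polyfit(x[idx], y[idx], degree)       # candidate model
        cost = huber(y - np.polyval(coeffs, x)).sum()     # robust score on all data
        if cost < best_cost:
            best_cost, best_coeffs = cost, coeffs
    return best_coeffs

# Data: a smooth function plus 25% gross outliers.
x = np.linspace(-2, 2, 200)
y = np.sin(2 * x) + rng.normal(0, 0.05, size=x.size)
out = rng.choice(x.size, size=50, replace=False)
y[out] += rng.uniform(-5, 5, size=50)

coeffs = ransac_fit(x, y)
print(np.round(np.abs(np.polyval(coeffs, x) - np.sin(2 * x)).mean(), 3))  # fit error
```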

Book ChapterDOI
07 Feb 2013
TL;DR: This paper discusses -policy iteration, a method for exact and approximate dynamic programming, and discusses various implementations, which have advantages over well-established PI methods that use LSPE, LSTD, or TD for policy evaluation with cost function approximation.
Abstract: In this paper we discuss -policy iteration, a method for exact and approximate dynamic programming. It is intermediate between the classical value iteration (VI) and policy iteration (PI) methods, and it is closely related to optimistic (also known as modied) PI, whereby each policy evaluation is done approximately, using a nite number of VI. We review the theory of the method and associated questions of bias and exploration arising in simulation-based cost function approximation. We then discuss various implementations, which oer advantages over well-established PI methods that use LSPE( ), LSTD( ), or TD( ) for policy evaluation with cost function approximation. One of these implementations is based on a new simulation scheme, called geometric sampling, which uses multiple short trajectories rather than a single innitely long trajectory.