
Showing papers on "Convergence (routing)" published in 1992


Journal ArticleDOI
TL;DR: Convergence with probability one is proved for a variety of classical optimization and identification problems and it is demonstrated for these problems that the proposed algorithm achieves the highest possible rate of convergence.
Abstract: A new recursive algorithm of stochastic approximation type with the averaging of trajectories is investigated. Convergence with probability one is proved for a variety of classical optimization and identification problems. It is also demonstrated for these problems that the proposed algorithm achieves the highest possible rate of convergence.

1,970 citations
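For a concrete picture of the averaging scheme described above, here is a minimal sketch of stochastic approximation with trajectory (Polyak-Ruppert) averaging on a toy quadratic problem. The objective, noise level, and step-size exponent are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: minimize f(x) = 0.5 * ||x - x_star||^2 from noisy gradients.
x_star = np.array([1.0, -2.0])

def noisy_grad(x):
    return (x - x_star) + 0.1 * rng.standard_normal(2)

x = np.zeros(2)
x_bar = np.zeros(2)           # running average of the trajectory
for t in range(1, 10001):
    eta = 1.0 / t**0.75       # slowly decaying step, as averaging permits
    x = x - eta * noisy_grad(x)
    x_bar += (x - x_bar) / t  # incremental average of all iterates

print("last iterate:", x)
print("averaged    :", x_bar)   # typically closer to x_star
```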


Journal ArticleDOI
TL;DR: The modified back-propagation method consists of a simple change in the total error-of-performance function that is to be minimized by the algorithm, and the final approach to the desired response function is accelerated by an amount that can be predicted analytically.

398 citations


Journal ArticleDOI
TL;DR: In this paper, conditions for weak convergence of stochastic integrals are established under the assumption that the innovations are strong mixing with uniformly bounded 2-h moments, and several applications of the results are given, relevant for the theories of estimation with I(1) processes, I(2) processes, processes with nonstationary variances, near-integrated processes, and continuous time approximations.
Abstract: This paper provides conditions to establish the weak convergence of stochastic integrals. The theorems are proved under the assumption that the innovations are strong mixing with uniformly bounded 2-h moments. Several applications of the results are given, relevant for the theories of estimation with I(1) processes, I(2) processes, processes with nonstationary variances, near-integrated processes, and continuous time approximations.

331 citations



Journal ArticleDOI
TL;DR: Watkins' theorem that Q-learning, his closely related prediction and action learning method, converges with probability one is adapted to demonstrate this strong form of convergence for a slightly modified version of TD.
Abstract: The method of temporal differences (TD) is one way of making consistent predictions about the future. This paper uses some analysis of Watkins (1989) to extend a convergence theorem due to Sutton (1988) from the case which only uses information from adjacent time steps to that involving information from arbitrary ones. It also considers how this version of TD behaves in the face of linearly dependent representations for states, demonstrating that it still converges, but to a different answer from the least mean squares algorithm. Finally it adapts Watkins' theorem that Q-learning, his closely related prediction and action learning method, converges with probability one, to demonstrate this strong form of convergence for a slightly modified version of TD.

282 citations
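As an illustration of the method being analysed, here is a minimal tabular TD(0) sketch on a three-state Markov reward process. The chain, rewards, and step size are invented for the example; the paper's results cover far more general settings (multi-step information, linearly dependent representations).

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny Markov reward process: states 0..2, state 2 terminal.
P = {0: [(0.5, 1), (0.5, 2)], 1: [(1.0, 2)]}   # state -> [(prob, next state)]
R = {0: 1.0, 1: 0.5, 2: 0.0}                   # reward received on leaving a state
gamma, alpha = 1.0, 0.05
V = np.zeros(3)

for episode in range(5000):
    s = 0
    while s != 2:
        probs, nexts = zip(*P[s])
        s2 = nexts[rng.choice(len(nexts), p=probs)]   # sample a transition
        # TD(0): move V(s) toward the one-step bootstrapped target.
        V[s] += alpha * (R[s] + gamma * V[s2] - V[s])
        s = s2

print(V)   # approximately [1.25, 0.5, 0.0]
```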


Journal ArticleDOI
TL;DR: Dynamic Parameter Encoding is shown to be empirically effective and amenable to analysis; the problem of premature convergence in GAs is explored through two convergence models.
Abstract: The common use of static binary place-value codes for real-valued parameters of the phenotype in Holland's genetic algorithm (GA) forces either the sacrifice of representational precision for efficiency of search or vice versa. Dynamic Parameter Encoding (DPE) is a mechanism that avoids this dilemma by using convergence statistics derived from the GA population to adaptively control the mapping from fixed-length binary genes to real values. DPE is shown to be empirically effective and amenable to analysis; we explore the problem of premature convergence in GAs through two convergence models.

239 citations
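A rough sketch of the DPE mechanism follows, under an assumed zoom rule: genes keep a fixed length while the decoded window shrinks around the population once its decoded values have converged. The trigger and shrink factor below are hypothetical stand-ins for the population-statistics control the paper derives.

```python
import numpy as np

def decode(genes, lo, hi):
    """Map fixed-length binary genes (rows) to reals in [lo, hi]."""
    k = genes.shape[1]
    ints = genes @ (2 ** np.arange(k)[::-1])
    return lo + (hi - lo) * ints / (2**k - 1)

def maybe_zoom(values, lo, hi, trigger=0.1, shrink=0.5):
    """Shrink the decoding window around the population mean once the
    decoded values occupy a small fraction of the current window."""
    if np.std(values) < trigger * (hi - lo):
        center = values.mean()
        half = shrink * (hi - lo) / 2
        lo, hi = center - half, center + half
    return lo, hi

rng = np.random.default_rng(0)
lo, hi = -10.0, 10.0
pop = rng.integers(0, 2, size=(20, 8))   # 20 genes of 8 bits each
vals = decode(pop, lo, hi)
lo, hi = maybe_zoom(vals, lo, hi)        # same genes, finer window
```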


Proceedings ArticleDOI
31 Aug 1992
TL;DR: The authors propose a new methodology for creating the first automatically adapting learning rates that achieve the optimal rate of convergence for stochastic gradient descent; empirical tests agree with theoretical expectations that drift can be used to determine whether the crucial parameter c is large enough.
Abstract: The authors propose a new methodology for creating the first automatically adapting learning rates that achieve the optimal rate of convergence for stochastic gradient descent. Empirical tests agree with theoretical expectations that drift can be used to determine whether the crucial parameter c is large enough. Using this statistic, it will be possible to produce the first adaptive learning rates which converge at optimal speed.

203 citations
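The setting can be made concrete with a heavily hedged sketch: SGD with schedule eta_t = c/t attains the optimal 1/t rate only when c exceeds a problem-dependent threshold, and persistent one-sided drift in the noisy gradients signals that c is too small. The drift statistic below (signed update sum over absolute update sum) is an illustrative stand-in, not the paper's exact definition.

```python
import numpy as np

rng = np.random.default_rng(0)
x_star = 3.0

def noisy_grad(x):              # gradient of 0.5*(x - x_star)^2 plus noise
    return (x - x_star) + rng.standard_normal()

def run(c, steps=20000):
    x, drift_num, drift_den = 0.0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = noisy_grad(x)
        x -= (c / t) * g
        drift_num += g          # large in magnitude if updates keep pointing one way
        drift_den += abs(g)
    # Ratio near 0: updates look like zero-mean noise (c plausibly large enough).
    # Ratio near 1: persistent drift, i.e. eta_t = c/t is shrinking too fast.
    return x, abs(drift_num) / drift_den

for c in (0.1, 5.0):
    x, drift = run(c)
    print(f"c={c}: x={x:.3f}, drift statistic={drift:.3f}")
```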


Journal ArticleDOI
TL;DR: An iterative learning control scheme is presented for a class of nonlinear dynamic systems which includes holonomic systems as its subset and neither uses derivative terms of feedback errors nor assumes external input perturbations as a prerequisite.

189 citations


Journal ArticleDOI
TL;DR: It is shown that the distance from a feasible point near the solution set to the solution set itself can be bounded by the norm of a natural residual at that point, and this bound is used to prove linear convergence of a matrix splitting algorithm for solving the symmetric case of the affine variational inequality problem.
Abstract: Consider the affine variational inequality problem. It is shown that the distance to the solution set from a feasible point near the solution set can be bounded by the norm of a natural residual at that point. This bound is then used to prove linear convergence of a matrix splitting algorithm for solving the symmetric case of the problem. This latter result improves upon a recent result of Luo and Tseng that further assumes the problem to be monotone.

184 citations
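To make the objects concrete: for the LCP special case of the affine variational inequality (feasible set the nonnegative orthant), the natural residual is r(x) = x - max(0, x - (Mx + q)), and projected Gauss-Seidel is a classical matrix-splitting scheme of the kind covered by the symmetric-case result. The data below are illustrative.

```python
import numpy as np

# LCP(q, M): find x >= 0 with Mx + q >= 0 and x.(Mx + q) = 0, i.e. the
# affine variational inequality over the nonnegative orthant.
M = np.array([[4.0, 1.0], [1.0, 3.0]])   # symmetric positive definite
q = np.array([-1.0, -2.0])

def natural_residual(x):
    return x - np.maximum(0.0, x - (M @ x + q))

x = np.zeros(2)
for it in range(100):
    for i in range(len(x)):              # projected Gauss-Seidel sweep
        x[i] = max(0.0, x[i] - (M[i] @ x + q[i]) / M[i, i])
    if np.linalg.norm(natural_residual(x)) < 1e-10:
        break

print(x, np.linalg.norm(natural_residual(x)))
```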


Journal ArticleDOI
TL;DR: A general forgetting algorithm is formulated and analysed that contains most existing forgetting schemes as special cases; the results are then applied to a specific algorithm with selective forgetting, in which the forgetting is non-uniform in time and space.
Abstract: In the first part of this paper, a general forgetting algorithm is formulated and analysed. It contains most existing forgetting schemes as special cases. Conditions are given ensuring that the basic convergence properties will hold. In the second part of the paper, the results are applied to a specific algorithm with selective forgetting. Here, the forgetting is non-uniform in time and space. The theoretical analysis is supported by a simulation example demonstrating the practical performance of this algorithm.

121 citations
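The best-known special case of the general scheme is recursive least squares with a scalar exponential forgetting factor, sketched below on an invented time-varying system. Selective (time- and space-nonuniform) forgetting modifies how the covariance is discounted, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
lam = 0.98                      # forgetting factor: data weighted by lam**age

theta = np.zeros(2)
P = 1e3 * np.eye(2)             # inverse of the discounted information matrix
for t in range(500):
    phi = rng.standard_normal(2)              # regressor
    if t == 250:                              # parameter jump: forgetting tracks it
        theta_true = np.array([-3.0, 0.5])
    y = phi @ theta_true + 0.01 * rng.standard_normal()
    k = P @ phi / (lam + phi @ P @ phi)       # gain
    theta = theta + k * (y - phi @ theta)
    P = (P - np.outer(k, phi @ P)) / lam      # discounted covariance update

print(theta)   # close to the post-jump parameters
```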


Journal ArticleDOI
TL;DR: Bounds on the learning rate are developed under which exponential convergence of the weights to their correct values is proved for a class of matrix algebra problems that includes linear equation solving, matrix inversion, and Lyapunov equation solving.
Abstract: A class of feedforward neural networks, structured networks, has recently been introduced as a method for solving matrix algebra problems in an inherently parallel formulation. A convergence analysis for the training of structured networks is presented. Since the learning techniques used in structured networks are also employed in the training of neural networks, the issue of convergence is discussed not only from a numerical algebra perspective but also as a means of deriving insight into connectionist learning. Bounds on the learning rate are developed under which exponential convergence of the weights to their correct values is proved for a class of matrix algebra problems that includes linear equation solving, matrix inversion, and Lyapunov equation solving. For a special class of problems, the orthogonalized back-propagation algorithm, an optimal recursive update law for minimizing a least-squares cost functional, is introduced. It guarantees exact convergence in one epoch. Several learning issues are investigated.
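A minimal sketch of the kind of learning-rate bound proved here, assuming plain gradient descent on the least-squares cost of a linear system (the paper's structured-network formulation is richer): exponential convergence requires the rate to stay below 2 divided by the largest eigenvalue of the Hessian. The matrix is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)   # well-conditioned example
b = rng.standard_normal(5)

# Exponential convergence of gradient descent on 0.5*||Ax - b||^2 requires
# eta < 2 / lambda_max(A^T A); we take a safe fraction of that bound.
eta = 1.0 / np.linalg.eigvalsh(A.T @ A).max()

x = np.zeros(5)
for _ in range(2000):
    x -= eta * A.T @ (A @ x - b)     # gradient of 0.5*||Ax - b||^2

print(np.linalg.norm(A @ x - b))     # ~0: the "network weights" solve Ax = b
```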

Book ChapterDOI
01 Jan 1992
TL;DR: In this article, sufficient and necessary conditions for the convergence of Newton's method based on generalized derivatives are presented, which require uniform injectivity of the derivatives as well as uniform high-order approximation of the original locally Lipschitz function along rays through the solution.
Abstract: This paper presents sufficient and necessary conditions for the convergence of Newton's method based on generalized derivatives. These conditions require uniform injectivity of the derivatives as well as uniform high-order approximation of the original locally Lipschitz function along rays through the solution. Our approach makes it possible to determine approximate solutions of the Newton subproblems and to use such concepts of derivatives for nonsmooth functions, multivalued or not, as directional and B-derivatives, contingent derivatives, generalized Jacobians and others. Furthermore, we ensure solvability of the subproblems via surjectivity of the derivatives and verify a Kantorovich-type convergence theorem.
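A scalar illustration of the idea, with invented data: Newton's method where the classical derivative is replaced at each iterate by an element of a generalized derivative. Uniform injectivity holds here because the chosen derivative values stay bounded away from zero.

```python
# Solve F(x) = 0 for the nonsmooth F(x) = |x| + 2x - 1 (kink at 0,
# unique root x = 1/3) with Newton steps built from a generalized derivative.

def F(x):
    return abs(x) + 2.0 * x - 1.0

def gen_derivative(x):
    # Any element of the generalized derivative of F; at the kink pick one side.
    return (1.0 if x >= 0 else -1.0) + 2.0

x = -5.0
for _ in range(20):
    x -= F(x) / gen_derivative(x)

print(x, F(x))   # x = 1/3 (reached after two steps), F(x) ~ 0
```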

Journal ArticleDOI
TL;DR: A penalty function method approach for solving a constrained bilevel optimization problem is proposed that is applicable to the non-singleton lower-level reaction set case.
Abstract: A penalty function method approach for solving a constrained bilevel optimization problem is proposed. In the algorithm, both the upper level and the lower level problems are approximated by minimization problems of augmented objective functions. A convergence theorem is presented. The method is applicable to the non-singleton lower-level reaction set case. Constraint qualifications which imply the assumptions of the general convergence theorem are given.
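A hedged toy instance of the penalty idea (scipy assumed available; the problem and penalty schedule are invented, and the paper's augmented-objective construction is more general): the lower-level problem is replaced by a penalty on its suboptimality, and the penalized single-level problems are solved for increasing penalty parameters.

```python
import numpy as np
from scipy.optimize import minimize

# Bilevel toy: min_{x,y} (x - 1)^2 + y^2  subject to  y solving min_y (y - x)^2.
# Since min_y (y - x)^2 = 0, the lower-level suboptimality is (y - x)^2 itself,
# so the penalized single-level problem is
#     min_{x,y} (x - 1)^2 + y^2 + r * (y - x)^2,   r -> infinity.
# The bilevel solution forces y = x and minimizes (x-1)^2 + x^2, i.e. x = y = 0.5.

def penalized(z, r):
    x, y = z
    return (x - 1)**2 + y**2 + r * (y - x)**2

z = np.zeros(2)
for r in (1.0, 10.0, 100.0, 1000.0):           # increasing penalty parameter
    z = minimize(penalized, z, args=(r,)).x    # warm-start from the previous r
print(z)   # -> approximately [0.5, 0.5]
```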

Journal ArticleDOI
TL;DR: The authors establish the convergence of sequential and asynchronous iteration schemes for nonlinear paracontracting operators acting in finite dimensional spaces, with applications to linear systems of equations with convex constraints.
Abstract: We establish the convergence of sequential and asynchronous iteration schemes for nonlinear paracontracting operators acting in finite dimensional spaces. Applications to the solution of linear systems of equations with convex constraints are outlined. A first generalization of one of our convergence results to an infinite pool of asymptotically paracontracting operators is also presented.
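Orthogonal projections onto closed convex sets are standard examples of paracontracting operators, so a sequential scheme of the kind analysed can be sketched as cyclic projections for a linear system with a nonnegativity constraint. The data are illustrative.

```python
import numpy as np

# Solve Ax = b row by row, intersected with the convex constraint x >= 0,
# by cyclically applying paracontracting operators: orthogonal projections
# onto each hyperplane {x : a_i . x = b_i} and onto the nonnegative orthant.
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([5.0, 5.0])          # consistent system, solution (1, 2) >= 0

def project_hyperplane(x, a, beta):
    return x + (beta - a @ x) / (a @ a) * a

x = np.zeros(2)
for sweep in range(200):
    for i in range(len(b)):
        x = project_hyperplane(x, A[i], b[i])   # Kaczmarz-type step
    x = np.maximum(x, 0.0)                      # projection onto the orthant

print(x)   # -> approximately [1, 2]
```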


Journal ArticleDOI
TL;DR: These methods resemble the well-known family of damped Newton and Gauss-Newton methods for solving systems of smooth equations and generalize some recent Newton-like methods for solve B-differentiable equations which arise from various mathematical programs.
Abstract: This paper presents some globally convergent descent methods for solving systems of nonlinear equations defined by locally Lipschitzian functions. These methods resemble the well-known family of damped Newton and Gauss-Newton methods for solving systems of smooth equations; they generalize some recent Newton-like methods for solving B-differentiable equations which arise from various mathematical programs.
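The damped-Newton template these methods build on, sketched with a smooth system for brevity (the paper's point is precisely to allow locally Lipschitzian functions, where J(x) becomes a generalized Newton direction): a Newton step plus Armijo backtracking on the merit function 0.5*||F(x)||^2.

```python
import numpy as np

def F(x):                                  # smooth stand-in system
    return np.array([x[0]**2 + x[1] - 3.0, x[0] + x[1]**2 - 5.0])

def J(x):
    return np.array([[2 * x[0], 1.0], [1.0, 2 * x[1]]])

def merit(x):
    return 0.5 * F(x) @ F(x)

x = np.array([5.0, 5.0])
for _ in range(50):
    d = np.linalg.solve(J(x), -F(x))       # (generalized) Newton direction
    t, m0 = 1.0, merit(x)
    while merit(x + t * d) > m0 * (1 - 1e-4 * t) and t > 1e-12:
        t *= 0.5                           # Armijo-type backtracking
    x = x + t * d
    if m0 < 1e-20:
        break

print(x, F(x))   # a root, e.g. (1, 2)
```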

Journal ArticleDOI
TL;DR: In this article, the authors present a diffusion discretization of the diffusion equation that can be used to accelerate transport iterations when the transport equation is spatially differenced by a discontinuous finite element (DFE) method.
Abstract: The authors present a discretization of the diffusion equation that can be used to accelerate transport iterations when the transport equation is spatially differenced by a discontinuous finite element (DFE) method. That is, they present a prescription for diffusion synthetic acceleration of DFE transport iterations. (The well-known linear discontinuous and bilinear discontinuous schemes are examples of DFE transport differencings.) They demonstrate that the diffusion discretization can be obtained in any coordinate system on any grid. They show that the diffusion discretization is not strictly consistent with the transport discretization in the usual sense. Nevertheless, they find that it yields a scheme with unconditional stability and rapid convergence. Further, they find that as the optical thickness of spatial cells becomes large, the spectral radius of the iteration scheme approaches zero (i.e., instant convergence). They give analysis results for one- and two-dimensional Cartesian geometries and numerical results for one-dimensional Cartesian and spherical geometries.

Journal ArticleDOI
TL;DR: In this article, the authors consider the following global optimization problems for a univariate Lipschitz function defined on an interval: Problem P: find a globally optimal value off and a corresponding point; Problem Q: localize all globally optimal points.
Abstract: We consider the following global optimization problems for a univariate Lipschitz functionf defined on an interval [a, b]: Problem P: find a globally optimal value off and a corresponding point; Problem Pź: find a globallyź-optimal value off and a corresponding point; Problem Q: localize all globally optimal points; Problem Qź: find a set of disjoint subintervals of small length whose union contains all globally optimal points; Problem Qź: find a set of disjoint subintervals containing only points with a globallyź-optimal value and whose union contains all globally optimal points. We present necessary conditions onf for finite convergence in Problem P and Problem Q, recall the concepts necessary for a worst-case and an empirical study of algorithms (i.e., those ofpassive and ofbest possible algorithms), summarize and discuss algorithms of Evtushenko, Piyavskii-Shubert, Timonov, Schoen, Galperin, Shen and Zhu, presenting them in a simplified and uniform way, in a high-level computer language. We address in particular the problems of using an approximation for the Lipschitz constant, reducing as much as possible the expected length of the region of indeterminacy which contains all globally optimal points and avoiding remaining subintervals without points with a globallyź-optimal value. New algorithms for Problems Pź and Qź and an extensive computational comparison of algorithms are presented in a companion paper.
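Of the surveyed methods, Piyavskii-Shubert is the easiest to sketch: with a known Lipschitz constant L it maintains a sawtooth lower envelope of f and always evaluates f where the envelope is lowest, yielding both an incumbent and a global lower bound (so ε-optimality can be certified). The test function and L below are illustrative.

```python
import numpy as np

def piyavskii_shubert(f, a, b, L, iters=60):
    """Minimize a Lipschitz function on [a, b] (constant L) by refining
    the piecewise-linear sawtooth lower envelope built from samples."""
    xs, fs = [a, b], [f(a), f(b)]
    for _ in range(iters):
        # Lower bound of the envelope on each interval [x_i, x_{i+1}].
        lbs = [0.5 * (fs[i] + fs[i + 1]) - 0.5 * L * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1)]
        i = int(np.argmin(lbs))               # most promising interval
        x_new = 0.5 * (xs[i] + xs[i + 1]) + (fs[i] - fs[i + 1]) / (2 * L)
        xs.insert(i + 1, x_new)               # evaluate where the bound is lowest
        fs.insert(i + 1, f(x_new))
    k = int(np.argmin(fs))
    return xs[k], fs[k], min(lbs)             # incumbent and a global lower bound

# |d/dx (sin 3x + x/2)| <= 3.5 everywhere, so L = 3.5 is a valid constant.
print(piyavskii_shubert(lambda x: np.sin(3 * x) + 0.5 * x, 0.0, 4.0, L=3.5))
```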

Journal ArticleDOI
T. Nakata, Norio Takahashi, Koji Fujiwara, N. Okamoto, K. Muramatsu
TL;DR: In this paper, a modified Newton-Raphson method is proposed to overcome the divergence of the Newton-Raphson iteration in nonlinear magnetic field analysis; a relaxation factor is introduced and its optimum value is examined.
Abstract: In order to overcome the divergence of the Newton-Raphson iteration in the nonlinear magnetic field analysis, a relaxation factor is introduced and its optimum value is examined. It is shown that the proposed modified Newton-Raphson method exhibits quick and successful convergence even in cases where the conventional Newton-Raphson method fails to converge.
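A hedged sketch of the mechanism on a classic scalar example where plain Newton-Raphson oscillates: after computing the Newton direction, a relaxation factor is selected to reduce the residual norm. The tanh-type residual (mimicking a saturating magnetization curve) and the candidate factors are illustrative; the paper determines the optimum factor rather than scanning a fixed list.

```python
import numpy as np

def F(x):                                 # stand-in for the nonlinear residual
    return np.array([np.tanh(5 * x[0]) + 0.2 * x[0] - 0.1])

def J(x):
    return np.array([[5 / np.cosh(5 * x[0])**2 + 0.2]])

x = np.array([3.0])                       # far start: plain Newton overshoots
for _ in range(30):
    d = np.linalg.solve(J(x), -F(x))      # Newton direction
    # Modified step: pick the relaxation factor that minimizes the residual.
    alphas = (1.0, 0.5, 0.25, 0.1, 0.05)
    alpha = min(alphas, key=lambda a: np.linalg.norm(F(x + a * d)))
    x = x + alpha * d
    if np.linalg.norm(F(x)) < 1e-12:
        break

print(x, F(x))
```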

Journal ArticleDOI
TL;DR: The key theorem shows that μ -geometric ergodicity is equivalent to weak μ-geometric recurrence, and the latter condition is verified for the time-discretised two-centre open Jackson network.
Abstract: This paper gives an overview of recurrence and ergodicity properties of a Markov chain. Two new notions for ergodicity and recurrence are introduced. They are called μ-geometric ergodicity and μ-geometric recurrence respectively. The first condition generalises geometric as well as strong ergodicity. Our key theorem shows that μ-geometric ergodicity is equivalent to weak μ-geometric recurrence. The latter condition is verified for the time-discretised two-centre open Jackson network. Hence, the corresponding two-dimensional Markov chain is μ-geometrically and geometrically ergodic, but not strongly ergodic. A consequence of μ-geometric ergodicity with μ of product-form is the convergence of the Laplace-Stieltjes transforms of the marginal distributions. Consequently all moments converge. Keywords: GEOMETRIC, STRONG AND μ-GEOMETRIC ERGODICITY; FOSTER, POPOV AND DOEBLIN


Journal ArticleDOI
TL;DR: Two new algorithms and associated neuron-like network architectures are proposed for solving the eigenvalue problem in real time; one of them employs a multilayer neural network with linear artificial neurons and exploits the continuous-time error back-propagation learning algorithm.
Abstract: Two new algorithms and associated neuron-like network architectures are proposed for solving the eigenvalue problem in real time. The first approach is based on the solution of a set of nonlinear algebraic equations by employing optimization techniques. The second approach employs a multilayer neural network with linear artificial neurons and exploits the continuous-time error back-propagation learning algorithm. The second approach enables us to find all the eigenvalues and the associated eigenvectors simultaneously by training the network to match some desired patterns, while the first approach is suitable for finding only one particular eigenvalue (e.g. an extreme eigenvalue) and the corresponding eigenvector during a single run in real time. In order to find all eigenpairs, the optimization process must in this case be repeated many times for different initial conditions. The performance and convergence behaviour of the proposed neural network architectures are investigated by extensive computer simulations.
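The first (optimization-based) approach can be caricatured in a few lines: gradient descent on the Rayleigh quotient converges to one extreme eigenpair per run, matching the abstract's description. The matrix, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = (A + A.T) / 2                      # symmetric test matrix

x = rng.standard_normal(6)
eta = 0.1
for _ in range(2000):
    x /= np.linalg.norm(x)
    lam = x @ A @ x                    # Rayleigh quotient
    grad = 2 * (A @ x - lam * x)       # its gradient on the unit sphere
    x -= eta * grad                    # descend -> smallest eigenpair

x /= np.linalg.norm(x)
print(x @ A @ x, np.linalg.eigvalsh(A).min())   # the two should agree
```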

Journal ArticleDOI
TL;DR: In this paper, the reconstruction of a Sturm-Liouville potential from finite spectral data is considered and a numerical technique based on a shooting method determines a potential with the given spectral data.
Abstract: The reconstruction of a Sturm–Liouville potential from finite spectral data is considered. A numerical technique based on a shooting method determines a potential with the given spectral data. Convergence of reconstructed potentials is shown and numerical examples are considered.
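The forward building block of such a shooting method is easy to sketch (scipy assumed): for a trial potential q, the Dirichlet eigenvalues of -y'' + q y = λ y are the roots of λ ↦ y(π; λ), located by bracketing and bisection. The potential below is illustrative; the paper's inverse step adjusts q until these computed eigenvalues match the given spectral data.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

q = lambda x: 3.0 * np.cos(2 * x)         # illustrative trial potential

def y_at_pi(lam):
    """Integrate -y'' + q y = lam y with y(0)=0, y'(0)=1; return y(pi)."""
    rhs = lambda x, z: [z[1], (q(x) - lam) * z[0]]
    sol = solve_ivp(rhs, (0.0, np.pi), [0.0, 1.0], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]

# Eigenvalues are the roots of lam -> y(pi; lam); bracket sign changes, then bisect.
lams = np.arange(-2.0, 30.0, 0.5)
vals = [y_at_pi(l) for l in lams]
eigs = [brentq(y_at_pi, lams[i], lams[i + 1])
        for i in range(len(lams) - 1) if vals[i] * vals[i + 1] < 0]
print(eigs[:4])   # near n^2 (n = 1, 2, ...), shifted by the potential
```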


Journal ArticleDOI
01 Jan 1992-Tellus A
TL;DR: In this article, variational assimilation with the adjoint model technique is applied to the Lorenz model to illustrate how the performance of quadri-dimensional data assimilation can vary from one case to another.
Abstract: Quadri-dimensional data assimilation aims at extracting all information from observations distributed over a finite time interval. In this paper, variational assimilation with the adjoint model technique is applied to the Lorenz model to illustrate how the performance of quadri-dimensional data assimilation can vary from one case to another. Observations are generated for two situations, one (the regular case) being more predictable than the other (the case with transition). An examination of the functional being minimized shows that although the regular case does not reveal any significant secondary minimum, there are several in the case with transition, for which the point of convergence was seen to be highly dependent on the first guess. It was also observed that picking the first guess on the underlying attractor of this dynamical system does not ensure convergence to the true minimum. In the adjoint model technique, the gradient of the functional is obtained through a time integration of the adjoint model using the difference between the solution of the direct model and the observations. It is shown how to relate the observational error covariance matrix to the gradient error covariance matrix. This method is applicable to any model once its adjoint is available and can be used to provide an estimate of the accuracy of the final analysis. Applying it to the Lorenz model, it is shown that due to the different local error growth rates, the same observational error can lead to very different accuracies for the gradient vector. DOI: 10.1034/j.1600-0870.1992.00002.x
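A hedged miniature of the experimental set-up: variational assimilation on the Lorenz (1963) model, minimizing the misfit functional over the initial condition. For brevity the gradient is approximated by finite differences instead of an adjoint integration, and the window length, observation density, noise level, and descent scheme are all illustrative.

```python
import numpy as np

def lorenz_step(x, dt=0.01, s=10.0, r=28.0, b=8.0 / 3.0):
    dx = np.array([s * (x[1] - x[0]),
                   x[0] * (r - x[2]) - x[1],
                   x[0] * x[1] - b * x[2]])
    return x + dt * dx                       # forward Euler, for brevity

def trajectory(x0, n=100):
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n):
        xs.append(lorenz_step(xs[-1]))
    return np.array(xs)

rng = np.random.default_rng(0)
truth = trajectory([1.0, 1.0, 20.0])
obs_idx = np.arange(0, 101, 10)              # observation times in the window
obs = truth[obs_idx] + 0.1 * rng.standard_normal((len(obs_idx), 3))

def cost(x0):                                # the functional being minimized
    return 0.5 * np.sum((trajectory(x0)[obs_idx] - obs) ** 2)

def grad_fd(x0, h=1e-6):                     # finite-difference stand-in for the adjoint
    g = np.zeros(3)
    for i in range(3):
        e = np.zeros(3); e[i] = h
        g[i] = (cost(x0 + e) - cost(x0 - e)) / (2 * h)
    return g

x0 = np.array([1.5, 0.5, 19.0])              # first guess
for _ in range(400):
    g = grad_fd(x0)
    x0 -= 0.01 * g / (np.linalg.norm(g) + 1e-12)   # normalized descent step

print(cost(x0), x0, truth[0])                # final misfit; recovered vs true state
```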

Journal ArticleDOI
TL;DR: In this paper, a convergence proof of Adomian's method adapted to non-linear partial differential equations is proposed, and some real examples are solved numerically.
Abstract: The study of the convergence of Adomian's method presents some difficulties when the method is applied to real problems. This paper proposes a convergence proof of the technique adapted to non-linear partial differential equations and solves some real examples numerically.
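For orientation, here is a sketch of the decomposition series itself on the simplest possible case, a linear ODE where the Adomian polynomials are trivial (sympy assumed): the partial sums visibly converge to the exact solution, which is the kind of property the paper establishes in the harder non-linear PDE setting.

```python
import sympy as sp

t = sp.symbols('t')

# Adomian decomposition for y' = y, y(0) = 1: write y = sum of y_n with
# y_0 = y(0) and y_{n+1}(t) = integral_0^t y_n(s) ds. The partial sums
# are the Taylor polynomials of exp(t), illustrating convergence.
terms = [sp.Integer(1)]
for n in range(8):
    terms.append(sp.integrate(terms[-1], (t, 0, t)))

approx = sum(terms)
print(sp.expand(approx))                             # 1 + t + t**2/2 + ...
print(float(approx.subs(t, 1)), float(sp.exp(1)))    # partial sum vs e
```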

Journal ArticleDOI
TL;DR: An adaptive filtering algorithm is introduced which employs a quasi-Newton approach to give rapid convergence even with colored inputs and appears to be quite robust in finite-precision implementations.
Abstract: The convergence rate of an adaptive system is closely related to its ability to track a time-varying optimum. Basic adaptive filtering algorithms give poor convergence performance when the input to the adaptive system is colored. More sophisticated algorithms, which converge very rapidly regardless of the input spectrum, typically require O(N²) computation, where N is the order of the adaptive filter, a significant disadvantage for real-time applications. Also, many of these algorithms behave poorly in finite-precision implementation. An adaptive filtering algorithm is introduced which employs a quasi-Newton approach to give rapid convergence even with colored inputs. The algorithm achieves an overall computational requirement of O(N) and appears to be quite robust in finite-precision implementations.
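One classical way to realize a quasi-Newton adaptive filter, sketched below: scale the LMS update by a running Sherman-Morrison estimate of the inverse input autocorrelation, which restores fast convergence on colored input. Note this plain form costs O(N²) per sample; the paper's contribution is an O(N), finite-precision-robust variant, which this sketch does not reproduce. All signals and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
w_true = rng.standard_normal(N)

w = np.zeros(N)
Rinv = 1e2 * np.eye(N)            # running estimate of the inverse autocorrelation
beta = 0.995                      # smoothing of the autocorrelation estimate
x_col, u = np.zeros(N), 0.0
for t in range(4000):
    u = 0.95 * u + rng.standard_normal()      # strongly colored AR(1) input
    x_col = np.roll(x_col, 1); x_col[0] = u   # tap-delay line
    d = w_true @ x_col + 0.01 * rng.standard_normal()
    e = d - w @ x_col
    # Sherman-Morrison update of Rinv for R <- beta*R + (1-beta)*x x^T.
    Rx = Rinv @ x_col
    Rinv = (Rinv - np.outer(Rx, Rx) / (beta / (1 - beta) + x_col @ Rx)) / beta
    w = w + 0.5 * (1 - beta) * e * (Rinv @ x_col)   # Rinv-scaled (quasi-Newton) step

print(np.linalg.norm(w - w_true))   # small despite the colored input
```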

Proceedings ArticleDOI
16 Dec 1992
TL;DR: In this paper, a convergence theory for iterative learning control based on the use of high-gain current trial feedback for the special case of relative degree one, MIMO (multiple-input multiple-output) minimum-phase systems is presented.
Abstract: The author presents a convergence theory for iterative learning control based on the use of high-gain current trial feedback for the special case of relative degree one, MIMO (multiple-input multiple-output) minimum-phase systems. The results are related to those of Padieu and Su (1990) via the notion of positive real systems. In particular, positive real systems are easily arranged to have convergent learning by simple proportional learning rules of arbitrary positive gain.
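The flavor of the result can be seen on a discrete-time stand-in (illustrative plant and gain, not from the paper): for a relative-degree-one system, the purely proportional ILC update u_{k+1} = u_k + γ e_k contracts the tracking error from trial to trial whenever |1 - γ·CB| < 1; here CB = 1, so any γ in (0, 2) works.

```python
import numpy as np

# Illustrative relative-degree-one plant: x_{t+1} = 0.8 x_t + u_t, y_t = x_t.
T = 50
y_ref = np.sin(np.linspace(0.0, 2.0 * np.pi, T + 1))[1:]   # desired y_1..y_T

def run_trial(u):
    x, y = 0.0, np.empty(T)
    for t in range(T):
        x = 0.8 * x + u[t]     # input at time t first affects the output at t+1
        y[t] = x
    return y

u, gamma = np.zeros(T), 0.9    # proportional learning gain
for k in range(30):            # repeated trials of the same finite-time task
    e = y_ref - run_trial(u)
    u = u + gamma * e          # P-type ILC update: u_{k+1} = u_k + gamma * e_k
    if k % 10 == 0:
        print(f"trial {k:2d}: max|e| = {np.abs(e).max():.2e}")

print(np.abs(y_ref - run_trial(u)).max())   # error contracts trial to trial
```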

Journal ArticleDOI
TL;DR: In this article, an adaptive implementation of the internal model principle for linear time-invariant systems that allows for rejection of unknown deterministically modeled disturbances is provided, and the global convergence and stability of the algorithm are investigated with unmodeled dynamics.
Abstract: An adaptive implementation of the internal model principle for linear time-invariant systems that allows for rejection of unknown deterministically modeled disturbances is provided. The minimal representation of the system model is used for parameter estimation, and the global convergence and stability of the algorithm are investigated with unmodeled dynamics. With proper modification of the parameter estimation algorithm, global convergence and stability for the algorithm are obtained without the requirement of persistent excitation. Some simulation results are given to support the analysis.