
Showing papers on "Bellman equation published in 1973"


Journal ArticleDOI
TL;DR: In this paper, a new approach is presented for the problem of stochastic control of nonlinear systems, which takes into account the past observations and also the future observation program.
Abstract: A new approach is presented for the problem of stochastic control of nonlinear systems. It is well known that, except for the linear-quadratic problem, the optimal stochastic controller cannot be obtained in practice. In general it is the curse of dimensionality that makes the strict application of the principle of optimality infeasible. The two subproblems of stochastic control, estimation and control proper, are, except for the linear-quadratic case, intercoupled. As pointed out by Feldbaum, in addition to its effects on the state of the system, the control also affects the estimation performance. In this paper, the control problem is formulated such that this dual property of the control appears explicitly. The resulting control sequence exhibits the closed-loop property, i.e., it takes into account the past observations and also the future observation program. Thus, in addition to being adaptive, this control also plans its future learning according to the control objective. Some preliminary simulation results illustrate these properties of the control.

172 citations
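The principle of optimality mentioned in the abstract can be sketched, for a finite-state problem, as a Bellman backup iterated to a fixed point. The sketch below is illustrative only: the two-state system, costs, and discount factor are invented for the example and are not taken from the paper. The curse of dimensionality the abstract refers to shows up here as the state enumeration, which grows exponentially with the state dimension.

```python
import numpy as np

def value_iteration(P, cost, gamma=0.9, tol=1e-8, max_iter=10_000):
    """P[a] is the transition matrix under action a; cost[a][s] the stage cost.

    Returns the optimal value function V and a greedy policy.
    """
    n_actions, n_states = len(P), P[0].shape[0]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: V(s) = min_a [ c(s,a) + gamma * E[V(s') | s, a] ]
        Q = np.array([cost[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = Q.argmin(axis=0)
    return V, policy

# Two states, two actions: action 0 is cheap but risky, action 1 safe but costly.
P = [np.array([[0.8, 0.2], [0.5, 0.5]]),
     np.array([[1.0, 0.0], [0.9, 0.1]])]
cost = [np.array([1.0, 5.0]), np.array([2.0, 2.0])]
V, policy = value_iteration(P, cost)
```

For the linear-quadratic case this fixed point has a closed form; for general nonlinear stochastic systems, as the abstract notes, no such tabulation is practical.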


Journal ArticleDOI
TL;DR: In this article, a new approach is presented to the question of which technical regularity conditions should be imposed so that they not only work, but are also flexible and general enough to cover diverse applications.
Abstract: Existence theorems are proved for basic problems of Lagrange in the calculus of variations and optimal control theory, in particular problems for arcs with both endpoints fixed. Emphasis is placed on deriving continuity and growth properties of the minimum value of the integral as a function of the endpoints of the arc and the interval of integration. Control regions are not required to be bounded. Some results are also obtained for problems of Bolza. Conjugate convex functions and duality are used extensively in the development, but the problems themselves are not assumed to be especially "convex". Constraints are incorporated by the device of allowing the Lagrangian function to be extended-real-valued. This necessitates a new approach to the question of what technical conditions of regularity should be imposed that will not only work, but will also be flexible and general enough to meet the diverse applications. One of the underlying purposes of the paper is to present an answer to this question. 1. Statement of main results. Let [a, b] be a real interval, and let L be a function on [a, b] × Rⁿ × Rⁿ with values in (−∞, +∞]. For each subinterval [t₀, t₁] ⊂ [a, b] and endpoint pair (c₀, c₁) ∈ Rⁿ × Rⁿ, we consider the problem of Lagrange in which the integral

25 citations
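The abstract is cut off where it introduces the integral of the problem of Lagrange. In the standard form of such problems (a hedged reconstruction of the generic functional, not the paper's exact display) it reads:

```latex
\text{minimize } J(x) = \int_{t_0}^{t_1} L\bigl(t, x(t), \dot{x}(t)\bigr)\,dt
\quad \text{over arcs } x : [t_0, t_1] \to \mathbb{R}^n
\text{ with } x(t_0) = c_0,\ x(t_1) = c_1,
```

with L permitted to take the value +∞, which is how the paper encodes constraints.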



Journal ArticleDOI
TL;DR: Using the fact that Poisson impulse noise tends to a Gaussian process under certain limiting conditions, a method which achieves an arbitrarily good approximate solution to the stochastic control problem is given.
Abstract: Bellman's dynamic programming equation for the optimal index and control law for stochastic control problems is a parabolic or elliptic partial differential equation frequently defined in an unbounded domain. Existing methods of solution require bounded domain approximations, the application of singular perturbation techniques or Monte Carlo simulation procedures. In this paper, using the fact that Poisson impulse noise tends to a Gaussian process under certain limiting conditions, a method which achieves an arbitrarily good approximate solution to the stochastic control problem is given. The method uses the two iterative techniques of successive approximation and quasi-linearization and is inherently more efficient than existing methods of solution.

9 citations
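Quasi-linearization, one of the two iterative techniques named in the abstract, is essentially Newton's method applied to a nonlinear differential equation: each step solves a linear problem obtained by linearizing about the current iterate. A minimal sketch on a toy boundary-value problem follows; the equation u'' = u², the grid, and the iteration count are invented for illustration and this is not the paper's stochastic-control formulation.

```python
import numpy as np

def quasilinearize(n=100, iters=20):
    """Solve u'' = u^2, u(0)=0, u(1)=1 by quasi-linearization.

    Each sweep solves the linearized BVP v'' - 2*u*v = -u^2
    (from u'' = u^2 linearized as v'' = u^2 + 2u(v - u))
    by central finite differences.
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    u = x.copy()                      # initial guess: straight line
    for _ in range(iters):
        # Discretized rows: v_{i-1} + (-2 - 2*u_i*h^2) v_i + v_{i+1} = -u_i^2 h^2
        A = np.zeros((n - 1, n - 1))
        b = -u[1:-1] ** 2 * h ** 2
        for i in range(n - 1):
            A[i, i] = -2.0 - 2.0 * u[i + 1] * h ** 2
            if i > 0:
                A[i, i - 1] = 1.0
            if i < n - 2:
                A[i, i + 1] = 1.0
        b[-1] -= 1.0                  # move boundary value u(1) = 1 to the RHS
        v = np.zeros(n + 1)
        v[-1] = 1.0
        v[1:-1] = np.linalg.solve(A, b)
        u = v
    return x, u

x, u = quasilinearize()
```

Successive approximation, the other technique named, replaces the Newton step with a simple fixed-point sweep; it trades the quadratic convergence of quasi-linearization for cheaper iterations.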


Journal ArticleDOI
TL;DR: In this paper, an optimal control problem with linear dynamics and quadratic criterion is imbedded in a family of problems characterized by both initial and terminal points, and a recursive formula for the coefficients of these functions is developed.
Abstract: An optimal control problem with linear dynamics and quadratic criterion is imbedded in a family of problems characterized by both initial and terminal points. The optimal value function is jointly quadratic in initial and terminal points, and the optimal control is jointly linear. Recursive formulas for the coefficients of these functions are developed. This generalized procedure can be used to solve several versions of the problem not solvable by the standard one-ended imbedding technique. In particular, a procedure doubling the number of stages at each iteration is given for problems with time-invariant coefficients.

3 citations
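For contrast with the paper's two-ended imbedding, the standard one-ended imbedding it generalizes can be sketched as the familiar backward Riccati recursion for the discrete-time LQ problem, in which the optimal value function is quadratic in the initial state only, V_k(x) = x'P_k x. The matrices and horizon below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lqr_backward(A, B, Q, R, Qf, N):
    """Backward Riccati recursion for x_{k+1} = A x_k + B u_k with
    cost sum_k (x'Qx + u'Ru) + terminal x'Qf x over N stages.

    Returns the initial-stage cost matrix P and the per-stage gains,
    where the optimal control is u_k = -gains[k] @ x_k.
    """
    P = Qf
    gains = []
    for _ in range(N):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()               # gains[k] applies at stage k
    return P, gains

# Scalar example: A = B = Q = R = Qf = 1.
A = np.array([[1.0]]); B = np.array([[1.0]])
Q = R = Qf = np.array([[1.0]])
P, gains = lqr_backward(A, B, Q, R, Qf, N=100)
```

Because this recursion carries only the dependence on the initial point, it cannot directly answer questions posed jointly in initial and terminal points; that is the gap the paper's jointly-quadratic value function fills.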


01 Jan 1973
TL;DR: In this article, a new approach is presented for the problem of stochastic control of nonlinear systems, which takes into account the past observations and also the future observation program.
Abstract: A new approach is presented for the problem of stochastic control of nonlinear systems. It is well known that, except for the linear-quadratic problem, the optimal stochastic controller cannot be obtained in practice. In general it is the curse of dimensionality that makes the strict application of the principle of optimality infeasible. The two subproblems of stochastic control, estimation and control proper, are, except for the linear-quadratic case, intercoupled. As pointed out by Feldbaum, in addition to its effects on the state of the system, the control also affects the estimation performance. In this paper, the control problem is formulated such that this dual property of the control appears explicitly. The resulting control sequence exhibits the closed-loop property, i.e., it takes into account the past observations and also the future observation program. Thus, in addition to being adaptive, this control also plans its future learning according to the control objective. Some preliminary simulation results illustrate these properties of the control.
I. INTRODUCTION. In many control problems the uncertainties regarding the plant and the measurements can be modeled as stochastic processes. The procedure for obtaining the optimal (closed-loop) stochastic control is application of the principle of optimality, which leads to a stochastic dynamic programming equation (1), (5). In the literature, one approximation method used for stochastic control is linearization of the plant about the deterministic optimal trajectory and application of the well-known separation theorem to the resulting perturbation equations. However, this may not give good performance if the system is very nonlinear and the noise level is high because, with the linearization approach, the control action is corrected only after it has been discovered that the trajectory has deviated from the nominal.
But, in fact, if it is known that a disturbance will occur in the future, the control should be modified before as well as after the disturbance occurs in order to minimize its effects. Therefore, if linearization is to be used, some nominal trajectory other than the deterministic optimal trajectory should be used. Some authors (10), (21), (34) have considered the problem of choosing a nominal path to minimize a certain cost criterion obtained by using second-order perturbation analysis along the nominal path. The advantage of these approaches is the simplicity of the resulting control law; the main drawback is the validity of assuming that the

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of the optimal control of the terminal state of a linear system containing random perturbations in the form of Gaussian white noise and proposed a method for the approximate solution of Bellman's equation for one class of such systems in the case when the solution of the deterministic Bellman equation has discontinuities of the first kind in its values or in the values of its derivatives.