
Showing papers on "Dynamic programming published in 1973"


Journal ArticleDOI
TL;DR: In this article, necessary and sufficient conditions for optimality are derived for systems described by stochastic differential equations with control based on partial observations, and the solution of the system is defined in a way which permits a very wide class of admissible controls.
Abstract: In this paper necessary and sufficient conditions for optimality are derived for systems described by stochastic differential equations with control based on partial observations. The solution of the system is defined in a way which permits a very wide class of admissible controls, and then Hamilton–Jacobi criteria for optimality are derived from a version of Bellman's "principle of optimality." The method of solution is based on a result of Girsanov: Wiener measure is transformed for each admissible control to the measure appropriate to a solution of the system equation. The optimality criteria are derived for three kinds of information pattern: partial observations (control based on the past of only certain components of the state), complete observations, and "Markov" observations (observation of the current state). Markov controls are shown to be minimizing in the class of those based on complete observations for system models of a suitable type. Finally, similar methods are applied to two-person zero-...

196 citations


Journal ArticleDOI
TL;DR: This paper is concerned with the application of a new method to the problem of transmission planning expressed as a large finite Markovian sequential process over time.
Abstract: This paper is concerned with the application of a new method to the problem of transmission planning expressed as a large finite Markovian sequential process over time. The method recognizes that in solving large sequential problems the complete enumeration of strategies is seldom feasible. The basic idea of Discrete Dynamic Optimizing (DDO) is to combine the deterministic search procedure of dynamic programming with a probabilistic search and a heuristic stopping criterion. The method is designed to take advantage of whatever information is known about the problem. Test results obtained with an experimental program are also given.

169 citations


Journal ArticleDOI
TL;DR: In this article, the authors present two algorithms for computing optimal lot sizes in multi-stage assembly systems with known time-varying demand, where each facility may have any number of predecessors but at most a single successor.
Abstract: A multi-stage assembly system is a special case of Veinott's general multi-facility system in that each facility may have any number of predecessors but at most a single successor. This paper presents two algorithms for computing optimal lot sizes in such systems with known time-varying demand. The first is a dynamic programming algorithm for which solution time increases exponentially with the number of time periods, but only linearly with the number of stages, irrespective of assembly structure. The second is a branch and bound algorithm intended for cases where the number of time periods is large but the structure is close to serial. Computational results are given and extensions considered.
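The single-facility building block underlying such multi-stage algorithms is the classic dynamic lot-sizing recursion. A minimal sketch of that serial case (the paper's multi-stage and branch-and-bound algorithms are not reproduced; the setup cost K, holding cost h, and demands below are invented):

```python
# Single-stage dynamic lot-sizing by dynamic programming: a toy
# illustration of the recursion that multi-stage algorithms build on.
# Setup cost K, per-unit per-period holding cost h, demands are invented.

def lot_sizes(demand, K, h):
    """Return minimum cost of meeting demand[0..T-1] with no backlogging.

    dp[t] = min cost of covering periods t..T-1, where a lot placed in
    period t covers demand for periods t..j (zero-inventory property).
    """
    T = len(demand)
    dp = [0.0] * (T + 1)          # dp[T] = 0: nothing left to cover
    for t in range(T - 1, -1, -1):
        best = float("inf")
        hold = 0.0
        for j in range(t, T):
            if j > t:
                # units for period j are held from t to j: (j - t) periods
                hold += h * (j - t) * demand[j]
            best = min(best, K + hold + dp[j + 1])
        dp[t] = best
    return dp[0]
```

With demands [10, 20], a low setup cost (K=5) makes two separate lots optimal, while a high setup cost (K=50) makes one combined lot optimal.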

148 citations


Journal ArticleDOI
TL;DR: This paper shall highlight some of the salient theoretical developments in the specific area of algorithms for optimal control aimed at unconstrained problems and derived by using first and second variation methods of the calculus of variations.
Abstract: Review of some of the salient theoretical developments in the specific area of optimal control algorithms. The first algorithms for optimal control were aimed at unconstrained problems and were derived by using first- and second-variation methods of the calculus of variations. These methods have subsequently been recognized as gradient, Newton-Raphson, or Gauss-Newton methods in function space. Much more recent additions to the arsenal of unconstrained optimal control algorithms are several variations of conjugate-gradient methods. At first, constrained optimal control problems could only be solved by exterior penalty function methods. Later, algorithms specifically designed for constrained problems appeared. Among these are methods for solving the unconstrained linear quadratic regulator problem, as well as certain constrained minimum-time and minimum-energy problems. Differential dynamic programming was developed from dynamic programming considerations. The conditional-gradient method, the gradient-projection method, and a couple of feasible-directions methods were obtained as extensions or adaptations of related algorithms for finite-dimensional problems. Finally, the so-called epsilon-methods combine the Ritz method with penalty function techniques.

136 citations


Proceedings Article
20 Aug 1973
TL;DR: A top-down and a bottom-up method are proposed for searching additive AND/OR graphs; these are, respectively, extensions of the "arrow" method proposed by Nilsson for searching AND/OR trees and of Dijkstra's algorithm for finding the shortest path.
Abstract: Additive AND/OR graphs are defined as AND/OR graphs without circuits, which can be considered as folded AND/OR trees; i.e., the cost of a common subproblem is added to the cost as many times as the subproblem occurs, but it is computed only once. Additive AND/OR graphs are naturally obtained by reinterpreting the dynamic programming method in the light of the problem-reduction approach. An example of this reduction is given. A top-down and a bottom-up method are proposed for searching additive AND/OR graphs. These methods are, respectively, extensions of the "arrow" method proposed by Nilsson for searching AND/OR trees and of Dijkstra's algorithm for finding the shortest path. A proof is given that the two methods find an optimal solution whenever a solution exists.

133 citations


Journal ArticleDOI
TL;DR: The idea of a sufficiently informative function, which parallels the notion of a sufficient statistic of stochastic optimal control, is introduced, and conditions under which the optimal controller decomposes into an estimator and an actuator are identified.
Abstract: The problem of optimal feedback control of uncertain discrete-time dynamic systems is considered where the uncertain quantities do not have a stochastic description but instead are known to belong to given sets. The problem is converted to a sequential minimax problem and dynamic programming is suggested as a general method for its solution. The notion of a sufficiently informative function, which parallels the notion of a sufficient statistic of stochastic optimal control, is introduced, and conditions under which the optimal controller decomposes into an estimator and an actuator are identified. A limited class of problems for which this decomposition simplifies the computation and implementation of the optimal controller is delineated.

105 citations


Journal ArticleDOI
TL;DR: A two-stage algorithm that obtains a sufficient partition suboptimally, either by methods suggested in the paper or developed elsewhere, and optimizes the results of the first stage through a dynamic programming approach is proposed.
Abstract: The efficient partitioning of a finite-dimensional space by a decision tree, each node of which corresponds to a comparison involving a single variable, is a problem occurring in pattern classification, piecewise-constant approximation, and in the efficient programming of decision trees. A two-stage algorithm is proposed. The first stage obtains a sufficient partition suboptimally, either by methods suggested in the paper or developed elsewhere; the second stage optimizes the results of the first stage through a dynamic programming approach. In pattern classification, the resulting decision rule yields the minimum average number of calculations to reach a decision. In approximation, arbitrary accuracy for a finite number of unique samples is possible. In programming decision trees, the expected number of computations to reach a decision is minimized.

77 citations


01 Jan 1973
TL;DR: The emphasis of this research is on the development of mathematical programming tools for the design of S/F communication networks with proper design criteria established and expressed in terms of the variables.
Abstract: The emphasis of this research is on the development of mathematical programming tools for the design of S/F communication networks. An analytical model for the system is first presented and discussed. The design variables (routing of messages, channel capacities, topology, etc.) are then defined and proper design criteria (delay, cost, throughput, etc.) are established and expressed in terms of the variables. Next, various design problems are defined and investigated; the most significant of them follow: (1) find the minimum cost channel capacity assignment, given the routing of the messages and the maximum admissible delay T; (2) find the routing which minimizes the delay, given the channel capacities (and therefore the cost); (3) find the routing and capacity assignment which minimizes the cost, given the maximum admissible delay T; (4) find the topology, routing, and capacity assignment which minimizes the cost, given the maximum admissible delay T.

68 citations


Journal ArticleDOI
TL;DR: In this paper, a method is developed to determine the optimal design of any system of reservoirs with series and parallel connections, where the original problem is decomposed by successive approximations into a series of subproblems in such a way that the sequence of optimizations over the sub-problems converges back to the original one.
Abstract: A method is developed to determine the optimal design of any system of reservoirs with series and parallel connections. The original problem is decomposed by successive approximations into a series of subproblems in such a way that the sequence of optimizations over the subproblems converges back to the original one. The optimal design is obtained by maximizing the net benefits. Incremental dynamic programming is used to find the optimal operating policy. The theories developed are applied to a real system, i.e., the Eel River Ultimate Project in northern California.

66 citations


Journal ArticleDOI
TL;DR: In this paper, a dynamic programming formulation for finding the sequencing of a finite set of expansion projects that meets a deterministic demand projection at minimum discounted cost is presented. And conditions for decomposition or direct solution of the sequencing problem are derived.
Abstract: This paper examines the problem of finding the sequencing of a finite set of expansion projects that meets a deterministic demand projection at minimum discounted cost. For the particular situation defined, the timing of the next expansion may be determined directly from knowledge of projected demand and present capacity. A dynamic programming formulation for sequencing projects is developed, and solution of an example problem demonstrates that other methods proposed for this problem are not, in general, valid. Extensions of the dynamic programming formulation to include interdependence between projects and joint selection of project scale and sequencing are indicated, and conditions for decomposition or direct solution of the sequencing problem are derived.
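A subset dynamic program of the kind the paper describes can be sketched as follows. The state is the set of projects already built; as the abstract notes, the time of the next expansion follows directly from projected demand and present capacity. The demand curve, project capacities, costs, and discount factor below are all invented.

```python
# Subset dynamic program for sequencing capacity-expansion projects.
# All projects are eventually built; only their order (and hence the
# discounting of their costs) is chosen. All numbers are invented.

from itertools import combinations

capacity = [30, 20, 50]        # added capacity of each project
build_cost = [100, 60, 150]    # undiscounted cost of each project
initial_cap = 10
discount = 0.9                 # per-period discount factor

def demand(t):
    return 10 + 8 * t          # deterministic demand projection

def next_build_time(total_cap):
    """First period in which demand exceeds installed capacity."""
    t = 0
    while demand(t) <= total_cap:
        t += 1
    return t

def min_discounted_cost():
    n = len(capacity)
    full = (1 << n) - 1
    best = {0: 0.0}
    for size in range(n):                      # states by subset size
        for combo in combinations(range(n), size):
            s = sum(1 << i for i in combo)
            if s not in best:
                continue
            cap_now = initial_cap + sum(capacity[i] for i in combo)
            t = next_build_time(cap_now)       # when the next project is due
            for i in range(n):
                if not s & (1 << i):
                    c = best[s] + build_cost[i] * discount ** t
                    ns = s | (1 << i)
                    if c < best.get(ns, float("inf")):
                        best[ns] = c
    return best[full]
```

Each transition builds one more project at the moment present capacity would be exhausted, discounting its cost to that period; the DP then minimizes total discounted cost over all orderings.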

61 citations


Journal ArticleDOI
TL;DR: New results in the theory of non-serial dynamic programming are described and their computational relevance is pointed out.

Journal ArticleDOI
TL;DR: The saving in computation and improvement in accuracy that can result from the use of this algorithm can be quite significant for chain products of large arrays and in iterative solutions of matrix equations involving chain products.
Abstract: It is pointed out that the number of scalar multiplications (additions) required to evaluate a matrix chain product depends on the sequence in which the associative law of matrix multiplication is applied. An algorithm is developed to find the optimum sequence that minimizes the number of scalar multiplications. A program is written for use on the CDC 6600 computer to implement this algorithm and also to carry out the chain product according to the optimum sequence. Several examples are included to illustrate the algorithm. The saving in computation and improvement in accuracy that can result from the use of this algorithm can be quite significant for chain products of large arrays and in iterative solutions of matrix equations involving chain products.
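The recursion behind such an algorithm is the standard matrix-chain dynamic program; a minimal sketch (the CDC 6600 implementation is not reproduced, and the dimensions in the examples are hypothetical):

```python
# Matrix-chain ordering by dynamic programming: choose the parenthesization
# of A1*A2*...*An that minimizes scalar multiplications. Matrix i has shape
# dims[i] x dims[i+1]; the example dimensions are hypothetical.

def matrix_chain_cost(dims):
    n = len(dims) - 1                      # number of matrices
    # m[i][j] = min scalar multiplications for the product A_i..A_j
    m = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):         # chain length
        for i in range(n - length + 1):
            j = i + length - 1
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return m[0][n - 1]
```

For dims [10, 100, 5, 50], multiplying the first pair first costs 7500 scalar multiplications versus 75000 for the other order, so the sequence matters by an order of magnitude.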

Journal ArticleDOI
TL;DR: The computational theory of dynamic programming is examined from the viewpoint of parallel computation and parallel aspects of various dimensionality reduction techniques such as state increment dynamic programming, successive approximations, and shift vectors are given.
Abstract: The computational theory of dynamic programming is examined from the viewpoint of parallel computation. A discussion of various forms of parallelism, the corresponding parallel algorithms, the applicability of the algorithms to different types of optimization problems, and their advantages over serial computation is presented. In addition, parallel aspects of various dimensionality reduction techniques such as state increment dynamic programming, successive approximations, and shift vectors are also given.

Journal ArticleDOI
TL;DR: A dynamic programming model with a physical equation and a stochastic recursive equation is developed to find the optimum operational policy of a single multipurpose surface reservoir.
Abstract: The main objective of this paper is to present a stochastic dynamic programming model useful in determining the optimal operating policy of a single multipurpose surface reservoir. It is the unreliability of forecasting the amount of future streamflow which makes the problem of reservoir operation a stochastic process. In this paper the stochastic nature of the streamflow is taken into account by considering the correlation between the streamflows of each pair of consecutive time intervals. This interdependence is used to calculate the probability of transition from a given state and stage to its succeeding ones. A dynamic programming model with a physical equation and a stochastic recursive equation is developed to find the optimum operational policy. For illustrative purposes, the model is applied to a real surface water reservoir system.
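The recursive structure can be sketched on a toy reservoir: storage and inflow are discretized, and inflow follows a Markov chain between consecutive periods, which is the serial correlation the paper exploits. All numbers below (states, transition matrix, benefit function, horizon) are invented.

```python
# Backward stochastic dynamic program for a toy reservoir. The state is
# (storage, current inflow); the inflow transition matrix P carries the
# correlation between consecutive periods. All numbers are invented.

STORAGE = [0, 1, 2]            # discretized storage states
INFLOW = [0, 1]                # discretized inflow states
P = [[0.7, 0.3],               # P[i][j] = Pr(next inflow j | current inflow i)
     [0.4, 0.6]]
T = 4                          # planning periods

def benefit(release):
    return float(release)      # toy benefit: proportional to water released

def optimal_value():
    # V[s][q] = max expected benefit-to-go with storage s, current inflow q
    V = [[0.0] * len(INFLOW) for _ in STORAGE]
    for _ in range(T):
        newV = [[0.0] * len(INFLOW) for _ in STORAGE]
        for s in STORAGE:
            for qi, q in enumerate(INFLOW):
                best = float("-inf")
                for r in range(0, s + q + 1):             # feasible releases
                    s_next = min(s + q - r, STORAGE[-1])  # spill if full
                    cont = sum(P[qi][qj] * V[s_next][qj]
                               for qj in range(len(INFLOW)))
                    best = max(best, benefit(r) + cont)
                newV[s][qi] = best
        V = newV
    return V
```

With this linear benefit the optimal expected value equals the total expected water available, which makes the recursion easy to check by hand.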

Journal ArticleDOI
TL;DR: Three subclasses of r-msdp (recursive monotone sequential decision process) are introduced in this paper and it turns out that some of them are solvable while the rest are unsolvable.

Journal ArticleDOI
TL;DR: Optimal control theory is no less important in economics today than welfare economics was twenty years ago, and it is used to evaluate different growth paths; and it forms a crucial part of the long run theory of the firm as discussed by the authors.
Abstract: Optimal control theory is no less important in economics today than welfare economics was twenty years ago. It is used to evaluate different growth paths; and it forms a crucial part of the long-run theory of the firm. It has been used to derive an aggregate savings function, a demand for real balances function and an optimum dividend policy for a corporation. There is no area in economics which cannot benefit from the use of optimal control theory. As a rule, economists have used the Maximum Principle of Pontryagin to derive optimal control laws. Cass [4] formulated instructions to a central planning board, which attempts to direct the growth of the economy in an optimal manner, on the basis of the maximum principle. In his expository paper, Dorfman [5] used "optimal control theory" and the maximum principle synonymously. The maximum principle is an "open loop" type of optimization method in that it yields an entire sequence of controls to be followed from initial conditions. Half of the initial conditions must be obtained from transversality conditions which imply the solution of differential equations. Given the high probability of errors in measurement, formulation and implementation, as well as the likelihood of outside disturbances, the overall system will not be stable unless the open loop optimal policies are converted to a feedback form. This is to be expected since the optimal path to the desired target is unique. It is clearly advantageous in economics to derive policies in feedback form, where the next move depends upon the current state.

Journal ArticleDOI
TL;DR: The solution of multidimensional sequencing problems encountered in the planning of capacity expansion in large-scale water resources systems by an especially efficient dynamic programming algorithm is presented.
Abstract: The solution of multidimensional sequencing problems encountered in the planning of capacity expansion in large-scale water resources systems by an especially efficient dynamic programming algorithm is presented. The algorithm couples results on the optimality of permutation schedules with the imbedded state space concept in order to effect a drastic reduction of dimensionality, making it possible to efficiently solve problems of the dimension encountered in real-world water resource systems. Computational experience with the imbedded state space dynamic programming algorithm on the solution of a number of two- and three-dimensional sequencing problems from several real-world water resources systems is reported. Furthermore, the results of a sensitivity analysis on these data are also presented.

Journal ArticleDOI
TL;DR: The recent development of parallel processing algorithms for solving optimal control problems for nonlinear dynamic systems is described; both the deterministic and stochastic cases are considered.
Abstract: This paper describes the recent development of parallel processing algorithms for solving optimal control problems for nonlinear dynamic systems. Both the deterministic and stochastic cases are considered. The resulting algorithms are applicable to a large range of parallel architecture computers, including Illiac IV, associative processors, and potential designs based on integrated circuit technology.

Journal ArticleDOI
TL;DR: A dynamic programming solution is presented which decomposes the problem of scheduling jobs on M-parallel processors into a sequencing problem within an allocation problem and the computation required for solution is found to depend on the sequencing problem as it is affected by the waiting cost function.
Abstract: The problem of scheduling jobs on M-parallel processors is one of selecting a set of jobs to be processed from a set of available jobs in order to maximize profit. This problem is examined and a dynamic programming solution is presented which decomposes it into a sequencing problem within an allocation problem. The computation required for solution is found to depend on the sequencing problem as it is affected by the waiting cost function. Various forms of the waiting cost function are considered. The solution procedure is illustrated by an example, and possible extensions of the formulation are discussed.
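The decomposition can be sketched for a single processor (the paper's M-processor setting is richer): with a linear waiting cost, the sequencing subproblem for any selected set is solved by the weighted-shortest-processing-time rule, and job selection then becomes a dynamic program over elapsed time. The job data below are invented.

```python
# One-processor sketch of the sequencing-within-selection decomposition.
# A selected job finishing at time C contributes profit - rate * C; any
# selected set is optimally sequenced by the WSPT rule (rate/time
# decreasing), so selection reduces to a DP over elapsed time.

def max_profit(jobs):
    """jobs: list of (profit, processing_time, waiting_cost_rate)."""
    # WSPT order: consistent across subsets, so sort once globally
    jobs = sorted(jobs, key=lambda j: j[2] / j[1], reverse=True)
    horizon = sum(p for _, p, _ in jobs)
    NEG = float("-inf")
    dp = [NEG] * (horizon + 1)    # dp[t] = best profit using total time t
    dp[0] = 0.0
    for v, p, w in jobs:
        for t in range(horizon - p, -1, -1):   # descending: each job once
            if dp[t] > NEG:
                gain = dp[t] + v - w * (t + p)
                if gain > dp[t + p]:
                    dp[t + p] = gain
    return max(max(dp), 0.0)       # selecting nothing is always feasible
```

The waiting cost function shapes the computation exactly as the abstract says: a linear cost admits the WSPT ordering that makes this one-pass DP valid.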

Journal ArticleDOI
TL;DR: The use of the imbedded state space approach in the determination of optimal solutions in the two cases is discussed and illustrated by the solution of the counterexample as discussed by the authors.
Abstract: Two cases in which a conventional dynamic programming algorithm may produce non-optimal solutions to an important class of sequencing problems encountered in water resource development are identified. The pathological behavior of the conventional dynamic programming algorithm is analyzed and illustrated with a counterexample. The use of the imbedded state space approach in the determination of optimal solutions in the two cases is discussed and illustrated by the solution of the counterexample. The extension of the imbedded state approach to the solution of more general combined selection and sequencing problems is also discussed.

Journal ArticleDOI
TL;DR: It is shown that the stack algorithm introduced by Zigangirov and by Jelinek is essentially equivalent to the Fano algorithm with regard to the set of nodes examined and the path selected, although the description, implementation, and action of the two algorithms are quite different.
Abstract: Sequential decoding procedures are studied in the context of selecting a path through a tree. Several algorithms are considered and their properties compared. It is shown that the stack algorithm introduced by Zigangirov and by Jelinek is essentially equivalent to the Fano algorithm with regard to the set of nodes examined and the path selected, although the description, implementation, and action of the two algorithms are quite different. A modified Fano algorithm is introduced, in which the quantizing parameter Δ is eliminated. It can be inferred from limited simulation results that, at least in some applications, the new algorithm is computationally inferior to the old; however, it is of some theoretical interest since the conventional Fano algorithm may be considered to be a quantized version of it.

Journal ArticleDOI
TL;DR: The problem of selecting which journals to acquire in order to best satisfy library objectives is examined and modeled as a zero-one linear programming problem using an objective function based on expected usage as a measure of journal worth and on cost constraints which account for the scarcity of capital.
Abstract: The problem of selecting which journals to acquire in order to best satisfy library objectives is examined and modeled as a zero-one linear programming problem. This is done using an objective function based on expected usage as a measure of journal worth and on cost constraints which account for the scarcity of capital. A dynamic programming algorithm is used to break down the larger problem into smaller sub-problems and to generate a feasible solution. Special cases where the solution is optimal are presented and discussed in terms of their implications for the library. An example problem is presented to illustrate the algorithm.
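With a single budget constraint, the selection model reduces to a 0/1 knapsack solvable by dynamic programming over the budget. A sketch (the usage figures, prices, and budget are invented; the paper's multi-constraint formulation is not reproduced):

```python
# Journal selection as a 0/1 knapsack: maximize expected usage subject to
# a subscription budget, by dynamic programming over the budget. The
# usage figures, prices, and budget are invented for illustration.

def select_journals(usage, price, budget):
    """Return (best total usage, sorted list of chosen journal indices)."""
    n = len(usage)
    dp = [0] * (budget + 1)        # dp[b] = best usage within budget b
    keep = [[False] * (budget + 1) for _ in range(n)]
    for i in range(n):
        for b in range(budget, price[i] - 1, -1):   # descending: item once
            cand = dp[b - price[i]] + usage[i]
            if cand > dp[b]:
                dp[b] = cand
                keep[i][b] = True   # journal i improved the budget-b optimum
    # trace back the chosen set
    chosen, b = [], budget
    for i in range(n - 1, -1, -1):
        if keep[i][b]:
            chosen.append(i)
            b -= price[i]
    return dp[budget], sorted(chosen)
```

For example, with usages [40, 30, 20], prices [3, 2, 2], and budget 4, the two cheaper journals together beat the single high-usage one.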

Journal ArticleDOI
TL;DR: In this paper, a new approach to the numerical solution of optimal control problems with state-variable inequality constraints is presented, where the concept of constraining hyperplanes may be used to approximate the original problem with a problem where the constraints are of a mixed state-control variable type.
Abstract: A new approach to the numerical solution of optimal control problems with state-variable inequality constraints is presented. It is shown that the concept of constraining hyperplanes may be used to approximate the original problem with a problem where the constraints are of a mixed state-control variable type. The efficiency and the accuracy of the combination of constraining hyperplanes and a second-order differential dynamic programming algorithm are investigated on problems of different complexity, and comparisons are made with the slack-variable and the penalty-function techniques.

Journal ArticleDOI
TL;DR: An extension to evaluate the handling of separately defined objectives is suggested, and questions concerning the stability of a solution with respect to minor changes in the algorithms identified are asked.
Abstract: The improvement of the process of design requires the evaluation of alternative design methods. A framework for evaluating graphically based, intuitive design methods is (1) solve a set of problems with a proven algorithm, (2) have a sample of designers solve the same set of problems, and (3) compare the results to identify the success of each designer or design method. A prototypical experiment using the problem of corridor selection is used to illustrate the approach, with solutions obtained by twelve students being compared to solutions obtained by discrete dynamic programming for a set of three problems. An important distinction is drawn between measuring the success of the solution vs. the success of the problem solving procedure. An extension to evaluate the handling of separately defined objectives is suggested, and questions concerning the stability of a solution with respect to minor changes in the algorithms are identified.

Journal ArticleDOI
01 Jan 1973-Networks
TL;DR: If, after a shortest route is determined, the costs on all arcs incident into or out of a node are modified in any form, at most O(m^2) elementary calculations will determine a new optimal solution, leading to upper bounds of O(Km^3) in both cases.
Abstract: The problem of finding a shortest route in a network with unrestricted costs is approached through solving an assignment problem associated to the network. The upper bound on the number of elementary calculations required for the solution is O(m^3). However, in most cases, the actual number of computations is considerably less and depends on different network characteristics than Dynamic Programming algorithms do. In examples of networks generated stochastically, this number was below O(m^2.5). A parametric analysis is presented. It is shown that if, after a shortest route is determined, the costs on all arcs incident into or out of a node are modified in any form, at most O(m^2) elementary calculations will determine a new optimal solution. This feature, shared by Dynamic Programming algorithms only for cases where all costs decrease, can be applied to problems such as the determination of the K-shortest routes and the K-smallest assignments, leading to upper bounds of O(Km^3) in both cases.
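For contrast, the dynamic programming baseline the paper compares against, for networks with unrestricted (possibly negative) arc costs, is the Bellman-Ford recursion; a sketch with an invented graph (the assignment-based method itself is not reproduced):

```python
# Bellman-Ford: shortest routes by dynamic programming over path length,
# valid for negative arc costs as long as no negative circuit exists.
# The example graph is invented.

def bellman_ford(n, arcs, source):
    """arcs: list of (u, v, cost). Returns the distance list from source,
    or None if a negative circuit is reachable."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):              # DP over allowed path length
        for u, v, c in arcs:
            if dist[u] + c < dist[v]:
                dist[v] = dist[u] + c
    for u, v, c in arcs:                # extra pass detects negative circuits
        if dist[u] + c < dist[v]:
            return None
    return dist

arcs = [(0, 1, 4), (0, 2, 2), (2, 1, -3), (1, 3, 1)]
# distances from node 0 are [0, -1, 2, 0]: the negative arc makes the
# detour through node 2 cheaper than the direct arc to node 1
```
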


Journal ArticleDOI
01 Jan 1973
TL;DR: A trajectory optimization technique for multidimensional nonlinear processes is presented and accounts for inequality constraints on state variables in a straightforward manner and is applied to solve a number of trajectory optimization problems.
Abstract: A trajectory optimization technique for multidimensional nonlinear processes is presented. Problems which are cast in a discrete-time mold are considered. The method is based on dynamic programming and employs a combination of the technique of functional approximation and the method of region-limiting strategies. The cost function at each stage is approximated by a quadratic polynomial in a region which is restricted to be of a size judged appropriate to reduce the error in the approximation. Minimal costs are evaluated at a set of points, called base points. A new control trajectory and an improved state trajectory are then generated within an extrapolation region. The iterative application of this procedure yields an optimal trajectory. Contained in the algorithm is a simple procedure which eliminates matrix inversion to determine the coefficients of the approximating polynomial. The present algorithm is applicable to problems with one bounded control action. It accounts for inequality constraints on state variables in a straightforward manner. The algorithm is applied to solve a number of trajectory optimization problems.

Journal ArticleDOI
TL;DR: In this article, discrete dynamic optimization (DDO) is used to maximize the utility of the transmission network over the period of the planning horizon ensuing adequate service, and the results show the effects of economy of scale as the plan horizon is increased.
Abstract: Power system transmission planning is expressed as a large finite Markovian sequential process over time involving 1) a known planning horizon divided into a finite number of stages; 2) a large number of alternative additions (type, size, and place of new facilities) at each stage; 3) analysis and criteria for evaluating network acceptance (performance, reliability, security, cost, etc.) for each alternative at each stage; 4) a searching method to find the optimum plan. Forward dynamic programming is used to maximize the utility of the transmission network over the period of the planning horizon while ensuring adequate service, and the results show the effects of economy of scale as the planning horizon is increased. However, dynamic programming is limited by the number of alternatives considered at each stage. Thus a new method, discrete dynamic optimizing (DDO), is also introduced.

Journal ArticleDOI
TL;DR: In this article, a stochastic control problem over an infinite horizon which involves a linear system and a convex cost functional is analyzed, and the convergence of the dynamic programming algorithm associated with the problem is proved.
Abstract: A stochastic control problem over an infinite horizon which involves a linear system and a convex cost functional is analyzed. We prove the convergence of the dynamic programming algorithm associated with the problem, and we show the existence of a stationary Borel measurable optimal control law. The approach used illustrates how results on infinite time reachability [1] can be used for the analysis of dynamic programming algorithms over an infinite horizon subject to state constraints.
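The convergence phenomenon can be illustrated on a discounted finite analogue: repeated application of the dynamic programming operator converges to the optimal cost function. The two-state, two-action MDP and discount factor below are invented; the paper's setting (linear system, convex cost, state constraints) is richer, but the fixed-point mechanism is the same.

```python
# Value iteration on a small discounted MDP: the dynamic programming
# operator is a contraction, so iterating it converges to the optimal
# cost-to-go. The MDP data are invented.

def value_iteration(P, cost, gamma, tol=1e-10):
    """P[a][s][t]: transition probabilities; cost[a][s]: stage cost.
    Minimizes expected discounted cost; returns the converged values."""
    n = len(cost[0])
    V = [0.0] * n
    while True:
        newV = [min(cost[a][s] + gamma * sum(P[a][s][t] * V[t]
                                             for t in range(n))
                    for a in range(len(cost)))
                for s in range(n)]
        if max(abs(newV[s] - V[s]) for s in range(n)) < tol:
            return newV
        V = newV

# action 0 stays put, action 1 jumps to the other state
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
cost = [[1.0, 2.0], [3.0, 0.0]]    # cost[a][s]
V = value_iteration(P, cost, 0.5)  # converges to [2.0, 1.0]
```

The limit [2.0, 1.0] satisfies Bellman's equation exactly: in state 1 it is cheapest to jump (cost 0) and in state 0 to stay (cost 1), and each value equals its stage cost plus the discounted continuation.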

Journal ArticleDOI
Takashi Yahagi1
TL;DR: The optimal control of output feedback systems for a quadratic performance index is presented by using a new parameter optimization technique and the proposed method for optimal output feedback control is also applied to sampled-data systems.
Abstract: For a linear control system with quadratic performance index the optimal control takes a feedback form of all state variables. However, if there are some states which are not fed in the control system, it is impossible to obtain the optimal feedback control by using the usual mathematical optimization technique such as dynamic programming or the maximum principle. This paper presents the optimal control of output feedback systems for a quadratic performance index by using a new parameter optimization technique. Since the optimal feedback gains depend on the initial states in the output feedback control system, two cases where (1) the initial states are known, and (2) the statistical properties of initial states such as mean and covariance matrices are known, are considered here. Furthermore, the proposed method for optimal output feedback control is also applied to sampled-data systems.
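The full-state-feedback baseline that the paper contrasts with can be sketched by dynamic programming: for a scalar linear system x' = a*x + b*u with stage cost q*x^2 + r*u^2, the backward recursion is the discrete Riccati equation, and the optimal control is a state-feedback gain. The scalar numbers below are invented; the steady-state gain for a = b = q = r = 1 is (sqrt(5)-1)/2.

```python
# Discrete-time scalar LQR by backward dynamic programming: the Riccati
# recursion yields the optimal feedback gains u_t = -k_t * x_t.
# Scalar system and weights are invented for illustration.

def lqr_gains(a, b, q, r, horizon):
    p = q                                       # terminal weight P_N = q
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)       # optimal gain at this stage
        p = q + a * p * a - (a * p * b) ** 2 / (r + b * p * b)
        gains.append(k)
    return gains[::-1]                          # ordered t = 0 .. N-1

g = lqr_gains(1.0, 1.0, 1.0, 1.0, 50)
# g[-1] is the last-stage gain 0.5; g[0] approaches the steady-state
# gain (sqrt(5) - 1) / 2, i.e. about 0.618
```

This is the classic full-state solution; the paper's point is precisely that when some states are not fed back, this recursion no longer applies and a parameter optimization is needed instead.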