
Showing papers on "Dynamic programming" published in 1993


Journal ArticleDOI
TL;DR: In this article, a genetics-based algorithm is proposed to solve an economic dispatch problem for valve point discontinuities; it uses payoff information of candidate solutions to evaluate their optimality.
Abstract: A genetics-based algorithm is proposed to solve an economic dispatch problem for valve point discontinuities. The algorithm utilizes payoff information of candidate solutions to evaluate their optimality. Thus, the constraints of classical Lagrangian techniques on unit curves are circumvented. The formulations of an economic dispatch computer program using genetic algorithms are presented and the program's performances using two different encoding techniques are compared. The results are verified for a sample problem using a dynamic programming technique.

1,224 citations


Journal ArticleDOI
01 Aug 1993
TL;DR: A rigorous proof of convergence of DP-based learning algorithms is provided by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem, which establishes a general class of convergent algorithms to which both TD(λ) and Q-learning belong.
Abstract: Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(λ) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(λ) and Q-learning belong.

936 citations
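
To make the Q-learning update mentioned in this abstract concrete, here is a minimal tabular sketch. It is not taken from the paper: the four-state chain environment, the uniform random behaviour policy, and the step size `visits ** -0.7` are all illustrative assumptions. The step size was chosen only because it satisfies the usual stochastic approximation conditions (the step sizes sum to infinity while their squares have a finite sum).

```python
import random


def q_learning_chain(episodes=2000, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical 4-state chain (states 0..3).

    Action 0 moves left (state 0 stays put), action 1 moves right.
    Entering state 3 yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    n_states, goal = 4, 3
    q = [[0.0, 0.0] for _ in range(n_states)]
    visits = [[0, 0] for _ in range(n_states)]

    for _ in range(episodes):
        s = rng.randrange(goal)  # start in a random non-terminal state
        while s != goal:
            a = rng.randrange(2)  # off-policy: explore uniformly at random
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Decaying step size (assumed schedule, see lead-in).
            visits[s][a] += 1
            alpha = visits[s][a] ** -0.7

            # Q-learning update: move Q(s, a) toward the sampled Bellman target.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, row in enumerate(q_learning_chain()):
        print(state, [round(v, 3) for v in row])
```

For this toy chain the exact dynamic programming solution is easy to compute by hand (for example, the value of moving right from state 1 is 0.9), so the learned table can be compared against it.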


Journal ArticleDOI
01 Nov 1993
TL;DR: An economic dispatch algorithm for the determination of the global or near global optimum dispatch solution is developed based on the simulated annealing technique and transmission losses are first discounted and incorporated in the algorithm through the use of the B-matrix loss formula.
Abstract: This paper develops an economic dispatch algorithm for the determination of the global or near global optimum dispatch solution. The algorithm is based on the simulated annealing technique. In the algorithm, the load balance constraint and the operating limit constraints of the generators are fully accounted for. In the development of the algorithm, transmission losses are first discounted and they are subsequently incorporated in the algorithm through the use of the B-matrix loss formula. The algorithm is demonstrated by its application to a test system. The results determined by the new algorithm are compared to those found by dynamic programming with a zoom feature.

414 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of individual route guidance in an Intelligent Vehicle-Highway Systems (IVHS) environment, based on time-dependent forecasts of link travel time.

255 citations


Journal ArticleDOI
TL;DR: A new decomposition method for multistage stochastic linear programming problems is proposed and it is shown that for large problems the authors can obtain substantial gains in efficiency with moderate numbers of processors.
Abstract: A new decomposition method for multistage stochastic linear programming problems is proposed. A multistage stochastic problem is represented in a tree-like form and with each node of the decision tree a certain linear or quadratic subproblem is associated. The subproblems generate proposals for their successors and some backward information for their predecessors. The subproblems can be solved in parallel and exchange information in an asynchronous way through special buffers. After a finite time the method either finds an optimal solution to the problem or discovers its inconsistency. An analytical illustrative example shows that parallelization can speed up computation over every sequential method. Computational experiments indicate that for large problems we can obtain substantial gains in efficiency with moderate numbers of processors.

169 citations


Journal ArticleDOI
TL;DR: This paper presents an analysis of the fundamental case in which flights from many origins must be scheduled for arrival at a single, congested airport and describes a set of approaches for addressing a deterministic and a stochastic version of the problem.
Abstract: One of the most important functions of air traffic management systems is the assignment of ground-holding times to flights, i.e., the determination of whether and by how much the take-off of a particular aircraft headed for a congested part of the ATC system should be postponed to reduce the likelihood and extent of airborne delays. In this paper, we will present an analysis of the fundamental case in which flights from many origins must be scheduled for arrival at a single, congested airport. We will describe a set of approaches for addressing a deterministic and a stochastic version of the problem. A minimum cost flow algorithm can be used for the deterministic problem. Under a particular natural assumption regarding the functional form of delay costs, a very efficient, simple algorithm is also available. For the stochastic version, an exact dynamic programming formulation turns out to be impractical for typical instances of the problem and we present a number of heuristic approaches to it. The models a...

158 citations


01 Jan 1993
TL;DR: An algorithm is developed that solves the constant capacities economic lot-sizing problem with concave production costs and linear holding costs in $O(T^3)$ time and improves upon the running time of an earlier algorithm.
Abstract: We develop an algorithm that solves the constant capacities economic lot-sizing problem with concave production costs and linear holding costs in $O(T^3)$ time. The algorithm is based on the standard dynamic programming approach which requires the computation of the minimal costs for all possible subplans of the production plan. Instead of computing these costs in a straightforward manner, we use structural properties of optimal subplans to arrive at a more efficient implementation. Our algorithm improves upon the $O(T^4)$ running time of an earlier algorithm.

151 citations


Journal ArticleDOI
TL;DR: In this paper, a method for scheduling hydrothermal power systems based on the Lagrangian relaxation technique is presented, which is decomposed and converted into a two-level optimization problem.
Abstract: The authors present a method for scheduling hydrothermal power systems based on the Lagrangian relaxation technique. By using Lagrange multipliers to relax system-wide demand and reserve requirements, the problem is decomposed and converted into a two-level optimization problem. Given the sets of Lagrange multipliers, a hydro unit subproblem is solved by a merit order allocation method, and a thermal unit subproblem is solved by using dynamic programming without discretizing generation levels. A subgradient algorithm is used to update the Lagrange multipliers. Numerical results based on Northeast Utilities data show that this algorithm is efficient, and near-optimal solutions are obtained. Compared with previous work where thermal units were scheduled by using the Lagrangian relaxation technique and hydro units by heuristics, the new coordinated hydro and thermal scheduling generates lower total costs and requires less computation time.

147 citations


Book
01 Aug 1993
TL;DR: Motivation; Existence of solutions; Variational principles; The geometry of nonsmooth analysis; Subgradient calculus; Necessary conditions in dynamic optimization; Dynamic programming.
Abstract: Motivation; Existence of solutions (chapter 2); Variational principles; The geometry of nonsmooth analysis (chapter 4); Subgradient calculus; Necessary conditions in dynamic optimization; Dynamic programming.

142 citations


Journal ArticleDOI
TL;DR: Four algorithms, A-D, were developed to align two groups of biological sequences, which are designed to evaluate the cost for a deletion/insertion more accurately when internal gaps are present in either or both groups of sequences.
Abstract: Four algorithms, A-D, were developed to align two groups of biological sequences. Algorithm A is equivalent to the conventional dynamic programming method widely used for aligning ordinary sequences, whereas algorithms B-D are designed to evaluate the cost for a deletion/insertion more accurately when internal gaps are present in either or both groups of sequences. Rigorous optimization of the 'sum of pairs' (SP) score is achieved by algorithm D, whose average performance is close to $O(MNL^2)$, where M and N are numbers of sequences included in the two groups and L is the mean length of the sequences. Algorithm B uses some approximations to cope with profile-based operations, whereas algorithm C is a simpler variant of algorithm D. These group-to-group alignment algorithms were applied to multiple sequence alignment with two iterative strategies: a progressive method based on a given binary tree and a randomized grouping-realignment method. The advantages and disadvantages of the four algorithms are discussed on the basis of the results of examinations of several protein families.

142 citations
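
As a point of reference for the "conventional dynamic programming method" that this abstract uses as its baseline, the following is a minimal sketch of global pairwise alignment scoring with a linear gap penalty. It does not implement the group-to-group algorithms A-D; the function name and the scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of strings a and b.

    Uses the classic O(len(a) * len(b)) recurrence with a linear gap
    penalty, keeping only two rows of the DP matrix in memory.
    """
    n = len(b)
    # Row 0: aligning the empty prefix of a against prefixes of b.
    prev = [j * gap for j in range(n + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * n
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # a[i-1] against a gap
                cur[j - 1] + gap,   # b[j-1] against a gap
            )
        prev = cur
    return prev[n]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

With these assumed scores, aligning `ACGT` against `AGT` costs one gap and gains three matches.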


Journal ArticleDOI
TL;DR: In this paper, a stochastic optimal switching and impulse control problem in a finite horizon is studied, and the continuity of the value function, which is by no means trivial, is proved.
Abstract: A stochastic optimal switching and impulse control problem in a finite horizon is studied. The continuity of the value function, which is by no means trivial, is proved. The Bellman dynamic programming principle is shown to be valid for such a problem. Moreover, the value function is characterized as the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation.

Journal ArticleDOI
TL;DR: The canonical nonlinear programming circuit is shown to be a gradient system that seeks to minimize an unconstrained energy function that can be viewed as a penalty method approximation of the original problem.
Abstract: Deals with the use of neural networks to solve linear and nonlinear programming problems. The dynamics of these networks are analyzed. In particular, the dynamics of the canonical nonlinear programming circuit are analyzed. The circuit is shown to be a gradient system that seeks to minimize an unconstrained energy function that can be viewed as a penalty method approximation of the original problem. Next, the implementations that correspond to the dynamical canonical nonlinear programming circuit are examined. It is shown that the energy function that the system seeks to minimize is different than that of the canonical circuit, due to the saturation limits of op-amps in the circuit. It is also noted that this difference can cause the circuit to converge to a different state than the dynamical canonical circuit. To remedy this problem, a new circuit implementation is proposed.

Journal ArticleDOI
TL;DR: This work examines shortest path problems in acyclic networks in which arc costs are known functions of certain environment variables at network nodes and develops two recursive procedures for the individual arc case and a dynamic programming procedure that solves the corresponding problem.
Abstract: We examine shortest path problems in acyclic networks in which arc costs are known functions of certain environment variables at network nodes. Each of these variables evolves according to an independent Markov process. The vehicle can wait at a node (at a cost) in anticipation of more favorable arc costs. We first develop two recursive procedures for the individual arc case, one based on successive approximations, and the other on policy iteration. We also solve the same problem via parametric linear programming. We show that the optimal policy essentially classifies the state of the environment variable at a node into two categories: green states for which the optimal action is to immediately traverse the arc, and red states for which the optimal action is to wait. We then extend these concepts for the entire network by developing a dynamic programming procedure that solves the corresponding problem. The complexity of this method is shown to be $O(n^2K + nK^3)$, where n is the number of network nodes and K ...

Journal ArticleDOI
TL;DR: A heuristic approach for handling the delivery dispatching problem is adopted, based in part on a decomposition of the problem by customer, where customer subproblems generate penalty functions that are applied in a master dispatching problem.
Abstract: We describe a dynamic and stochastic vehicle dispatching problem called the delivery dispatching problem. This problem is modeled as a Markov decision process. Because exact solution of this model is impractical, we adopt a heuristic approach for handling the problem. The heuristic is based in part on a decomposition of the problem by customer, where customer subproblems generate penalty functions that are applied in a master dispatching problem. We describe how to compute bounds on the algorithm's performance, and apply it to several examples with good results.

Journal ArticleDOI
TL;DR: In this paper, a dynamic facilities layout problem is modelled as a modified quadratic assignment problem, where the objective is to minimize total costs: the flow costs over a series of discrete time periods plus the rearrangement costs of changing layouts between time periods.
Abstract: In a dynamic facilities layout problem, the objective is to minimize total costs: the flow costs over a series of discrete time periods plus the rearrangement costs of changing layouts between time periods. By assuming unit department sizes, the problem is modelled as a modified quadratic assignment problem. Five algorithms are modified to include the dynamic aspects. A cutting plane algorithm found the best solutions to a series of realistic test problems, outperforming exchange, branch and bound, dynamic programming and cut tree algorithms. It was able to solve a 30-location 5-time-period problem in 200 CPU seconds.

Proceedings Article
29 Nov 1993
TL;DR: This work uses second order local trajectory optimization to generate locally optimal plans and local models of the value function and its derivatives, and maintains global consistency of the local models of the value function, guaranteeing that the locally optimal plans are actually globally optimal.
Abstract: Dynamic programming provides a methodology to develop planners and controllers for nonlinear systems. However, general dynamic programming is computationally intractable. We have developed procedures that allow more complex planning and control problems to be solved. We use second order local trajectory optimization to generate locally optimal plans and local models of the value function and its derivatives. We maintain global consistency of the local models of the value function, guaranteeing that our locally optimal plans are actually globally optimal, up to the resolution of our search procedures.

Journal ArticleDOI
TL;DR: The Genetic Algorithm techniques are shown to be very effective search procedures for the class of network optimization problem investigated.
Abstract: Two alternative Genetic Algorithm methods for the optimal selection of the layout and connectivity of a dendritic pipe network are presented and compared. Both methods assume that the layout is selected from a directed base graph defining all feasible arcs. The first method uses a conventional binary string to represent the network layout, with the second method using a more efficient integer representation. Comparison with an exact Dynamic Programming formulation is made. The Genetic Algorithm techniques are shown to be very effective search procedures for the class of network optimization problem investigated.

Proceedings ArticleDOI
02 May 1993
TL;DR: An approach to the path planning of redundant manipulators is presented, posed as a finite-time nonlinear control problem which can be solved by a Newton-Raphson type algorithm.
Abstract: This paper presents a new approach to path planning for redundant manipulators. The path planning problem is posed as a finite time nonlinear control problem which can be solved by a Newton-Raphson type algorithm together with an exterior penalty function method. This technique is capable of handling various goal task definitions as well as incorporating both joint and task space constraints. The algorithm has shown promising results in planning joint path sequences to meet Cartesian goal planning and path following. In contrast to local approaches, this algorithm is less prone to problems such as singularities and local minima. Applications to planar 3R and 4R arms, cooperating 3R arms and a spatial 9 DOF arm are included.

Proceedings Article
24 Aug 1993
TL;DR: Experimental results show that, while dynamic programming produces the best plans, simple heuristics often do nearly as well, and the advantages of bushy execution trees over more restricted tree shapes are highlighted.
Abstract: This paper looks at the problem of multi-join query optimization for symmetric multiprocessors. Optimization algorithms based on dynamic programming and greedy heuristics are described that, unlike traditional methods, include memory resources and pipelining in their cost model. An analytical model is presented and used to compare the quality of plans produced by each optimization algorithm. Experimental results show that, while dynamic programming produces the best plans, simple heuristics often do nearly as well. The same results are also used to highlight the advantages of bushy execution trees over more restricted tree shapes.


Journal ArticleDOI
TL;DR: In this paper, the authors present optimization models for waste load allocation from multiple point sources which include both parameter (Type II) and model (Type I) uncertainty, and explore the effects of Type I uncertainty on control decisions.
Abstract: This paper presents optimization models for waste load allocation from multiple point sources which include both parameter (Type II) and model (Type I) uncertainty. These optimization models employ more sophisticated water quality simulation models, for example, in the case of dissolved oxygen modeling, QUAL2E and WASP4, than is typically the norm in studies on the optimization of waste load allocation. Variability in selected input parameters to the water quality simulation models gives rise to stochastic dynamic programming approaches. Two types of reliability and feasibility attributes are highlighted, associated with the management options that are generated. Several dissolved oxygen simulation models are incorporated into the optimization procedures to explore the effects of Type I uncertainty on control decisions. Information from simultaneous consideration of multiple simulation models is aggregated in the dynamic programming framework through two regret-based formulations. By accommodating both model and parameter uncertainty in the modeling framework, trade-offs can be generated between the two so as to assess their influence on control decisions. The models are applied to a waste load allocation problem for the Schuylkill River in Pennsylvania.

Journal ArticleDOI
TL;DR: In this paper, a hybrid preference order dynamic programming/branch-and-bound algorithm was proposed to solve the stochastic linear knapsack problem, in which costs are known with certainty but returns are independent, normally distributed random variables.
Abstract: We consider the stochastic linear knapsack problem in which costs are known with certainty but returns are independent, normally distributed random variables. The objective is to maximize the probability that the overall return equals or exceeds a specified target value. A previously proposed preference order dynamic programming-based algorithm has been shown to be potentially suboptimal. We offer an alternative hybrid DP/branch-and-bound algorithm that both guarantees optimality and significantly outperforms generating the set of Pareto optimal returns. © 1993 John Wiley & Sons, Inc.

Journal ArticleDOI
Rein Luus
TL;DR: Through the use of the penalty function approach, iterative dynamic programming is used to solve a fed-batch reactor optimization problem where the system is described by four ordinary differential equations and is subjected to an algebraic inequality constraint.

Proceedings ArticleDOI
TL;DR: An improved method based on energy minimizing active contours or `snakes' is used to outline objects on images in an interactive environment using a two-stage algorithm and gives the user the possibility of incremental contour tracking, thus providing feedback on the refinement process.
Abstract: The purpose of our work is to outline objects on images in an interactive environment. We use an improved method based on energy minimizing active contours or `snakes.' Kass et al. proposed a variational technique; Amini used dynamic programming; and Williams and Shah introduced a fast, greedy algorithm. We combine the advantages of the latter two methods in a two-stage algorithm. The first stage is a greedy procedure that provides fast initial convergence. It is enhanced with a cost term that extends over a large number of points to avoid oscillations. The second stage, when accuracy becomes important, uses dynamic programming. This step is accelerated by the use of alternating search neighborhoods and by dropping stable points from the iterations. We have also added several features for user interaction. First, the user can define points of high confidence. Mathematically, this results in an extra cost term and, in that way, the robustness in difficult areas (e.g., noisy edges, sharp corners) is improved. We also give the user the possibility of incremental contour tracking, thus providing feedback on the refinement process. The algorithm has been tested on numerous photographic clip art images and extensive tests on medical images are in progress. © (1993) COPYRIGHT SPIE--The International Society for Optical Engineering.

Proceedings ArticleDOI
Arun N. Swami, Balakrishna R. Iyer
19 Apr 1993
TL;DR: The AB algorithm, which combines randomization and neighborhood search with the IK-KBZ algorithm, is presented, which is much more generally applicable, has polynomial time and space complexity, and produces near optimal plans in the space of outer linear join trees.
Abstract: The dynamic programming algorithm for query optimization has exponential complexity. An alternative polynomial time algorithm, the IK-KBZ algorithm, is severely limited in the queries it can optimize. Other algorithms have been proposed, including the greedy algorithm, iterative improvement, and simulated annealing. The AB algorithm, which combines randomization and neighborhood search with the IK-KBZ algorithm, is presented. The AB algorithm is much more generally applicable than IK-KBZ, has polynomial time and space complexity, and produces near optimal plans in the space of outer linear join trees. On average, it does better than the other algorithms that do not do an exhaustive search like dynamic programming.
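
To illustrate why the dynamic programming approach to join ordering is exponential, here is a minimal sketch of a subset-based DP over left-deep join orders. It is not the AB or IK-KBZ algorithm, and the cost model is a hypothetical one chosen for brevity: the cost of a plan is the sum of its intermediate result sizes, and a result size is the product of the base cardinalities and of the selectivities of all join predicates inside the subset. The names `result_size`, `best_left_deep_order`, `card`, and `sel` are assumptions for this example.

```python
from itertools import combinations


def result_size(subset, card, sel):
    """Estimated size of joining all relations in subset (toy model)."""
    size = 1.0
    for r in subset:
        size *= card[r]
    for (x, y), s in sel.items():
        if x in subset and y in subset:
            size *= s
    return size


def best_left_deep_order(card, sel):
    """Return (cost, order) of the cheapest left-deep join order.

    The table `best` holds one entry per non-empty subset of relations,
    which is the source of the exponential time and space.
    """
    rels = sorted(card)
    best = {frozenset([r]): (0.0, (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            size = result_size(subset, card, sel)
            candidates = []
            for r in combo:  # r is the relation joined last
                sub_cost, sub_order = best[subset - {r}]
                candidates.append((sub_cost + size, sub_order + (r,)))
            best[subset] = min(candidates)
    return best[frozenset(rels)]


if __name__ == "__main__":
    card = {"A": 10, "B": 100, "C": 1000}
    sel = {("A", "B"): 0.1, ("B", "C"): 0.01}
    print(best_left_deep_order(card, sel))
```

In this toy instance, joining A with B first keeps the intermediate result small, so the DP prefers that order over starting with the cross product of A and C.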

Journal ArticleDOI
TL;DR: The temperature parallel algorithm of simulated annealing is considered to be the most suitable for finding the optimal multiple sequence alignment because the algorithm does not require any scheduling for optimization.
Abstract: We have developed simulated annealing algorithms to solve the problem of multiple sequence alignment. The algorithm was shown to give the optimal solution as confirmed by the rigorous dynamic programming algorithm for three-sequence alignment. To overcome long execution times for simulated annealing, we utilized a parallel computer. A sequential algorithm, a simple parallel algorithm and the temperature parallel algorithm were tested on a problem. The results were compared with the result obtained by a conventional tree-based algorithm where alignments were merged by two-way dynamic programming. Every annealing algorithm produced a better energy value than the conventional algorithm. The best energy value, which probably represents the optimal solution, was reached within a reasonable time by both of the parallel annealing algorithms. We consider the temperature parallel algorithm of simulated annealing to be the most suitable for finding the optimal multiple sequence alignment because the algorithm does not require any scheduling for optimization. The algorithm is also useful for refining multiple alignments obtained by other heuristic methods.

Journal ArticleDOI
TL;DR: In this paper, iterative dynamic programming is extended to provide piecewise linear continuous control policies to systems described by sets of ordinary differential equations, and only a single grid point is used for the state at each time stage.
Abstract: Iterative dynamic programming is extended to provide piecewise linear continuous control policies to systems described by sets of ordinary differential equations. To ensure continuity in the optimal control policy, only a single grid point is used for the state at each time stage. Three examples (optimal control of a continuous stirred tank reactor (CSTR), nondifferentiable system, fed-batch fermenter) that are used to test the viability of the procedure show that the proposed procedure is attractive from the computational point of view and provides reliable results even for highly nonlinear systems that exhibit singular control.

Journal ArticleDOI
TL;DR: In this paper, some verification theorems are presented within the framework of viscosity solutions under mild assumptions, which have wider applicability than the classical verification theorem for optimal control.

Journal ArticleDOI
TL;DR: A guided search algorithm uses bounds on alignment costs to find all optimal cyclic shifts and corresponding optimal alignment cost for strings representing cyclic patterns.
Abstract: String alignment by dynamic programming is generalized to include cyclic shift and corresponding optimal alignment cost for strings representing cyclic patterns. A guided search algorithm uses bounds on alignment costs to find all optimal cyclic shifts. The bounds are derived from submatrices of an initial dynamic programming matrix. Algorithmic complexity is analyzed for major stages in the search. The applicability of the method is illustrated with satellite DNA sequences and circularly permuted protein sequences.
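
The problem statement in this abstract can be made concrete with a naive baseline: run an ordinary edit-distance DP once per cyclic shift and keep every shift that attains the minimum. This is a sketch under assumptions, not the paper's guided search, which uses bounds precisely to avoid evaluating every shift. Unit costs for insertion, deletion, and substitution, and the function names, are assumed here.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between strings a and b (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def best_cyclic_shifts(a, b):
    """Return (cost, shifts): the minimum alignment cost over all cyclic
    rotations of a against b, and every rotation offset achieving it."""
    if not a:
        return edit_distance(a, b), [0]
    costs = [edit_distance(a[k:] + a[:k], b) for k in range(len(a))]
    best = min(costs)
    return best, [k for k, c in enumerate(costs) if c == best]


if __name__ == "__main__":
    print(best_cyclic_shifts("CDAB", "ABCD"))
```

The baseline performs one full DP per shift, so its cost grows with the product of both string lengths and the number of shifts; that overhead is what a bound-guided search aims to reduce.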

Journal ArticleDOI
TL;DR: This article studies the tabu search method in an application for solving an important class of scheduling problems and confirms not only the effectiveness but also the robustness of the TS method, in terms of the solution quality obtained with a common set of parameter choices for two related but different problems.
Abstract: In this article we study the tabu search (TS) method in an application for solving an important class of scheduling problems. Tabu search is characterized by integrating artificial intelligence and optimization principles, with particular emphasis on exploiting flexible memory structures, to yield a highly effective solution procedure. We first discuss the problem of minimizing the sum of the setup costs and linear delay penalties when N jobs, arriving at time zero, are to be scheduled for sequential processing on a continuously available machine. A prototype TS method is developed for this problem using the common approach of exchanging the position of two jobs to transform one schedule into another. A more powerful method is then developed that employs insert moves in combination with swap moves to search the solution space. This method and the best parameters found for it during the preliminary experimentation with the prototype procedure are used to obtain solutions to a more complex problem that considers setup times in addition to setup costs. In this case, our procedure succeeded in finding optimal solutions to all problems for which these solutions are known and a better solution to a larger problem for which optimizing procedures exceeded a specified time limit (branch and bound) or reached a memory overflow (branch and bound/dynamic programming) before normal termination. These experiments confirm not only the effectiveness but also the robustness of the TS method, in terms of the solution quality obtained with a common set of parameter choices for two related but different problems.