
Showing papers on "Dynamic programming published in 2013"


Posted Content
TL;DR: This work proposes and examines a new value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies, assuming an ADD input representation of the MDP.
Abstract: Markov decision processes (MDPs) are becoming increasingly popular as models of decision-theoretic planning. While traditional dynamic programming methods perform well for problems with small state spaces, structured methods are needed for large problems. We propose and examine a value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies. An MDP is represented using Bayesian networks and ADDs, and dynamic programming is applied directly to these ADDs. We demonstrate our method on large MDPs (up to 63 million states) and show that significant gains can be had when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions).

416 citations
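
For readers new to the area, the flat Bellman recursion that the ADD representation compactly encodes can be sketched in a few lines. The toy three-state MDP below is invented for illustration; the paper's contribution is performing exactly this backup on ADD-structured value functions instead of dense tables.

```python
import numpy as np

# Toy MDP, invented for illustration: P[a, s, t] is the probability of
# moving from state s to t under action a; R[s, a] is the reward.
P = np.array([[[0.9, 0.1, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.1, 0.9]],
              [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.0, 1.0]]])
R = np.array([[0.0, 0.0],
              [0.0, 0.0],
              [1.0, 1.0]])
gamma, tol = 0.95, 1e-8

V = np.zeros(3)
while True:
    # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < tol:
        break
    V = V_new

print("V* =", V.round(3), "greedy policy =", Q.argmax(axis=1))
```

For the 63-million-state problems reported above, `V` cannot be stored as a dense vector; the point of the ADD machinery is that identical sub-values are shared inside the decision diagram instead.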


Journal ArticleDOI
TL;DR: A general model of decentralized stochastic control called partial history sharing information structure is presented, and the optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP) which is solved using techniques from Markov decision theory.
Abstract: A general model of decentralized stochastic control called partial history sharing information structure is presented. In this model, at each step the controllers share part of their observation and control history with each other. This general model subsumes several existing models of information sharing as special cases. Based on the information commonly known to all the controllers, the decentralized problem is reformulated as an equivalent centralized problem from the perspective of a coordinator. The coordinator knows the common information and selects prescriptions that map each controller's local information to its control actions. The optimal control problem at the coordinator is shown to be a partially observable Markov decision process (POMDP) which is solved using techniques from Markov decision theory. This approach provides 1) structural results for optimal strategies and 2) a dynamic program for obtaining optimal strategies for all controllers in the original decentralized problem. Thus, this approach unifies the various ad hoc approaches taken in the literature. In addition, the structural results on optimal control strategies obtained by the proposed approach cannot be obtained by the existing generic approach (the person-by-person approach) for obtaining structural results in decentralized problems, and the dynamic program obtained by the proposed approach is simpler than that obtained by the existing generic approach (the designer's approach) for obtaining dynamic programs in decentralized problems.

319 citations


Journal ArticleDOI
TL;DR: A new iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for infinite-horizon discrete-time nonlinear systems with finite approximation errors, and it is shown that the iterative performance index functions can converge to a finite neighborhood of the greatest lower bound of all performance index functions.
Abstract: In this paper, a new iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for infinite-horizon discrete-time nonlinear systems with finite approximation errors. The idea is to use an iterative ADP algorithm to obtain an iterative control law that drives the iterative performance index function to the optimum. Convergence conditions of the iterative ADP algorithm are established for the case where the iterative control law and the iterative performance index function cannot be obtained exactly in each iteration. When these conditions are satisfied, it is shown that the iterative performance index functions converge to a finite neighborhood of the greatest lower bound of all performance index functions under some mild assumptions. Two neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the proposed method.

263 citations


Journal ArticleDOI
TL;DR: This paper proposes a distance-based train trajectory searching model, upon which three optimization algorithms are applied to search for the optimum train speed trajectory; the ant colony optimization (ACO) algorithm is found to achieve a better balance between stability and solution quality than the genetic algorithm (GA).
Abstract: An energy-efficient train trajectory describing the motion of a single train can be used as an input to a driver guidance system or to an automatic train control system. The solution for the best trajectory is subject to certain operational, geographic, and physical constraints. There are two types of strategies commonly applied to obtain the energy-efficient trajectory. One is to allow the train to coast, thus using its available time margin to save energy. The other is to control the speed dynamically while maintaining the required journey time. This paper proposes a distance-based train trajectory searching model, upon which three optimization algorithms are applied to search for the optimum train speed trajectory. Instead of searching for a detailed, complicated control input for the train traction system, this model tries to obtain the speed level at each preset position along the journey. Three commonly adopted algorithms are extensively studied in a comparative style. It is found that the ant colony optimization (ACO) algorithm achieves a better balance between stability and the quality of the results than the genetic algorithm (GA). For offline applications, the additional computational effort required by dynamic programming (DP) is outweighed by the quality of its solution. It is recommended that multiple algorithms be used to identify the optimum single-train trajectory and to improve the robustness of the search results.

262 citations


Posted Content
TL;DR: The Fast Marching Tree algorithm (FMT*) as mentioned in this paper is a sampling-based motion planning algorithm for high-dimensional configuration spaces that is proven to be asymptotically optimal and converges to an optimal solution faster than its state-of-the-art counterparts.
Abstract: In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a "lazy" dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds--the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order $O(n^{-1/d+\rho})$, where $n$ is the number of sampled points, $d$ is the dimension of the configuration space, and $\rho$ is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius. Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive.

254 citations
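
The core of FMT* is the lazy dynamic programming step: each unvisited sample is connected to the cheapest node in the open set within radius r, and only that locally optimal edge is collision-checked. The following is a heavily simplified, unoptimized sketch of that loop in 2-D with a single disc obstacle; the scene, sample count, and radius are invented, and none of the paper's asymptotic tuning rules are applied.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 400, 0.15                                  # samples, connection radius
obstacles = [(np.array([0.5, 0.5]), 0.2)]         # (centre, radius) discs
start, goal = np.array([0.05, 0.05]), np.array([0.95, 0.95])

def collision_free(a, b, steps=20):
    # Check interpolated points along the segment a-b against each disc.
    return all(np.linalg.norm((1 - t) * a + t * b - c) > rad
               for t in np.linspace(0.0, 1.0, steps)
               for c, rad in obstacles)

pts = np.vstack([start, rng.random((n, 2)), goal])
cost = np.full(len(pts), np.inf); cost[0] = 0.0
parent = np.full(len(pts), -1)                    # lets the path be backtracked
open_set, unvisited = {0}, set(range(1, len(pts)))

while open_set:
    z = min(open_set, key=lambda i: cost[i])      # lowest cost-to-arrive
    if z == len(pts) - 1:                         # goal sample expanded
        break
    for x in [i for i in unvisited
              if np.linalg.norm(pts[i] - pts[z]) <= r]:
        # Lazy DP step: pick the cheapest open parent, then check only
        # that edge for collision (z itself guarantees a candidate).
        y = min((y for y in open_set
                 if np.linalg.norm(pts[y] - pts[x]) <= r),
                key=lambda y: cost[y] + np.linalg.norm(pts[y] - pts[x]))
        if collision_free(pts[y], pts[x]):
            cost[x] = cost[y] + np.linalg.norm(pts[y] - pts[x])
            parent[x] = y
            open_set.add(x); unvisited.discard(x)
    open_set.discard(z)

print("cost-to-arrive at goal:", round(float(cost[-1]), 3))
```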


Journal ArticleDOI
TL;DR: Risk neutral and risk averse approaches to multistage (linear) stochastic programming problems based on the Stochastic Dual Dynamic Programming (SDDP) method are discussed.

235 citations


Journal ArticleDOI
22 Apr 2013-Energies
TL;DR: In this article, the authors compared two optimal energy management methods for parallel hybrid electric vehicles using an Automatic Manual Transmission (AMT) and applied Dynamic Programming and Pontryagin's Minimum Principle (PMP) to obtain the optimal solutions.
Abstract: This paper compares two optimal energy management methods for parallel hybrid electric vehicles using an Automatic Manual Transmission (AMT). A control-oriented model of the powertrain and vehicle dynamics is built first. The energy management is formulated as a typical optimal control problem to trade off the fuel consumption and gear shifting frequency under admissible constraints. Dynamic Programming (DP) and Pontryagin’s Minimum Principle (PMP) are applied to obtain the optimal solutions. With appropriately tuned co-states, the PMP solution is found to be very close to that from DP. The solution for the gear shifting in PMP has an algebraic expression associated with the vehicular velocity and can be implemented more efficiently in the control algorithm. The computation time of PMP is significantly less than that of DP.

209 citations
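
The practical appeal of PMP here is that, with a constant co-state, the optimal control at each instant minimizes a pointwise Hamiltonian, so the whole horizon reduces to a one-dimensional search over the co-state. The sketch below illustrates that tuning loop on an invented power-demand trace and a crude fuel map; all constants and the charge-sustaining target are hypothetical stand-ins, not the paper's vehicle model.

```python
import numpy as np

rng = np.random.default_rng(1)
P_dem = np.clip(rng.normal(20, 10, 600), 0, None)   # demand [kW], 1 s steps
P_batt_grid = np.linspace(-15, 15, 61)              # battery power candidates

def fuel_rate(p_eng):                # g/s; crude affine engine-map stand-in
    return np.where(p_eng > 0, 0.3 + 0.08 * p_eng, 0.0)

def final_soc(lam):
    soc = 0.6
    for p in P_dem:
        # PMP: minimise the pointwise Hamiltonian fuel + lam * (SOC drain)
        H = fuel_rate(p - P_batt_grid) + lam * P_batt_grid
        soc -= P_batt_grid[np.argmin(H)] / 3600.0   # toy 1 kWh pack
    return soc

lo, hi = 0.0, 1.0                    # bisection on the constant co-state
for _ in range(40):
    lam = 0.5 * (lo + hi)
    if final_soc(lam) < 0.6:         # battery over-used: penalise it more
        lo = lam
    else:
        hi = lam
print(f"tuned co-state {lam:.4f}, final SOC {final_soc(lam):.3f}")
```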


Journal ArticleDOI
TL;DR: In this paper, the authors considered a remote estimation problem with an energy harvesting sensor and a remote estimator, where the sensor observes the state of a discrete-time source which may be a finite state Markov chain or a multidimensional linear Gaussian system.
Abstract: We consider a remote estimation problem with an energy harvesting sensor and a remote estimator. The sensor observes the state of a discrete-time source which may be a finite state Markov chain or a multidimensional linear Gaussian system. It harvests energy from its environment (say, for example, through a solar cell) and uses this energy for the purpose of communicating with the estimator. Due to randomness of the energy available for communication, the sensor may not be able to communicate all of the time. The sensor may also want to save its energy for future communications. The estimator relies on messages communicated by the sensor to produce real-time estimates of the source state. We consider the problem of finding a communication scheduling strategy for the sensor and an estimation strategy for the estimator that jointly minimizes the expected sum of communication and distortion costs over a finite time horizon. Our goal of joint optimization leads to a decentralized decision-making problem. By viewing the problem from the estimator's perspective, we obtain a dynamic programming characterization for the decentralized decision-making problem that involves optimization over functions. Under some symmetry assumptions on the source statistics and the distortion metric, we show that an optimal communication strategy is described by easily computable thresholds and that the optimal estimate is a simple function of the most recently received sensor observation.

185 citations
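
The flavor of the threshold result is easy to simulate: the sensor transmits only when it has energy and the estimator's prediction error exceeds a threshold, and the estimator otherwise extrapolates. The scalar Gauss-Markov source, threshold, and battery model below are invented for illustration; the paper derives the optimal thresholds rather than fixing them.

```python
import numpy as np

rng = np.random.default_rng(2)
T, a, sigma = 200, 0.9, 1.0                   # horizon, source gain, noise std
threshold, battery, harvest_p = 2.0, 3, 0.3   # illustrative values

x, xhat = 0.0, 0.0
dist_cost = comm_cost = 0.0
for t in range(T):
    x = a * x + sigma * rng.normal()          # Gauss-Markov source
    battery = min(battery + (rng.random() < harvest_p), 5)  # unit harvests
    xhat_pred = a * xhat                      # estimator's prediction
    if battery >= 1 and abs(x - xhat_pred) > threshold:
        xhat = x                              # transmit the exact state
        battery -= 1
        comm_cost += 1.0
    else:
        xhat = xhat_pred                      # estimator extrapolates
    dist_cost += (x - xhat) ** 2

print(f"comm cost {comm_cost:.0f}, distortion {dist_cost:.1f}")
```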


Journal ArticleDOI
TL;DR: This paper focuses on a decentralized optimal control algorithm for the distribution management system that treats the distribution network as coupled microgrids; a coordinated dynamic programming algorithm is used to solve the problem by introducing a look-ahead dual multiplier mechanism as decentralized control signals derived from centralized information.
Abstract: This paper focuses on a decentralized optimal control algorithm for the distribution management system, treating the distribution network as coupled microgrids. Based on the autonomous control model of each single microgrid, coordinated information and strategies among different microgrids are used to decrease the operating cost of distributed generation, improve the efficiency of distributed storage utilization, and reduce the complexity of distribution network operation. The optimal control problem of microgrids is modeled as a decentralized partially observable Markov decision process (DEC-POMDP), and a coordinated dynamic programming algorithm is used to solve the problem by introducing a look-ahead dual multiplier mechanism as decentralized control signals derived from centralized information. The performance of this algorithm and the impacts of different coordinated information are discussed through several case studies at the end of this paper.

184 citations


Journal ArticleDOI
TL;DR: In this article, a pseudospectral method is used to solve the problem of train optimal control under constraints and fixed arrival time, where the objective function is a trade-off between the energy consumption and the riding comfort.
Abstract: The optimal trajectory planning problem for train operations under constraints and fixed arrival time is considered. The varying line resistance, variable speed restrictions, and varying maximum traction force are included in the problem definition. The objective function is a trade-off between the energy consumption and the riding comfort. Two approaches are proposed to solve this optimal control problem. First, we propose to use the pseudospectral method, a state-of-the-art method for optimal control problems, which has not been used for train optimal control before. In the pseudospectral method, the optimal trajectory planning problem is recast into a multiple-phase optimal control problem, which is then transformed into a nonlinear programming problem. However, the calculation time for the pseudospectral method is too long for real-time application in an automatic train operation system. To shorten the computation time, the optimal trajectory planning problem is reformulated as a mixed-integer linear programming (MILP) problem by approximating the nonlinear terms in the problem by piecewise affine functions. The MILP problem can be solved efficiently by existing solvers that are guaranteed to return the global optimum for the proposed MILP problem. Simulation results comparing the pseudospectral method, the new MILP approach, and a discrete dynamic programming approach show that the pseudospectral method has the best control performance, but that if the required computation time is also taken into consideration, the MILP approach yields the best overall performance. More specifically, for the given case study the control performance of the pseudospectral approach is about 10% better than that of the MILP approach, and the computation time of the MILP approach is two to three orders of magnitude smaller than that of the pseudospectral method and the discrete dynamic programming approach.

183 citations


Journal ArticleDOI
TL;DR: In this paper, a decision maker is responsible for dynamically collecting observations so as to enhance his information about an underlying phenomenon of interest in a speedy manner while accounting for the penalty of wrong declaration.
Abstract: Consider a decision maker who is responsible for dynamically collecting observations so as to enhance his information about an underlying phenomenon of interest in a speedy manner while accounting for the penalty of wrong declaration. Due to the sequential nature of the problem, the decision maker relies on his current information state to adaptively select the most “informative” sensing action among the available ones. In this paper, using results in dynamic programming, lower bounds for the optimal total cost are established. The lower bounds characterize the fundamental limits on the maximum achievable information acquisition rate and the optimal reliability. Moreover, upper bounds are obtained via an analysis of two heuristic policies for dynamic selection of actions. It is shown that the first proposed heuristic achieves asymptotic optimality, where the notion of asymptotic optimality, due to Chernoff, implies that the relative difference between the total cost achieved by the proposed policy and the optimal total cost approaches zero as the penalty of wrong declaration (hence the number of collected samples) increases. The second heuristic is shown to achieve asymptotic optimality only in a limited setting such as the problem of a noisy dynamic search. However, by considering the dependency on the number of hypotheses, under a technical condition, this second heuristic is shown to achieve a nonzero information acquisition rate, establishing a lower bound for the maximum achievable rate and error exponent. In the case of a noisy dynamic search with size-independent noise, the obtained nonzero rate and error exponent are shown to be maximum.
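
A minimal instance of this setting is noisy dynamic search, which the paper treats as a special case. The sketch below runs a Bayesian update over M candidate cells and greedily picks the action with the largest expected information gain; the observation model, stopping rule, and greedy policy are illustrative simplifications, not the paper's asymptotically optimal heuristics.

```python
import numpy as np

rng = np.random.default_rng(3)
# Noisy dynamic search: hypothesis h = index of the target cell; action
# u = which cell to inspect; the observation is Bernoulli with hit rate
# p1 if the inspected cell holds the target, false-alarm rate p0 otherwise.
M, p1, p0, eps = 4, 0.8, 0.2, 1e-3
truth = 2
belief = np.full(M, 1.0 / M)

def likelihood(u, y):
    return np.where(np.arange(M) == u,
                    p1 if y else 1 - p1,
                    p0 if y else 1 - p0)

def exp_info_gain(u, b):
    # Expected KL divergence between posterior and prior for action u.
    gain = 0.0
    for y in (0, 1):
        like = likelihood(u, y)
        py = like @ b
        post = like * b / py
        gain += py * np.sum(post * np.log(np.maximum(post, 1e-12) / b))
    return gain

n = 0
while belief.max() < 1 - eps and n < 500:
    u = max(range(M), key=lambda u: exp_info_gain(u, belief))
    y = rng.random() < (p1 if u == truth else p0)   # noisy inspection
    like = likelihood(u, y)
    belief = like * belief / (like @ belief)        # Bayes update
    n += 1
print(f"declared cell {belief.argmax()} after {n} samples")
```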

Journal ArticleDOI
TL;DR: This paper addresses the robust vehicle routing problem with time windows by proposing two new formulations for the robust problem, each based on a different robust approach, and by developing a new cutting plane technique for robust combinatorial optimization problems with complicated constraints.

Journal ArticleDOI
TL;DR: A novel procedure for multi-frame detection in radar systems that combines a pre-processing stage, which extracts candidate alarms (or plots) from the raw data, with a track-before-detect (TBD) processor, which jointly elaborates observations from multiple scans (or frames) and confirms reliable plots.
Abstract: In this paper we present a novel procedure for multi-frame detection in radar systems. The proposed architecture consists of a pre-processing stage, which extracts a set of candidate alarms (or plots) from the raw data measurements (e.g., this can be the Detector and Plot-Extractor of common radar systems), and a track-before-detect (TBD) processor, which jointly elaborates observations from multiple scans (or frames) and confirms reliable plots. A computationally efficient dynamic programming algorithm for the TBD processor is derived, which does not require a discretization of the state space and operates directly on the input plot-lists. Finally, a simple algorithm to solve possible data association problems arising at the track-formation step is given, and a thorough complexity and performance analysis is provided, showing that large detection gains with respect to the standard radar processing are achievable with negligible complexity increase.

Journal ArticleDOI
TL;DR: The proposed method allows the use of a substantially lower level of discretization while achieving the same accuracy, and the evaluation time was reduced by a factor of about 300, while the accuracy of the solution was maintained.
Abstract: Many optimal control problems include a continuous nonlinear dynamic system, state and control constraints, and final state constraints. When using dynamic programming to solve such a problem, the solution space typically needs to be discretized and interpolation is used to evaluate the cost-to-go function between the grid points. When implementing such an algorithm, it is important to treat numerical issues appropriately. Otherwise, the accuracy of the found solution will deteriorate and global optimality can be restored only by increasing the level of discretization. Unfortunately, this will also increase the computational effort needed to calculate the solution. A known problem is the treatment of states in the time-state space from which the final state constraint cannot be met within the given final time. In this brief, a novel method to handle this problem is presented. The new method guarantees global optimality of the found solution, while it is not restricted to a specific class of problems. In contrast, previously proposed methods either sacrifice global optimality or are applicable to a specific class of problems only. Compared to the basic implementation, the proposed method allows the use of a substantially lower level of discretization while achieving the same accuracy. As an example, an academic optimal control problem is analyzed. With the new method, the evaluation time was reduced by a factor of about 300, while the accuracy of the solution was maintained.
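
A basic implementation of the setting being improved here looks like the sketch below: a backward recursion over a state grid, linear interpolation of the cost-to-go between grid points, and a large penalty on states from which the terminal constraint is unreachable. The toy integrator problem and the crude BIG-penalty boundary treatment are mine; the brief's contribution is precisely a sharper handling of that infeasible boundary.

```python
import numpy as np

# Backward DP with interpolated cost-to-go for the toy problem
#   x_{k+1} = x_k + u_k,  |u_k| <= 1,  J = sum u_k^2,  x_N = 0 required.
# States that cannot reach x_N = 0 keep the BIG penalty, which crudely
# marks the infeasible region the paper treats more carefully.
BIG = 1e9
N = 10
x_grid = np.linspace(-5.0, 5.0, 101)
u_grid = np.linspace(-1.0, 1.0, 21)

J = np.where(np.abs(x_grid) < 1e-9, 0.0, BIG)     # terminal constraint
for k in range(N):
    J_next, J = J, np.full_like(x_grid, BIG)
    for i, x in enumerate(x_grid):
        x_next = x + u_grid
        ok = (x_next >= x_grid[0]) & (x_next <= x_grid[-1])
        if ok.any():
            # Linear interpolation of the cost-to-go between grid points.
            Jn = np.interp(x_next[ok], x_grid, J_next)
            J[i] = min(J[i], float(np.min(u_grid[ok] ** 2 + Jn)))

print("cost-to-go at x0 = 3:", np.interp(3.0, x_grid, J))  # ~0.9 expected
```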

Journal ArticleDOI
TL;DR: This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems by relaxing the problem of solving the Hamilton-Jacobi-Bellman equation to an optimization problem, which is solved via a new policy iteration method.
Abstract: This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems. The strategy consists of relaxing the problem of solving the Hamilton-Jacobi-Bellman (HJB) equation to an optimization problem, which is solved via a new policy iteration method. The proposed method is distinguished from previously known nonlinear ADP methods in that the neural network approximation is avoided, giving rise to significant computational improvement. Instead of being semiglobally or locally stabilizing, the resultant control policy is globally stabilizing for a general class of nonlinear polynomial systems. Furthermore, in the absence of a priori knowledge of the system dynamics, an online learning method is devised to implement the proposed policy iteration technique by generalizing the current ADP theory. Finally, three numerical examples are provided to validate the effectiveness of the proposed method.

Journal ArticleDOI
TL;DR: A greedy heuristic dynamic programming iteration algorithm is developed to solve zero-sum game problems, which can be used to solve the Hamilton-Jacobi-Isaacs equation associated with H∞ optimal regulation control problems.

Journal ArticleDOI
TL;DR: The results indicate that the proposed new approach to optimizing operations of hydro storage systems with multiple connected reservoirs is tractable for a real-world application and that the gap between a theoretical upper bound and a simulated lower bound decreases sufficiently fast.
Abstract: We propose a new approach to optimize operations of hydro storage systems with multiple connected reservoirs whose operators participate in wholesale electricity markets. Our formulation integrates short-term intraday with long-term interday decisions. The intraday problem considers bidding decisions as well as storage operation during the day and is formulated as a stochastic program. The interday problem is modeled as a Markov decision process of managing storage operation over time, for which we propose integrating stochastic dual dynamic programming with approximate dynamic programming. We show that the approximate solution converges towards an upper bound of the optimal solution. To demonstrate the efficiency of the solution approach, we fit an econometric model to actual price and inflow data and apply the approach to a case study of an existing hydro storage system. Our results indicate that the approach is tractable for a real-world application and that the gap between the theoretical upper and a simulated lower bound decreases sufficiently fast.

Journal ArticleDOI
TL;DR: A general computational approach based on dynamic programming is derived that can be shown to converge to an optimal policy; an inner approximation to future cost functions yields an upper bound on the cost of an optimal policy, while an outer approximation delivers a lower bound.
Abstract: We consider a class of multistage stochastic linear programs in which at each stage a coherent risk measure of future costs is to be minimized. A general computational approach based on dynamic programming is derived that can be shown to converge to an optimal policy. By computing an inner approximation to future cost functions, we can evaluate an upper bound on the cost of an optimal policy, and an outer approximation delivers a lower bound. The approach we describe is particularly useful in sampling-based algorithms, and a numerical example is provided to show the efficacy of the methodology when used in conjunction with stochastic dual dynamic programming.
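
The bounding mechanism can be seen on a toy convex cost-to-go. Below, subgradient cuts build an outer (lower-bounding) approximation of Q(x) = E|x − D|, and evaluating Q at the current trial point gives an upper bound; in SDDP-style schemes Q is only accessible through sampled stage subproblems, and coherent risk measures replace the plain expectation, but the cut bookkeeping is the same. Everything here is an invented illustration, not the paper's algorithm.

```python
import numpy as np

D = np.array([1.0, 2.0, 3.0])                 # toy uncertainty

def Q(x):                                     # exact cost-to-go, for reference
    return np.mean(np.abs(x - D))

def subgrad(x):                               # a subgradient of Q at x
    return np.mean(np.sign(x - D))

cuts, x = [], 0.0                             # (intercept, slope) pairs
grid = np.linspace(0.0, 4.0, 401)
for it in range(6):
    g = subgrad(x)
    cuts.append((Q(x) - g * x, g))            # supporting cut at the trial x
    # Outer approximation: pointwise max over cuts, minimised on a grid.
    lower = np.max([a + b * grid for a, b in cuts], axis=0)
    x = grid[np.argmin(lower)]                # next trial point
    print(f"it {it}: lower {lower.min():.3f}, upper {Q(x):.3f} at x={x:.2f}")
```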

Journal ArticleDOI
TL;DR: By factorizing the joint posterior density using the structure of MTT, an efficient DP-TBD algorithm is developed to approximately solve the joint maximization in a fast but accurate manner and can accurately estimate the number of targets and reliably track multiple targets even when targets are in proximity.
Abstract: This paper considers the multi-target tracking (MTT) problem through the use of dynamic programming based track-before-detect (DP-TBD) methods. The usual solution of this problem is to adopt a multi-target state, which is the concatenation of individual target states, then search for the estimate in the expanded multi-target state space. However, this solution involves a high-dimensional joint maximization which is computationally intractable for most realistic problems. Additionally, the dimension of the multi-target state has to be determined before implementing the DP search. This is problematic when the number of targets is unknown. We make two contributions towards addressing these problems. Firstly, by factorizing the joint posterior density using the structure of MTT, an efficient DP-TBD algorithm is developed to approximately solve the joint maximization in a fast but accurate manner. Secondly, we propose a novel detection procedure such that the dimension of the multi-target state no longer needs to be pre-determined before the DP search. Our analysis indicates that the proposed algorithm achieves a computational complexity which is almost linear in the number of processed frames and independent of the number of targets. Simulation results show that this algorithm can accurately estimate the number of targets and reliably track multiple targets even when targets are in proximity.
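
For a single target the underlying DP-TBD machinery is a Viterbi-style recursion: per-frame energies are accumulated along feasible state transitions, and a threshold test is applied to the accumulated merit. The 1-D grid, drift model, and threshold below are invented to keep the sketch short; the paper's contribution is the factorization and detection procedure that extend this to an unknown number of targets.

```python
import numpy as np

rng = np.random.default_rng(4)
K, Ncells, snr = 8, 50, 2.0                    # frames, cells, target energy
true_path = 10 + np.arange(K)                  # target drifts one cell/frame

frames = rng.normal(0.0, 1.0, (K, Ncells))     # noise background
frames[np.arange(K), true_path] += snr         # embed weak target energy

merit = frames[0].copy()
back = np.zeros((K, Ncells), dtype=int)
for k in range(1, K):
    for c in range(Ncells):
        lo, hi = max(0, c - 1), min(Ncells, c + 2)   # +-1 cell per frame
        back[k, c] = lo + int(np.argmax(merit[lo:hi]))
    merit = frames[k] + merit[back[k]]         # accumulate best-predecessor merit

c = int(np.argmax(merit))
if merit[c] > K * 1.0:                         # illustrative threshold
    path = [c]
    for k in range(K - 1, 0, -1):              # backtrack the best track
        path.append(int(back[k, path[-1]]))
    print("detected track (last -> first):", path)
else:
    print("no detection")
```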

Journal ArticleDOI
TL;DR: The adaptive dynamic programming approach is employed for designing an optimal controller of unknown discrete-time nonlinear systems with control constraints, and a neural network is constructed for identifying the unknown dynamical system, with a stability proof.

Journal ArticleDOI
TL;DR: Experimental results indicate that CDDE_Ar can enjoy a statistically superior performance on a wide range of DOPs in comparison to some of the best known dynamic evolutionary optimizers.
Abstract: This paper presents a Cluster-based Dynamic Differential Evolution with external Archive (CDDE_Ar) for global optimization in dynamic fitness landscapes. The algorithm uses a multipopulation method where the entire population is partitioned into several clusters according to the spatial locations of the trial solutions. The clusters are evolved separately using a standard differential evolution algorithm. The number of clusters is an adaptive parameter, and its value is updated after a certain number of iterations. Accordingly, the total population is redistributed into a new number of clusters. In this way, a certain sharing of information occurs periodically during the optimization process. The performance of CDDE_Ar is compared with six state-of-the-art dynamic optimizers over the moving peaks benchmark problems and dynamic optimization problem (DOP) benchmarks generated with the generalized-dynamic-benchmark-generator system for the competition and special session on dynamic optimization held under the 2009 IEEE Congress on Evolutionary Computation. Experimental results indicate that CDDE_Ar can enjoy a statistically superior performance on a wide range of DOPs in comparison to some of the best known dynamic evolutionary optimizers.
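
Inside each cluster, CDDE_Ar evolves the subpopulation with standard differential evolution. The sketch below shows the classic DE/rand/1/bin generation step on a stand-in static landscape; the clustering, external archive, and adaptive cluster count that define CDDE_Ar are omitted.

```python
import numpy as np

rng = np.random.default_rng(7)

def de_step(pop, fitness, f=0.5, cr=0.9):
    """One generation of DE/rand/1/bin on a subpopulation."""
    n, d = pop.shape
    for i in range(n):
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i], 3,
                                 replace=False)
        mutant = pop[r1] + f * (pop[r2] - pop[r3])   # differential mutation
        cross = rng.random(d) < cr                   # binomial crossover mask
        cross[rng.integers(d)] = True                # force at least one gene
        trial = np.where(cross, mutant, pop[i])
        if fitness(trial) < fitness(pop[i]):         # greedy selection
            pop[i] = trial
    return pop

def sphere(x):                                       # stand-in landscape
    return float(np.sum(x ** 2))

pop = rng.uniform(-5, 5, (20, 3))
for _ in range(100):
    pop = de_step(pop, sphere)
print("best fitness:", min(sphere(x) for x in pop))
```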

Book
02 Dec 2013
TL;DR: This article describes algorithms in a unified framework, giving pseudocode together with memory and iteration complexity analysis for each, and empirical evaluations of these techniques with four representations across four domains provide insight into how these algorithms perform with various feature sets in terms of running time and performance.
Abstract: A Markov Decision Process (MDP) is a natural framework for formulating sequential decision-making problems under uncertainty. In recent years, researchers have greatly advanced algorithms for learning and acting in MDPs. This article reviews such algorithms, beginning with well-known dynamic programming methods for solving MDPs such as policy iteration and value iteration, then describes approximate dynamic programming methods such as trajectory-based value iteration, and finally moves to reinforcement learning methods such as Q-Learning, SARSA, and least-squares policy iteration. We describe algorithms in a unified framework, giving pseudocode together with memory and iteration complexity analysis for each. Empirical evaluations of these techniques with four representations across four domains provide insight into how these algorithms perform with various feature sets in terms of running time and performance.
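
As a taste of the model-free end of that spectrum, the following is a minimal tabular Q-Learning loop on an invented five-state chain, the classic exploration problem of passing up a small immediate reward for a delayed payoff:

```python
import numpy as np

rng = np.random.default_rng(5)
nS, nA, gamma, alpha, eps = 5, 2, 0.9, 0.1, 0.1

def step(s, a):
    # Toy chain: action 1 moves right (reward 1 at the end, then restart);
    # action 0 resets to state 0 for a small immediate reward.
    if a == 0:
        return 0, 0.1
    if s == nS - 1:
        return 0, 1.0
    return s + 1, 0.0

Q = np.zeros((nS, nA))
s = 0
for _ in range(20000):
    # epsilon-greedy exploration
    a = rng.integers(nA) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    # Q-Learning update: bootstrap off the greedy value at s2
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2

print(np.round(Q, 2))   # greedy policy: always move right
```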

Journal ArticleDOI
TL;DR: A dynamic programming algorithm for the one-dimensional Fused Lasso Signal Approximator (FLSA) has a linear running time in the worst case, and simulations indicate substantial performance improvement over existing algorithms.
Abstract: We propose a dynamic programming algorithm for the one-dimensional Fused Lasso Signal Approximator (FLSA). The proposed algorithm has a linear running time in the worst case. A similar approach is developed for the task of least squares segmentation, and simulations indicate substantial performance improvement over existing algorithms. Examples of R and C implementations are provided in the online Supplementary materials, posted on the journal web site.
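
The paper's linear-time FLSA recursion is intricate, but the related least-squares segmentation task it discusses has a textbook O(k n²) dynamic program that makes the recursive structure easy to see: the best k-part split of a prefix extends the best (k−1)-part split of a shorter prefix. The sketch below implements that generic DP, not the paper's algorithm.

```python
import numpy as np

def segment(y, k):
    """Optimal split of y into k contiguous segments, each fit by its mean."""
    n = len(y)
    s1 = np.insert(np.cumsum(y), 0, 0.0)            # prefix sums
    s2 = np.insert(np.cumsum(np.square(y)), 0, 0.0)

    def sse(i, j):  # squared error of the mean fit on y[i:j]
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    C = np.full((k + 1, n + 1), np.inf)   # C[m, j]: best cost for y[:j], m parts
    arg = np.zeros((k + 1, n + 1), dtype=int)
    C[0, 0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            costs = [C[m - 1, i] + sse(i, j) for i in range(m - 1, j)]
            i_best = m - 1 + int(np.argmin(costs))
            C[m, j], arg[m, j] = costs[i_best - (m - 1)], i_best
    bps, j = [], n                        # backtrack the breakpoints
    for m in range(k, 0, -1):
        bps.append(arg[m, j]); j = arg[m, j]
    return sorted(bps)[1:], C[k, n]       # drop the leading 0

y = (np.concatenate([np.full(20, 0.0), np.full(20, 3.0)])
     + 0.1 * np.random.default_rng(6).normal(size=40))
print(segment(y, 2))   # expected breakpoint near index 20
```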

Journal ArticleDOI
TL;DR: A mixed integer programming (MIP) model is constructed first to solve the lot-sizing problem with multiple suppliers, multiple periods, and quantity discounts, and an efficient Genetic Algorithm (GA) is proposed to tackle the problem when it becomes too complex to solve exactly.

Journal ArticleDOI
TL;DR: This paper proposes methods to generate security strategies that achieve the maximal overall security strength while meeting the real-time constraint, including an optimal algorithm, Integer Linear Programming Security Optimization (ILP-SOP), based on the SEAT graph approach.

Journal ArticleDOI
TL;DR: This brief presents a novel framework of robust adaptive dynamic programming (robust-ADP) aimed at computing globally stabilizing and suboptimal control policies in the presence of dynamic uncertainties.
Abstract: This brief presents a novel framework of robust adaptive dynamic programming (robust-ADP) aimed at computing globally stabilizing and suboptimal control policies in the presence of dynamic uncertainties. A key strategy is to integrate ADP theory with techniques in modern nonlinear control, with the objective of filling a gap in the past ADP literature, which did not take dynamic uncertainties into account. Neither the system dynamics nor the system order are required to be precisely known. As an illustrative example, the computational algorithm is applied to the controller design of a two-machine power system.

Journal ArticleDOI
TL;DR: In this article, a dynamic programming based planning algorithm for optimization of the feeder routes and branch conductor sizes is proposed, and a set of Pareto solutions is obtained using a weighted aggregation of the two objectives with different weight settings.

Journal ArticleDOI
TL;DR: This paper gives a review of ADP organized around variations on the structure of the ADP scheme, the development of ADP algorithms, and applications, aiming to introduce the reader to this novel field of optimization technology.

Journal ArticleDOI
TL;DR: In this paper, an extended reduced dynamic programming algorithm (RDPA) was proposed to solve the problem of optimal operation scheduling of a pumping station with multiple pumps, where both the energy cost and maintenance cost were considered in the performance function of the optimization problem.

Proceedings ArticleDOI
01 Oct 2013
TL;DR: A multi-stage dynamic programming tool is suggested that uses recursive trajectory generation, similar to least-cost path-finding algorithms, to optimize the upstream speed profile while comparing discretized downstream cases.
Abstract: Researchers have attempted to compute a fuel-optimal vehicle trajectory by receiving traffic signal phasing and timing information. This problem, however, is complex when microscopic models are used to compute the objective function. This paper suggests the use of a multi-stage dynamic programming tool that not only provides outputs that are closer to the optimum, but is also computationally much faster. It uses recursive trajectory generation, similar to least-cost path-finding algorithms, that optimizes the upstream profile while comparing discretized downstream cases. Since dynamic programming is faster than traditional computational methods, the algorithm can afford to use microscopic models and thereby be sensitive to a multitude of inputs such as grade and weather. Agent-based simulations suggest fuel savings in the range of 19 percent and travel-time savings of 32 percent in the vicinity of intersections. This research also showed potential benefits to vehicles following a vehicle that uses the proposed logic.
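
In least-cost-path form, the state is a discrete speed level at each preset position and an edge cost models fuel use between positions. The sketch below uses an invented two-term fuel model and ignores signal windows, which a real tool would add as constraints on the arrival time at the stop line.

```python
import numpy as np

speeds = np.arange(5.0, 20.0, 2.5)        # candidate speed levels [m/s]
stages, ds = 10, 100.0                    # position steps of 100 m

def edge_cost(v0, v1):
    # Crude fuel model, invented for illustration: an idle term plus a
    # penalty on positive acceleration over the 100 m step.
    accel = (v1 ** 2 - v0 ** 2) / (2 * ds)
    return 0.05 * ds + 0.4 * max(accel, 0.0) * ds

# cost[j] = least modelled fuel to reach the current stage at speed level j
cost = np.full(len(speeds), np.inf)
cost[len(speeds) // 2] = 0.0              # launch from a mid speed
for _ in range(stages):
    cost = np.array([min(cost[i] + edge_cost(speeds[i], speeds[j])
                         for i in range(len(speeds)))
                     for j in range(len(speeds))])

print("minimum modelled fuel use:", round(float(cost.min()), 1))
```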