
Showing papers on "Dynamic programming" published in 2009


Journal ArticleDOI
TL;DR: This book provides detailed coverage of modelling decision processes under uncertainty, robustness, designing and estimating value function approximations, choosing effective step-size rules, and convergence issues and is an excellent textbook for advanced undergraduate and beginning graduate students.
Abstract: By Warren B. Powell, Wiley Series in Probability and Statistics, Hoboken, NJ, J. Wiley & Sons, 2007, 488 pp., US$83.60 (hardcover), ISBN 978-0-470-17155-4 Dynamic programming introduced by Bellman ...

930 citations


Journal ArticleDOI
TL;DR: The near-optimal control problem for a class of nonlinear discrete-time systems with control constraints is solved by iterative adaptive dynamic programming algorithm.
Abstract: In this paper, the near-optimal control problem for a class of nonlinear discrete-time systems with control constraints is solved by iterative adaptive dynamic programming algorithm. First, a novel nonquadratic performance functional is introduced to overcome the control constraints, and then an iterative adaptive dynamic programming algorithm is developed to solve the optimal feedback control problem of the original constrained system with convergence analysis. In the present control scheme, there are three neural networks used as parametric structures for facilitating the implementation of the iterative algorithm. Two examples are given to demonstrate the convergence and feasibility of the proposed optimal control scheme.

574 citations


Proceedings ArticleDOI
08 Jul 2009
TL;DR: This paper introduces a generic dynamic programming function for Matlab that solves discrete-time optimal-control problems using Bellman's dynamic programming algorithm.
Abstract: This paper introduces a generic dynamic programming function for Matlab. This function solves discrete-time optimal-control problems using Bellman's dynamic programming algorithm. The function is implemented such that the user only needs to provide the objective function and the model equations. The function includes several options for solving optimal-control problems. The model equations can include several state variables and input variables. Furthermore, the model equations can be time-variant and include time-variant state and input constraints. The syntax of the function is explained using two examples. The first is the well-known Lotka-Volterra fishery problem and the second is a parallel hybrid-electric vehicle optimization problem.

508 citations
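
The backward recursion that the entry above attributes to Bellman's dynamic programming algorithm can be illustrated with a minimal Python sketch. This is not the Matlab function from the paper: the name `backward_dp`, its arguments, and the toy walk problem are all hypothetical, and the sketch assumes finite state and input sets and a deterministic model whose `step` returns `None` for infeasible moves.

```python
from typing import Callable, Dict, Hashable, List, Optional, Sequence, Tuple

State = Hashable
Input = Hashable


def backward_dp(
    states: Sequence[State],
    inputs: Sequence[Input],
    horizon: int,
    step: Callable[[State, Input, int], Optional[State]],
    stage_cost: Callable[[State, Input, int], float],
    terminal_cost: Callable[[State], float],
) -> Tuple[Dict[State, float], List[Dict[State, Optional[Input]]]]:
    """Finite-horizon backward recursion over finite state and input sets.

    Returns the cost-to-go at stage 0 and, for every stage, the minimizing
    input for each state (None if no feasible input exists).
    """
    # Cost-to-go at the final stage is just the terminal cost.
    value = {x: terminal_cost(x) for x in states}
    policy: List[Dict[State, Optional[Input]]] = []

    for k in reversed(range(horizon)):
        new_value: Dict[State, float] = {}
        decision: Dict[State, Optional[Input]] = {}
        for x in states:
            best_cost, best_u = float("inf"), None
            for u in inputs:
                nxt = step(x, u, k)
                if nxt is None or nxt not in value:
                    continue  # infeasible input at this state and stage
                cost = stage_cost(x, u, k) + value[nxt]
                if cost < best_cost:
                    best_cost, best_u = cost, u
            new_value[x] = best_cost
            decision[x] = best_u
        value = new_value
        policy.insert(0, decision)  # keep the policy ordered by stage

    return value, policy


# Hypothetical toy problem: walk on the integers 0..4, paying |u| per move
# and a terminal penalty for ending away from position 4.
def walk(x: int, u: int, k: int) -> Optional[int]:
    nxt = x + u
    return nxt if 0 <= nxt <= 4 else None


value, policy = backward_dp(
    states=range(5),
    inputs=(-1, 0, 1),
    horizon=3,
    step=walk,
    stage_cost=lambda x, u, k: abs(u),
    terminal_cost=lambda x: 10 * abs(x - 4),
)
```

As in the description above, the caller supplies only the model (`step`) and the objective (`stage_cost`, `terminal_cost`); passing the stage index `k` to both is one simple way to allow time-variant models and constraints.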


Journal ArticleDOI
TL;DR: This paper proposes a new coevolutionary paradigm that hybridizes competitive and cooperative mechanisms observed in nature to solve multiobjective optimization problems and to track the Pareto front in a dynamic environment.
Abstract: In addition to the need for satisfying several competing objectives, many real-world applications are also dynamic and require the optimization algorithm to track the changing optimum over time. This paper proposes a new coevolutionary paradigm that hybridizes competitive and cooperative mechanisms observed in nature to solve multiobjective optimization problems and to track the Pareto front in a dynamic environment. The main idea of competitive-cooperative coevolution is to allow the decomposition process of the optimization problem to adapt and emerge rather than being hand designed and fixed at the start of the evolutionary optimization process. In particular, each species subpopulation will compete to represent a particular subcomponent of the multiobjective problem, while the eventual winners will cooperate to evolve for better solutions. Through such an iterative process of competition and cooperation, the various subcomponents are optimized by different species subpopulations based on the optimization requirements of that particular time instant, enabling the coevolutionary algorithm to handle both the static and dynamic multiobjective problems. The effectiveness of the competitive-cooperation coevolutionary algorithm (COEA) in static environments is validated against various multiobjective evolutionary algorithms upon different benchmark problems characterized by various difficulties in local optimality, discontinuity, nonconvexity, and high-dimensionality. In addition, extensive studies are also conducted to examine the capability of dynamic COEA (dCOEA) in tracking the Pareto front as it changes with time in dynamic environments.

461 citations


Journal ArticleDOI
TL;DR: This work proposes a more structured formulation that greatly simplifies the construction of optimal control laws in both discrete and continuous domains, and enables computations that were not possible before.
Abstract: Optimal choice of actions is a fundamental problem relevant to fields as diverse as neuroscience, psychology, economics, computer science, and control engineering. Despite this broad relevance the abstract setting is similar: we have an agent choosing actions over time, an uncertain dynamical system whose state is affected by those actions, and a performance criterion that the agent seeks to optimize. Solving problems of this kind remains hard, in part, because of overly generic formulations. Here, we propose a more structured formulation that greatly simplifies the construction of optimal control laws in both discrete and continuous domains. An exhaustive search over actions is avoided and the problem becomes linear. This yields algorithms that outperform Dynamic Programming and Reinforcement Learning, and thereby solve traditional problems more efficiently. Our framework also enables computations that were not possible before: composing optimal control laws by mixing primitives, applying deterministic methods to stochastic systems, quantifying the benefits of error tolerance, and inferring goals from behavioral data via convex optimization. Development of a general class of easily solvable problems tends to accelerate progress—as linear systems theory has done, for example. Our framework may have similar impact in fields where optimal choice of actions is relevant.

455 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide a fully analytical characterization of the optimal dynamic mean-variance portfolios within a general incomplete-market economy, and recover a simple structure that also inherits several conventional properties of static models.
Abstract: Mean-variance criteria remain prevalent in multi-period problems, and yet not much is known about their dynamically optimal policies. We provide a fully analytical characterization of the optimal dynamic mean-variance portfolios within a general incomplete-market economy, and recover a simple structure that also inherits several conventional properties of static models. We also identify a probability measure that incorporates intertemporal hedging demands and facilitates much tractability in the explicit computation of portfolios. We solve the problem by explicitly recognizing the time-inconsistency of the mean-variance criterion and deriving a recursive representation for it, which makes dynamic programming applicable. We further show that our time-consistent solution is generically different from the pre-commitment solutions in the extant literature, which maximize the mean-variance criterion at an initial date and which the investor commits to follow despite incentives to deviate. We illustrate the usefulness of our analysis by explicitly computing dynamic mean-variance portfolios under various stochastic investment opportunities in a straightforward way, which does not involve solving a Hamilton-Jacobi-Bellman differential equation. A calibration exercise shows that the mean-variance hedging demands may comprise a significant fraction of the investor's total risky asset demand.

443 citations


Journal ArticleDOI
TL;DR: A dynamic programming approach to the search for an optimal sensing order with adaptive modulation is presented, and it is proved that, for some special cases, a simple optimal sensing order does exist.
Abstract: This paper investigates the optimal sensing order problem in multi-channel cognitive medium access control with opportunistic transmissions. The scenario in which the availability probability of each channel is known is considered first. In this case, when the potential channels are identical (except for the availability probabilities) and independent, it is shown that, although the intuitive sensing order (i.e., descending order of the channel availability probabilities) is optimal when adaptive modulation is not used, it does not lead to optimality in general with adaptive modulation. Thus, a dynamic programming approach to the search for an optimal sensing order with adaptive modulation is presented. For some special cases, it is proved that a simple optimal sensing order does exist. More complex scenarios are then considered, e.g., in which the availability probability of each channel is unknown. Optimal strategies are developed to address the challenges created by this additional uncertainty. Finally, a scheme is developed to address the issue of sensing errors.

333 citations


Journal ArticleDOI
TL;DR: To solve the CDLP for real-size networks, it is proved that the associated column generation subproblem is indeed NP-hard and a simple, greedy heuristic is proposed to overcome the complexity of an exact algorithm.
Abstract: During the past few years, there has been a trend to enrich traditional revenue management models built upon the independent demand paradigm by accounting for customer choice behavior. This extension involves both modeling and computational challenges. One way to describe choice behavior is to assume that each customer belongs to a segment, which is characterized by a consideration set, i.e., a subset of the products provided by the firm that a customer views as options. Customers choose a particular product according to a multinomial-logit criterion, a model widely used in the marketing literature. In this paper, we consider the choice-based, deterministic, linear programming model (CDLP) of Gallego et al. (2004) [Gallego, G., G. Iyengar, R. Phillips, A. Dubey. 2004. Managing flexible products on a network. Technical Report CORC TR-2004-01, Department of Industrial Engineering and Operations Research, Columbia University, New York], and the follow-up dynamic programming decomposition heuristic of van Ryzin and Liu (2008) [van Ryzin, G. J., Q. Liu. 2008. On the choice-based linear programming model for network revenue management. Manufacturing Service Oper. Management 10(2) 288-310]. We focus on the more general version of these models, where customers belong to overlapping segments. To solve the CDLP for real-size networks, we need to develop a column generation algorithm. We prove that the associated column generation subproblem is indeed NP-hard and propose a simple, greedy heuristic to overcome the complexity of an exact algorithm. Our computational results show that the heuristic is quite effective and that the overall approach leads to high-quality, practical solutions.

303 citations


Journal ArticleDOI
TL;DR: A general feedback channel coding theorem based on Massey's concept of directed information is proved and the average cost optimality equation (ACOE) can be viewed as an implicit single-letter characterization of the capacity.
Abstract: In this paper, we introduce a general framework for treating channels with memory and feedback. First, we prove a general feedback channel coding theorem based on Massey's concept of directed information. Second, we present coding results for Markov channels. This requires determining appropriate sufficient statistics at the encoder and decoder. We give a recursive characterization of these sufficient statistics. Third, a dynamic programming framework for computing the capacity of Markov channels is presented. Fourth, it is shown that the average cost optimality equation (ACOE) can be viewed as an implicit single-letter characterization of the capacity. Fifth, scenarios with simple sufficient statistics are described. Sixth, error exponents for channels with feedback are presented.

301 citations


Journal ArticleDOI
TL;DR: Results show that Monte Carlo cost-to-go estimation reduces computation time 65% in large instances with little or no loss in solution quality, and compares results to the perfect information case from solving exact a posteriori solutions for sampled vehicle routing problems.

271 citations


Journal ArticleDOI
TL;DR: This article presents a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, and proves the optimality of the algorithm among the family of decomposition strategy algorithms (which also includes the previous fastest algorithms) by tightening the known lower bound.
Abstract: The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this article, we present a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, improving the previous best O(n^3 log n)-time algorithm. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems, together with a deeper understanding of the previous algorithms for the problem. We prove the optimality of our algorithm among the family of decomposition strategy algorithms (which also includes the previous fastest algorithms) by tightening the known lower bound of Ω(n^2 log^2 n) to Ω(n^3), matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds for decomposition strategy algorithms of Θ(nm^2 (1 + log(n/m))) when the two trees have sizes m and n and m

Journal ArticleDOI
TL;DR: An anytime algorithm to solve the coalition structure generation problem, which uses a novel representation of the search space, which partitions the space of possible solutions into sub-spaces such that it is possible to compute upper and lower bounds on the values of the best coalition structures in them.
Abstract: Coalition formation is a fundamental type of interaction that involves the creation of coherent groupings of distinct, autonomous, agents in order to efficiently achieve their individual or collective goals. Forming effective coalitions is a major research challenge in the field of multi-agent systems. Central to this endeavour is the problem of determining which of the many possible coalitions to form in order to achieve some goal. This usually requires calculating a value for every possible coalition, known as the coalition value, which indicates how beneficial that coalition would be if it was formed. Once these values are calculated, the agents usually need to find a combination of coalitions, in which every agent belongs to exactly one coalition, and by which the overall outcome of the system is maximized. However, this coalition structure generation problem is extremely challenging due to the number of possible solutions that need to be examined, which grows exponentially with the number of agents involved. To date, therefore, many algorithms have been proposed to solve this problem using different techniques -- ranging from dynamic programming, to integer programming, to stochastic search -- all of which suffer from major limitations relating to execution time, solution quality, and memory requirements. With this in mind, we develop an anytime algorithm to solve the coalition structure generation problem. Specifically, the algorithm uses a novel representation of the search space, which partitions the space of possible solutions into sub-spaces such that it is possible to compute upper and lower bounds on the values of the best coalition structures in them. These bounds are then used to identify the sub-spaces that have no potential of containing the optimal solution so that they can be pruned. 
The algorithm, then, searches through the remaining sub-spaces very efficiently using a branch-and-bound technique to avoid examining all the solutions within the searched subspace(s). In this setting, we prove that our algorithm enumerates all coalition structures efficiently by avoiding redundant and invalid solutions automatically. Moreover, in order to effectively test our algorithm we develop a new type of input distribution which allows us to generate more reliable benchmarks compared to the input distributions previously used in the field. Given this new distribution, we show that for 27 agents our algorithm is able to find solutions that are optimal in 0.175% of the time required by the fastest available algorithm in the literature. The algorithm is anytime, and if interrupted before it would have normally terminated, it can still provide a solution that is guaranteed to be within a bound from the optimal one. Moreover, the guarantees we provide on the quality of the solution are significantly better than those provided by the previous state of the art algorithms designed for this purpose. For example, for the worst case distribution given 25 agents, our algorithm is able to find a 90% efficient solution in around 10% of time it takes to find the optimal solution.

Journal ArticleDOI
TL;DR: This article introduces Gaussian process dynamic programming (GPDP), an approximate value function-based RL algorithm, and proposes to learn probabilistic models of the a priori unknown transition dynamics and the value functions on the fly.

Journal ArticleDOI
TL;DR: In this paper, a nonlinear interior-point method and discretization penalties are proposed for the solution of mixed-integer nonlinear programming (MINLP) problem associated with reactive power and voltage control in distribution systems to minimize daily energy losses, with time-related constraints being considered.
Abstract: An algorithm based on a nonlinear interior-point method and discretization penalties is proposed in this paper for the solution of the mixed-integer nonlinear programming (MINLP) problem associated with reactive power and voltage control in distribution systems to minimize daily energy losses, with time-related constraints being considered. Some of these constraints represent limits on the number of switching operations of transformer load tap changers (LTCs) and capacitors, which are modeled as discrete control variables. The discrete variables are treated here as continuous variables during the solution process, thus transforming the MINLP problem into an NLP problem that can be more efficiently solved exploiting its highly sparse matrix structure; a strategy is developed to round these variables off to their nearest discrete values, so that daily switching operation limits are properly met. The proposed method is compared with respect to other well-known MINLP solution methods, namely, a genetic algorithm and the popular GAMS MINLP solvers BARON and DICOPT. The effectiveness of the proposed method is demonstrated in the well-known PG&E 69-bus distribution network and a real distribution system in the city of Guangzhou, China, where the proposed technique has been in operation since 2003.

Journal ArticleDOI
TL;DR: This work addressed the problem of developing a model to simulate at a high level of detail the movements of over 6,000 drivers for Schneider National, the largest truckload motor carrier in the United States, and produced accurate estimates of the marginal value of 300 different types of drivers.
Abstract: We addressed the problem of developing a model to simulate at a high level of detail the movements of over 6,000 drivers for Schneider National, the largest truckload motor carrier in the United States. The goal of the model was not to obtain a better solution but rather to closely match a number of operational statistics. In addition to the need to capture a wide range of operational issues, the model had to match the performance of a highly skilled group of dispatchers while also returning the marginal value of drivers domiciled at different locations. These requirements dictated that it was not enough to optimize at each point in time (something that could be easily handled by a simulation model) but also over time. The project required bringing together years of research in approximate dynamic programming, merging math programming with machine learning, to solve dynamic programs with extremely high-dimensional state variables. The result was a model that closely calibrated against real-world operations and produced accurate estimates of the marginal value of 300 different types of drivers.

Journal ArticleDOI
TL;DR: In this article, an improved particle swarm optimization (IPSO) technique is proposed for the short-term hydrothermal scheduling problem of optimal power generation, and is applied to a multi-reservoir cascaded hydroelectric system with prohibited operating zones and a thermal unit with valve-point loading.

Journal ArticleDOI
TL;DR: This paper investigates how to sequence surgical cases in a day-care facility and applies column generation to solve this combinatorial optimization problem and proposes a dynamic programming algorithm to solve the pricing problem.

Journal ArticleDOI
TL;DR: In this article, an adaptive traffic signal controller for real-time operation is presented, which is built on approximate dynamic programming (ADP) to reduce computational burden by using an approximation to the value function of the dynamic programming.
Abstract: This paper presents a study on an adaptive traffic signal controller for real-time operation. The controller aims for three operational objectives: dynamic allocation of green time, automatic adjustment to control parameters, and fast revision of signal plans. The control algorithm is built on approximate dynamic programming (ADP). This approach substantially reduces computational burden by using an approximation to the value function of the dynamic programming and reinforcement learning to update the approximation. We investigate temporal-difference learning and perturbation learning as specific learning techniques for the ADP approach. We find in computer simulation that the ADP controllers achieve substantial reduction in vehicle delays in comparison with optimised fixed-time plans. Our results show that substantial benefits can be gained by increasing the frequency at which the signal plans are revised, which can be achieved conveniently using the ADP approach.
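
The idea of updating a value-function approximation by temporal-difference learning, as described above, can be sketched in a few lines of Python. This is a generic TD(0) update for a linear approximation, not the controller from the paper: the function name `td0_update`, the feature vectors, and the step size are hypothetical, and in a traffic setting the `reward` argument would typically be a (negative) delay cost.

```python
from typing import List, Sequence


def td0_update(
    weights: Sequence[float],
    features: Sequence[float],
    reward: float,
    next_features: Sequence[float],
    step_size: float,
    discount: float,
) -> List[float]:
    """One TD(0) step for a linear value approximation V(s) = w . phi(s)."""
    value = sum(w * f for w, f in zip(weights, features))
    next_value = sum(w * f for w, f in zip(weights, next_features))
    # Temporal-difference error: observed one-step return minus current estimate.
    td_error = reward + discount * next_value - value
    # Move the weights along the features of the state that was just left.
    return [w + step_size * td_error * f for w, f in zip(weights, features)]


# Hypothetical single transition between two one-hot encoded states.
new_weights = td0_update(
    weights=[0.0, 0.0],
    features=[1.0, 0.0],
    reward=1.0,
    next_features=[0.0, 1.0],
    step_size=0.5,
    discount=0.9,
)
```

Because each update touches only a short weight vector rather than a full table of states, this kind of scheme is one way the computational burden of exact dynamic programming can be reduced.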

Journal ArticleDOI
TL;DR: This article provides a brief review of approximate dynamic programming, and how it should be approached from the perspective of different problem classes to make better decisions over time.
Abstract: Approximate dynamic programming (ADP) is a broad umbrella for a modeling and algorithmic strategy for solving problems that are sometimes large and complex, and are usually (but not always) stochastic. It is most often presented as a method for overcoming the classic curse of dimensionality that is well-known to plague the use of Bellman's equation. For many problems, there are actually up to three curses of dimensionality. But the richer message of approximate dynamic programming is learning what to learn, and how to learn it, to make better decisions over time. This article provides a brief review of approximate dynamic programming, without intending to be a complete tutorial. Instead, our goal is to provide a broader perspective of ADP and how it should be approached from the perspective of different problem classes. © 2009 Wiley Periodicals, Inc. Naval Research Logistics 56: 239-249, 2009
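
For reference, Bellman's equation mentioned above is commonly written in the following finite-horizon form. The notation is an assumption chosen for illustration and is not taken from the article: S_t is the state, x_t the decision, C the one-period contribution, and γ a discount factor.

```latex
V_t(S_t) = \max_{x_t \in \mathcal{X}_t}
  \Big( C(S_t, x_t) + \gamma \, \mathbb{E}\big[ V_{t+1}(S_{t+1}) \mid S_t, x_t \big] \Big)
```

Approximate dynamic programming replaces the exact V_{t+1} on the right-hand side with an estimate that is updated as decisions are simulated or observed, which avoids enumerating every state.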

Journal ArticleDOI
TL;DR: In this paper, a semi-definite programming (SDP) model for the security-constrained unit commitment (SCUC) problem is described, which is solved directly by the interior-point method for SDP in polynomial time.
Abstract: Considering the economics and security of power system operation, a semi-definite programming (SDP) model for the security-constrained unit commitment (SCUC) problem is described here, which is solved directly by the interior-point method for SDP in polynomial time. The proposed method is promising for SCUC problems because of its excellent convergence and its ability to handle the non-convex integer variables. No model decomposition or initial relaxation is needed when applying the SDP-based method. When the solution contains minor mismatches in the integer variables, a simple rounding strategy is used to correct the non-integer values to integers efficiently. Different test cases from 6 to 118 buses over a 24 h horizon are presented. Extensive numerical simulations have shown that the proposed method is capable of obtaining the optimal UC schedules without any network and bus voltage violations, and minimising the operation cost as well.

01 Jan 2009
TL;DR: A dynamic programming approach for the vehicle routing problem with time windows including the EC social legislation on drivers' driving and working hours is proposed, which includes all optional rules in these legislations, which are generally ignored in the literature.
Abstract: In practice, apart from the problem of vehicle routing, schedulers also face the problem of finding feasible driver schedules complying with complex restrictions on drivers' driving and working hours. To address this complex interdependent problem of vehicle routing and break scheduling, we propose a dynamic programming approach for the vehicle routing problem with time windows including the EC social legislation on drivers' driving and working hours. Our algorithm includes all optional rules in these legislations, which are generally ignored in the literature. To include the legislation in the dynamic programming algorithm we propose a break scheduling method that does not increase the time-complexity of the algorithm. This is a remarkable effect that generally does not hold for local search methods, which have proved to be very successful in solving less restricted vehicle routing problems. Computational results show that our method finds solutions to benchmark instances with 18% less vehicles and 5% less travel distance than state of the art approaches. Furthermore, they show that including all optional rules of the legislation leads to an additional reduction of 4% in the number of vehicles and of 1.5% regarding the travel distance. Therefore, the optional rules should be exploited in practice.

Journal ArticleDOI
TL;DR: In this paper, the authors combine the dynamic programming (DP) solution algorithm with the Bayesian Markov chain Monte Carlo algorithm into a single algorithm that solves the DP problem and estimates the parameters simultaneously.
Abstract: We propose a new methodology for structural estimation of infinite horizon dynamic discrete choice models. We combine the dynamic programming (DP) solution algorithm with the Bayesian Markov chain Monte Carlo algorithm into a single algorithm that solves the DP problem and estimates the parameters simultaneously. As a result, the computational burden of estimating a dynamic model becomes comparable to that of a static model. Another feature of our algorithm is that even though the number of grid points on the state variable is small per solution-estimation iteration, the number of effective grid points increases with the number of estimation iterations. This is how we help ease the “curse of dimensionality.” We simulate and estimate several versions of a simple model of entry and exit to illustrate our methodology. We also prove that under standard conditions, the parameters converge in probability to the true posterior distribution, regardless of the starting values.

Journal ArticleDOI
TL;DR: This work compares different strategies proposed in the literature to guide decremental state space relaxation and proposes a new heuristic technique to initialize the critical vertex set and provides experimental evidence of its effectiveness.

Proceedings ArticleDOI
14 Jun 2009
TL;DR: This paper addresses exact learning of Bayesian network structure from data and expert's knowledge based on score functions that are decomposable by presenting a branch and bound algorithm that integrates parameter and structural constraints with data in a way to guarantee global optimality with respect to the score function.
Abstract: This paper addresses exact learning of Bayesian network structure from data and expert's knowledge based on score functions that are decomposable. First, it describes useful properties that strongly reduce the time and memory costs of many known methods such as hill-climbing, dynamic programming and sampling variable orderings. Secondly, a branch and bound algorithm is presented that integrates parameter and structural constraints with data in a way to guarantee global optimality with respect to the score function. It is an any-time procedure because, if stopped, it provides the best current solution and an estimation about how far it is from the global solution. We show empirically the advantages of the properties and the constraints, and the applicability of the algorithm to large data sets (up to one hundred variables) that cannot be handled by other current methods (limited to around 30 variables).

Journal ArticleDOI
TL;DR: The main idea of the approach relies on the use of several complementary dominance relations to discard partial solutions that cannot lead to new non-dominated criterion vectors to obtain an efficient method that outperforms the existing methods both in terms of CPU time and size of solved instances.

Journal ArticleDOI
TL;DR: A novel method for solving the unit commitment (UC) problem based on quantum-inspired evolutionary algorithm (QEA) to handle the unit-scheduling problem and the Lambda-iteration technique to solve the economic dispatch problem is presented.
Abstract: This paper presents a novel method for solving the unit commitment (UC) problem based on quantum-inspired evolutionary algorithm (QEA). The proposed method applies QEA to handle the unit-scheduling problem and the Lambda-iteration technique to solve the economic dispatch problem. The QEA method is based on the concept and principles of quantum computing, such as quantum bits, quantum gates and superposition of states. QEA employs quantum bit representation, which has better population diversity compared with other representations used in evolutionary algorithms, and uses quantum gate to drive the population towards the best solution. The mechanism of QEA can inherently treat the balance between exploration and exploitation and also achieve better quality of solutions, even with a small population. The proposed method is applied to systems with the number of generating units in the range of 10 to 100 in a 24-hour scheduling horizon and is compared to conventional methods in the literature. Moreover, the proposed method is extended to solve a large-scale UC problem in which 100 units are scheduled over a seven-day horizon with unit ramp-rate limits considered. The application studies have demonstrated the superior performance and feasibility of the proposed algorithm.
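
The Lambda-iteration technique mentioned above for the economic dispatch subproblem can be sketched as follows. This is a textbook-style bisection variant under stated assumptions, not the paper's implementation: each unit is assumed to have a quadratic fuel cost a + b*P + c*P^2 with c > 0, and the function name `lambda_iteration` and all numbers in the example are hypothetical.

```python
from typing import List, Sequence, Tuple


def lambda_iteration(
    b: Sequence[float],
    c: Sequence[float],
    p_min: Sequence[float],
    p_max: Sequence[float],
    demand: float,
    tol: float = 1e-9,
    max_iter: int = 200,
) -> Tuple[float, List[float]]:
    """Find the incremental cost lambda at which total output meets demand."""
    if not sum(p_min) <= demand <= sum(p_max):
        raise ValueError("demand is outside the combined unit limits")

    def outputs(lam: float) -> List[float]:
        # Each unit runs where its marginal cost b + 2*c*P equals lambda,
        # clipped to its operating limits.
        return [
            min(max((lam - bi) / (2.0 * ci), lo), hi)
            for bi, ci, lo, hi in zip(b, c, p_min, p_max)
        ]

    # Bracket lambda between the lowest and highest marginal costs.
    lam_lo = min(bi + 2.0 * ci * lo for bi, ci, lo in zip(b, c, p_min))
    lam_hi = max(bi + 2.0 * ci * hi for bi, ci, hi in zip(b, c, p_max))

    lam = 0.5 * (lam_lo + lam_hi)
    for _ in range(max_iter):
        lam = 0.5 * (lam_lo + lam_hi)
        mismatch = sum(outputs(lam)) - demand
        if abs(mismatch) < tol:
            break
        if mismatch > 0:
            lam_hi = lam  # too much generation: lower lambda
        else:
            lam_lo = lam  # too little generation: raise lambda
    return lam, outputs(lam)


# Hypothetical two-unit system serving a 90 MW load.
lam, dispatch = lambda_iteration(
    b=[10.0, 12.0], c=[0.05, 0.1], p_min=[0.0, 0.0], p_max=[100.0, 100.0], demand=90.0
)
```

In a unit commitment method of the kind described above, a dispatch routine like this would be called for each candidate on/off schedule to evaluate its production cost.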

Proceedings Article
01 Jan 2009
TL;DR: The need of the partial knowledge of the nonlinear system dynamics is relaxed in the development of a novel approach to ADP using a two part process: online system identification and offline optimal control training.
Abstract: The optimal control of linear systems accompanied by quadratic cost functions can be achieved by solving the well-known Riccati equation. However, the optimal control of nonlinear discrete-time systems is a much more challenging task that often requires solving the nonlinear Hamilton-Jacobi-Bellman (HJB) equation. In the recent literature, discrete-time approximate dynamic programming (ADP) techniques have been widely used to determine the optimal or near optimal control policies for affine nonlinear discrete-time systems. However, an inherent assumption of ADP requires the value of the controlled system one step ahead and at least partial knowledge of the system dynamics to be known. In this work, the need of the partial knowledge of the nonlinear system dynamics is relaxed in the development of a novel approach to ADP using a two part process: online system identification and offline optimal control training. First, in the system identification process, a neural network (NN) is tuned online using novel tuning laws to learn the complete plant dynamics so that a local asymptotic stability of the identification error can be shown. Then, using only the learned NN system model, offline ADP is attempted resulting in a novel optimal control law. The proposed scheme does not require explicit knowledge of the system dynamics as only the learned NN model is needed. The proof of convergence is demonstrated. Simulation results verify theoretical conjecture.

Journal ArticleDOI
TL;DR: This paper employs a new evolutionary algorithm known as bacterial foraging (BF) for solving the unit commitment problem; this new integer-coded algorithm is based on the foraging behavior of E. coli bacteria in the human intestine.
Abstract: The unit commitment (UC) problem is one of the most difficult optimization problems in power systems, because it has many variables and constraints. The objective is to minimize the total production cost over the scheduling horizon while satisfying the constraints. This paper employs a new evolutionary algorithm known as bacterial foraging (BF) for solving the UC problem. This new integer-coded algorithm is based on the foraging behavior of E. coli bacteria in the human intestine. By integer coding of the problem, computation time decreases and the minimum up/down-time constraints may be coded directly, and therefore, there is no need to use penalty functions for these constraints. From simulation results, satisfactory solutions are obtained in comparison with previously reported results.

Journal ArticleDOI
TL;DR: A new derivation of the dynamic programming equation for general stochastic target problems with unbounded controls is provided, together with the appropriate boundary conditions, which are applied to the problem of quantile hedging in financial mathematics.
Abstract: We consider the problem of finding the minimal initial data of a controlled process which guarantees to reach a controlled target with a given probability of success or, more generally, with a given level of expected loss. By suitably increasing the state space and the controls, we show that this problem can be converted into a stochastic target problem, i.e., finding the minimal initial data of a controlled process which guarantees to reach a controlled target with probability one. Unlike in the existing literature on stochastic target problems, our increased controls are valued in an unbounded set. In this paper, we provide a new derivation of the dynamic programming equation for general stochastic target problems with unbounded controls, together with the appropriate boundary conditions. These results are applied to the problem of quantile hedging in financial mathematics and are shown to recover the explicit solution of Follmer and Leukert [Finance Stoch., 3 (1999), pp. 251-273].

Journal ArticleDOI
TL;DR: This paper introduces and studies real-time vehicle rerouting problems with time windows, applicable to delivery and/or pickup services that undergo service disruptions due to vehicle breakdowns, and develops a Lagrangian relaxation based-heuristic which performs very well.