scispace - formally typeset

Showing papers on "Dynamic programming published in 2012"


Book
05 Jun 2012
TL;DR: In this book, the authors survey the central algorithmic techniques for designing approximation algorithms, including greedy and local search algorithms, dynamic programming, linear and semidefinite programming, and randomization.
Abstract: Discrete optimization problems are everywhere, from traditional operations research planning problems, such as scheduling, facility location, and network design; to computer science problems in databases; to advertising issues in viral marketing. Yet most such problems are NP-hard. Thus unless P = NP, there are no efficient algorithms to find optimal solutions to such problems. This book shows how to design approximation algorithms: efficient algorithms that find provably near-optimal solutions. The book is organized around central algorithmic techniques for designing approximation algorithms, including greedy and local search algorithms, dynamic programming, linear and semidefinite programming, and randomization. Each chapter in the first part of the book is devoted to a single algorithmic technique, which is then applied to several different problems. The second part revisits the techniques but offers more sophisticated treatments of them. The book also covers methods for proving that optimization problems are hard to approximate. Designed as a textbook for graduate-level algorithms courses, the book will also serve as a reference for researchers interested in the heuristic solution of discrete optimization problems.
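As a concrete instance of the greedy technique the book covers, the classic set-cover heuristic picks, at every step, the set covering the most still-uncovered elements, which yields an H_n ≈ ln n approximation guarantee. A minimal sketch (the instance below is hypothetical, not from the book):

```python
def greedy_set_cover(universe, subsets):
    """Greedy approximation for set cover: repeatedly pick the subset
    covering the most still-uncovered elements (H_n-approximation)."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("some elements cannot be covered")
        chosen.append(best)
        uncovered -= best
    return chosen

cover = greedy_set_cover({1, 2, 3, 4, 5},
                         [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}])
print(cover)  # two sets cover all five elements
```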

759 citations


Journal ArticleDOI
TL;DR: This paper presents a novel policy iteration approach for finding online adaptive optimal controllers for continuous-time linear systems with completely unknown system dynamics, using the approximate/adaptive dynamic programming technique to iteratively solve the algebraic Riccati equation using the online information of state and input.

723 citations


Journal ArticleDOI
TL;DR: An intelligent-optimal control scheme for unknown nonaffine nonlinear discrete-time systems with a discount factor in the cost function is developed and implemented via the globalized dual heuristic programming technique.

360 citations


Book
27 Sep 2012
TL;DR: A graduate text on stochastic control and optimal stopping via dynamic programming, covering verification arguments, viscosity solutions of the dynamic programming equation, stochastic target problems, backward SDEs, and probabilistic numerical methods for nonlinear PDEs.
Abstract: Preface.- 1. Conditional Expectation and Linear Parabolic PDEs.- 2. Stochastic Control and Dynamic Programming.- 3. Optimal Stopping and Dynamic Programming.- 4. Solving Control Problems by Verification.- 5. Introduction to Viscosity Solutions.- 6. Dynamic Programming Equation in the Viscosity Sense.- 7. Stochastic Target Problems.- 8. Second Order Stochastic Target Problems.- 9. Backward SDEs and Stochastic Control.- 10. Quadratic Backward SDEs.- 11. Probabilistic Numerical Methods for Nonlinear PDEs.- 12. Introduction to Finite Differences Methods.- References.
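The backbone of chapters 2 through 6 is the dynamic programming (Hamilton-Jacobi-Bellman) equation; in generic notation for a controlled diffusion with drift b, diffusion σ, running reward f, and terminal reward g (symbols assumed here, not taken from the book), it reads:

```latex
% Dynamic programming (HJB) equation for a finite-horizon control problem
-\partial_t v(t,x)
  - \sup_{u \in U} \Big\{ b(x,u) \cdot D_x v(t,x)
  + \tfrac{1}{2}\operatorname{Tr}\!\big[\sigma\sigma^{\top}(x,u)\, D_x^2 v(t,x)\big]
  + f(x,u) \Big\} = 0,
\qquad v(T,x) = g(x).
```

When v fails to be smooth, the equation is understood in the viscosity sense, the subject of chapters 5 and 6.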

244 citations


Proceedings Article
22 Jul 2012
TL;DR: Time-critical influence maximization under the time-delayed IC model maintains desired properties such as submodularity, which allows a greedy algorithm to circumvent the NP-hardness of the problem and achieve an approximation ratio of 1 - 1/e.
Abstract: Influence maximization is a problem of finding a small set of highly influential users in a social network such that the spread of influence under certain propagation models is maximized. In this paper, we consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline. Since timing is considered in the optimization, we also extend the Independent Cascade (IC) model to incorporate the time delay aspect of influence diffusion in social networks. We show that time-critical influence maximization under the time-delayed IC model maintains desired properties such as submodularity, which allows a greedy algorithm to circumvent the NP-hardness of the problem and achieve an approximation ratio of 1 - 1/e. To overcome the inefficiency of the approximation algorithm, we design two heuristic algorithms: the first one is based on a dynamic programming procedure that computes exact influence in tree structures, while the second one converts the problem to one in the original IC model and then applies existing fast heuristics to it. Our simulation results demonstrate that our heuristics achieve the same level of influence spread as the greedy algorithm while running a few orders of magnitude faster, and they also outperform existing algorithms that disregard the deadline constraint and delays in diffusion.
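The greedy algorithm the paper starts from can be sketched with Monte-Carlo spread estimation. The model below (uniform random integer delays, a fixed propagation probability p, and all function and variable names) is a deliberate simplification of the paper's meeting-probability model, kept only to show the deadline-aware greedy loop:

```python
import random

def simulate_spread(graph, seeds, deadline, p=0.1, max_delay=3, runs=200):
    """Monte-Carlo estimate of expected influence spread under a simple
    time-delayed IC model: activations arriving after the deadline
    do not count."""
    total = 0
    for _ in range(runs):
        active = {s: 0 for s in seeds}          # node -> activation time
        frontier = list(seeds)
        while frontier:
            nxt = []
            for u in frontier:
                for v in graph.get(u, []):
                    if v not in active and random.random() < p:
                        t = active[u] + random.randint(1, max_delay)
                        if t <= deadline:       # deadline-aware counting
                            active[v] = t
                            nxt.append(v)
            frontier = nxt
        total += len(active)
    return total / runs

def greedy_seeds(graph, k, deadline):
    """Greedy seed selection: add the node with the largest marginal
    estimated spread (the 1 - 1/e guarantee relies on submodularity)."""
    seeds = []
    for _ in range(k):
        best = max((v for v in graph if v not in seeds),
                   key=lambda v: simulate_spread(graph, seeds + [v], deadline))
        seeds.append(best)
    return seeds

random.seed(7)
g = {0: [1, 2, 3], 1: [4], 2: [4], 3: [], 4: []}
print(greedy_seeds(g, 2, deadline=4))
```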

244 citations


Book
01 Jan 2012
TL;DR: A book on approximate dynamic programming.

235 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a novel convex modeling approach which allows for a simultaneous optimization of battery size and energy management of a plug-in hybrid powertrain by solving a semidefinite convex problem.

234 citations


Journal ArticleDOI
TL;DR: The iterative adaptive dynamic programming algorithm using the globalized dual heuristic programming technique is introduced to obtain, forward-in-time, the optimal controller for a class of unknown discrete-time nonlinear systems, with convergence analysis in terms of cost function and control law.
Abstract: In this paper, a neuro-optimal control scheme for a class of unknown discrete-time nonlinear systems with discount factor in the cost function is developed. The iterative adaptive dynamic programming algorithm using the globalized dual heuristic programming technique is introduced to obtain the optimal controller with convergence analysis in terms of cost function and control law. In order to carry out the iterative algorithm, a neural network is constructed first to identify the unknown controlled system. Then, based on the learned system model, two other neural networks are employed as parametric structures to facilitate the implementation of the iterative algorithm, which aims at approximating at each iteration the cost function and its derivatives and the control law, respectively. Finally, a simulation example is provided to verify the effectiveness of the proposed optimal control approach. Note to Practitioners: The increasing complexity of real-world industry processes inevitably leads to the occurrence of nonlinearity and high dimensions, and their mathematical models are often difficult to build. How to design the optimal controller for nonlinear systems without the requirement of knowing the explicit model has become one of the main foci of control practitioners. However, this problem cannot be handled by only relying on the traditional dynamic programming technique because of the "curse of dimensionality". To make things worse, the backward direction of the solving process of dynamic programming precludes its wide application in practice. Therefore, in this paper, the iterative adaptive dynamic programming algorithm is proposed to deal with the optimal control problem for a class of unknown nonlinear systems forward-in-time. Moreover, the detailed implementation of the iterative ADP algorithm through the globalized dual heuristic programming technique is also presented by using neural networks.
Finally, the effectiveness of the control strategy is illustrated via simulation study.
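The optimal cost that the iterative algorithm approximates satisfies the discrete-time Bellman (dynamic programming) equation; in generic notation with utility U, discount factor γ, and dynamics F (symbols assumed, not necessarily the paper's):

```latex
% Discounted Bellman equation and a value-iteration-style recursion
J^{*}(x_k) = \min_{u_k} \big\{ U(x_k, u_k) + \gamma\, J^{*}(x_{k+1}) \big\},
\qquad x_{k+1} = F(x_k, u_k), \quad 0 < \gamma \le 1,
\\[4pt]
V_{i+1}(x_k) = \min_{u} \big\{ U(x_k, u) + \gamma\, V_i\big(F(x_k, u)\big) \big\}.
```

The globalized dual heuristic programming variant trains the critic to approximate both $V_i$ and its derivative $\partial V_i / \partial x_k$, alongside an action network for the control law.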

229 citations


Journal ArticleDOI
TL;DR: The Hamilton-Jacobi-Bellman equation is solved forward-in-time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations and the end result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.
Abstract: In this paper, the Hamilton-Jacobi-Bellman equation is solved forward-in-time for the optimal control of a class of general affine nonlinear discrete-time systems without using value and policy iterations. The proposed approach, referred to as adaptive dynamic programming, uses two neural networks (NNs), to solve the infinite horizon optimal regulation control of affine nonlinear discrete-time systems in the presence of unknown internal dynamics and a known control coefficient matrix. One NN approximates the cost function and is referred to as the critic NN, while the second NN generates the control input and is referred to as the action NN. The cost function and policy are updated once at the sampling instant and thus the proposed approach can be referred to as time-based ADP. Novel update laws for tuning the unknown weights of the NNs online are derived. Lyapunov techniques are used to show that all signals are uniformly ultimately bounded and that the approximated control signal approaches the optimal control input with small bounded error over time. In the absence of disturbances, an optimal control is demonstrated. Simulation results are included to show the effectiveness of the approach. The end result is the systematic design of an optimal controller with guaranteed convergence that is suitable for hardware implementation.

217 citations


Journal ArticleDOI
TL;DR: This paper presents the detailed design architecture and its associated learning algorithm to explain how effective learning and optimization can be achieved in this new ADP architecture and test the performance both on the cart-pole balancing task and the triple-link inverted pendulum balancing task.

208 citations


Journal ArticleDOI
TL;DR: A finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems and three neural networks are used as parametric structures to implement the algorithm, which aims at approximating the cost function, the control law, and the error dynamics.

Journal ArticleDOI
TL;DR: Lower bounds for the optimal total cost are established using results in dynamic programming and the fundamental limits on the maximum achievable information acquisition rate and the optimal reliability are characterized.
Abstract: Consider a decision maker who is responsible to dynamically collect observations so as to enhance his information about an underlying phenomenon of interest in a speedy manner while accounting for the penalty of wrong declaration. Due to the sequential nature of the problem, the decision maker relies on his current information state to adaptively select the most "informative" sensing action among the available ones. In this paper, using results in dynamic programming, lower bounds for the optimal total cost are established. The lower bounds characterize the fundamental limits on the maximum achievable information acquisition rate and the optimal reliability. Moreover, upper bounds are obtained via an analysis of two heuristic policies for dynamic selection of actions. It is shown that the first proposed heuristic achieves asymptotic optimality, where the notion of asymptotic optimality, due to Chernoff, implies that the relative difference between the total cost achieved by the proposed policy and the optimal total cost approaches zero as the penalty of wrong declaration (hence the number of collected samples) increases. The second heuristic is shown to achieve asymptotic optimality only in a limited setting such as the problem of a noisy dynamic search. However, by considering the dependency on the number of hypotheses, under a technical condition, this second heuristic is shown to achieve a nonzero information acquisition rate, establishing a lower bound for the maximum achievable rate and error exponent. In the case of a noisy dynamic search with size-independent noise, the obtained nonzero rate and error exponent are shown to be maximum.

Book
14 Dec 2012
TL;DR: Adaptive Dynamic Programming in Discrete Time, as discussed by the authors, applies adaptive dynamic programming (ADP) to the optimal control of nonlinear discrete-time systems, going beyond stabilization alone.
Abstract: There are many methods of stable controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming in Discrete Time approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is extensive; affine, switched, singularly perturbed and time-delay nonlinear systems are discussed as are the uses of neural networks and techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization and for tracking and games benefit from the incorporation of optimal control methods: infinite-horizon control for which the difficulty of solving Hamilton-Jacobi-Bellman partial differential equations directly is overcome, and proof provided that the iterative value function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences; finite-horizon control, implemented in discrete-time nonlinear systems showing the reader how to obtain suboptimal control solutions within a fixed number of control steps and with results more easily applied in real systems than those usually gained from infinite-horizon control; nonlinear games for which a pair of mixed optimal policies are derived for solving games both when the saddle point does not exist, and, when it does, avoiding the existence conditions of the saddle point. Non-zero-sum games are studied in the context of a single network scheme in which policies are obtained guaranteeing system stability and minimizing the individual performance function yielding a Nash equilibrium.
In order to make the coverage suitable for the student as well as for the expert reader, Adaptive Dynamic Programming in Discrete Time: establishes the fundamental theory involved clearly with each chapter devoted to a clearly identifiable control paradigm; demonstrates convergence proofs of the ADP algorithms to deepen understanding of the derivation of stability and convergence with the iterative computational methods used; and shows how ADP methods can be put to use both in simulation and in real applications. This text will be of considerable interest to researchers interested in optimal control and its applications in operations research, applied mathematics, computational intelligence, and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their study.

Proceedings ArticleDOI
01 Dec 2012
TL;DR: A procedure from probabilistic model checking is used to combine the system model with an automaton representing the specification and this new MDP is transformed into an equivalent form that satisfies assumptions for stochastic shortest path dynamic programming.
Abstract: We present a method for designing a robust control policy for an uncertain system subject to temporal logic specifications. The system is modeled as a finite Markov Decision Process (MDP) whose transition probabilities are not exactly known but are known to belong to a given uncertainty set. A robust control policy is generated for the MDP that maximizes the worst-case probability of satisfying the specification over all transition probabilities in this uncertainty set. To this end, we use a procedure from probabilistic model checking to combine the system model with an automaton representing the specification. This new MDP is then transformed into an equivalent form that satisfies assumptions for stochastic shortest path dynamic programming. A robust version of dynamic programming solves for an ε-suboptimal robust control policy with time complexity O(log(1/ε)) times that for the non-robust case.
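The robust dynamic programming recursion takes the worst case over the uncertainty set inside each Bellman backup. A minimal sketch under two simplifying assumptions of our own (a finite uncertainty set of candidate distributions and a discounted cost; the paper's stochastic-shortest-path transformation and temporal-logic layer are omitted):

```python
def robust_value_iteration(n_states, actions, cost, P_sets, gamma=0.95,
                           iters=1000, tol=1e-9):
    """Robust value iteration: the adversary picks, per (state, action),
    the worst transition distribution from a finite uncertainty set.
    P_sets[(s, a)] is a list of candidate distributions over next states."""
    V = [0.0] * n_states
    for _ in range(iters):
        V_new = []
        for s in range(n_states):
            best = float("inf")
            for a in actions:
                # Inner maximization: worst-case expected cost-to-go.
                worst = max(sum(p[s2] * V[s2] for s2 in range(n_states))
                            for p in P_sets[(s, a)])
                best = min(best, cost[(s, a)] + gamma * worst)
            V_new.append(best)
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new
    return V

cost = {(0, "a"): 1.0, (1, "a"): 0.0}
P_sets = {(0, "a"): [[0.0, 1.0], [1.0, 0.0]],   # adversary may keep us in state 0
          (1, "a"): [[0.0, 1.0]]}               # state 1 is absorbing and free
V = robust_value_iteration(2, ["a"], cost, P_sets)
print(V)  # V[0] converges toward 1/(1 - 0.95) = 20 in the worst case
```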

Journal ArticleDOI
TL;DR: Wind generation performance can be enhanced and adapted to load demand, yielding an increased economic gain measured by the difference between the revenue obtained with and without the proposed generation-shifting policy.
Abstract: The paper proposes the modeling and the optimal management of a hot-temperature (sodium nickel chloride) battery system coupled with wind generators connected to a medium voltage grid. A discrete-time model of the storage device reproducing the battery main dynamics (i.e., state of charge, temperature, current, protection, and limitation systems) has been developed. The model has been validated through some experimental tests. An optimal management strategy has been implemented based on a forward dynamic programming algorithm, specifically developed to exploit the energy price arbitrage along the optimization time horizon (“generation shifting”). Taking advantage of this strategy, wind generation performance can be enhanced and adapted to load demand, obtaining an increased economic gain measured by the difference between the economic revenue obtained with and without the proposed generation shifting policy.
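The forward dynamic program behind price arbitrage can be sketched over a discretized state of charge. This is a toy under assumptions of our own (integer SoC levels, lossless charging, illustrative prices); the paper's battery dynamics, temperature, protection and limitation systems are deliberately omitted:

```python
def generation_shifting(prices, capacity, max_step):
    """Forward DP over hours: best[soc] is the maximum arbitrage revenue
    achievable while ending the hour at state-of-charge soc."""
    n_levels = capacity + 1
    NEG = float("-inf")
    best = [NEG] * n_levels
    best[0] = 0.0                    # start with an empty battery
    parent = []                      # per hour: delta used to reach each soc
    for price in prices:
        new = [NEG] * n_levels
        choice = [0] * n_levels
        for soc in range(n_levels):
            if best[soc] == NEG:
                continue
            for delta in range(-max_step, max_step + 1):
                nxt = soc + delta
                if 0 <= nxt < n_levels:
                    # delta > 0 charges (buy energy), delta < 0 discharges (sell)
                    rev = best[soc] - delta * price
                    if rev > new[nxt]:
                        new[nxt] = rev
                        choice[nxt] = delta
        parent.append(choice)
        best = new
    soc = max(range(n_levels), key=lambda s: best[s])
    revenue = best[soc]
    plan = []
    for choice in reversed(parent):  # backtrack the hourly charging plan
        delta = choice[soc]
        plan.append(delta)
        soc -= delta
    plan.reverse()
    return revenue, plan

rev, plan = generation_shifting([1.0, 5.0], capacity=2, max_step=2)
print(rev, plan)  # buys 2 units at price 1, sells them at price 5: 8.0 [2, -2]
```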


Journal ArticleDOI
TL;DR: In this paper, a point absorber WEC employing a hydraulic/electric power take-off system is formulated as an optimal control problem with a disturbance input (the sea elevation) and with both state and input constraints.

Journal ArticleDOI
TL;DR: This paper proposes a new model that efficiently segments common objects from multiple images by partitioning each image into local regions, building a digraph from local-region similarities and saliency maps, and solving the resulting co-segmentation problem as a shortest-path problem via dynamic programming.
Abstract: Segmenting common objects that have variations in color, texture and shape is a challenging problem. In this paper, we propose a new model that efficiently segments common objects from multiple images. We first segment each original image into a number of local regions. Then, we construct a digraph based on local region similarities and saliency maps. Finally, we formulate the co-segmentation problem as the shortest path problem, and we use the dynamic programming method to solve the problem. The experimental results demonstrate that the proposed model can efficiently segment the common objects from a group of images with generally lower error rate than many existing and conventional co-segmentation methods.
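The shortest-path reduction can be sketched with the standard DP over a topological order of the digraph; the node indices and edge weights below are hypothetical stand-ins for the paper's region-similarity graph:

```python
from collections import deque

def dag_shortest_path(n, edges, src, dst):
    """Shortest path in a DAG by dynamic programming: relax edges in
    topological order, then backtrack predecessors to recover the path."""
    adj = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v, w in edges:
        adj[u].append((v, w))
        indeg[v] += 1
    order = deque(i for i in range(n) if indeg[i] == 0)
    topo = []
    while order:                      # Kahn's algorithm for the topo order
        u = order.popleft()
        topo.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                order.append(v)
    INF = float("inf")
    dist = [INF] * n
    prev = [None] * n
    dist[src] = 0
    for u in topo:                    # the DP recurrence, one pass suffices
        if dist[u] == INF:
            continue
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return dist[dst], path[::-1]

d, path = dag_shortest_path(4, [(0, 1, 1), (0, 2, 5), (1, 2, 1),
                                (1, 3, 6), (2, 3, 1)], 0, 3)
print(d, path)  # → 3 [0, 1, 2, 3]
```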

Journal ArticleDOI
TL;DR: Experimental results obtained by the simulation of different traffic scenarios show that the AIM based on ACS outperforms the traditional traffic lights and other recent traffic control strategies.
Abstract: Autonomous intersection management (AIM) is an innovative concept for directing vehicles through the intersections. AIM assumes that the vehicles negotiate the right-of-way. This assumption makes the problem of the intersection management significantly different from the usually studied ones such as the optimization of the cycle time, splits, and offsets. The main difficulty is to define a strategy that improves the traffic efficiency. Indeed, due to the fact that each vehicle is considered individually, AIM faces a combinatorial optimization problem that needs quick and efficient solutions for a real time application. This paper proposes a strategy that evacuates vehicles as soon as possible for each sequence of vehicle arrivals. The dynamic programming (DP) that gives the optimal solution is shown to be greedy. A combinatorial explosion is observed if the number of lanes rises. After evaluating the time complexity of the DP, the paper proposes an ant colony system (ACS) to solve the control problem for large number of vehicles and lanes. The complete investigation shows that the proposed ACS algorithm is robust and efficient. Experimental results obtained by the simulation of different traffic scenarios show that the AIM based on ACS outperforms the traditional traffic lights and other recent traffic control strategies.

Journal ArticleDOI
TL;DR: A Dantzig-Wolfe decomposition approach is proposed, which enables the uncertainty of the data to be encapsulated in the column generation subproblem, and a dynamic programming algorithm is proposed to solve the subproblem with data uncertainty.
Abstract: In this article, we investigate the vehicle routing problem with deadlines, whose goal is to satisfy the requirements of a given number of customers with minimum travel distances while respecting both of the deadlines of the customers and vehicle capacity. It is assumed that the travel time between any two customers and the demands of the customer are uncertain. Two types of uncertainty sets with adjustable parameters are considered for the possible realizations of travel time and demand. The robustness of a solution against the uncertain data can be achieved by making the solution feasible for any travel time and demand defined in the uncertainty sets. We propose a Dantzig-Wolfe decomposition approach, which enables the uncertainty of the data to be encapsulated in the column generation subproblem. A dynamic programming algorithm is proposed to solve the subproblem with data uncertainty. The results of computational experiments involving two well-known test problems show that the robustness of the solution can be greatly improved.

Journal ArticleDOI
TL;DR: An iterative control algorithm is given to devise a decentralized optimal controller that globally asymptotically stabilizes the system in question and is demonstrated via the online learning control of multimachine power systems with governor controllers.
Abstract: This brief presents a new approach to decentralized control design of complex systems with unknown parameters and dynamic uncertainties. A key strategy is to use the theory of robust adaptive dynamic programming and the policy iteration technique. An iterative control algorithm is given to devise a decentralized optimal controller that globally asymptotically stabilizes the system in question. Stability analysis is accomplished by means of the small-gain theorem. The effectiveness of the proposed computational control algorithm is demonstrated via the online learning control of multimachine power systems with governor controllers.

Journal ArticleDOI
TL;DR: This paper considers high-speed control of constrained linear parameter-varying systems using model predictive control, and gathers previous developments and provides new material such as a proof for the optimality of the solution, or, in the case of close-to-optimal solutions, a procedure to determine a bound on the suboptimality ofThe solution.
Abstract: This paper considers high-speed control of constrained linear parameter-varying systems using model predictive control. Existing model predictive control schemes for control of constrained linear parameter-varying systems typically require the solution of a semi-definite program at each sampling instance. Recently, variants of explicit model predictive control were proposed for linear parameter-varying systems with polytopic representation, decreasing the online computational effort by orders of magnitude. Depending on the mathematical structure of the underlying system, the constrained finite-time optimal control problem can be solved optimally, or close-to-optimal solutions can be computed. Constraint satisfaction, recursive feasibility and asymptotic stability can be guaranteed a priori by an appropriate selection of the terminal state constraints and terminal cost. The paper at hand gathers previous developments and provides new material such as a proof for the optimality of the solution, or, in the case of close-to-optimal solutions, a procedure to determine a bound on the suboptimality of the solution.

Posted Content
TL;DR: Memory-Bounded Dynamic Programming is generalized and its scalability is improved by reducing the complexity with respect to the number of observations from exponential to polynomial, and error bounds on solution quality are derived.
Abstract: Memory-Bounded Dynamic Programming (MBDP) has proved extremely effective in solving decentralized POMDPs with large horizons. We generalize the algorithm and improve its scalability by reducing the complexity with respect to the number of observations from exponential to polynomial. We derive error bounds on solution quality with respect to this new approximation and analyze the convergence behavior. To evaluate the effectiveness of the improvements, we introduce a new, larger benchmark problem. Experimental results show that despite the high complexity of decentralized POMDPs, scalable solution techniques such as MBDP perform surprisingly well.

Journal ArticleDOI
TL;DR: This paper describes two control strategies for a fuel-cell-based hybrid electric vehicle (FCHEV): an offline strategy based on dynamic programming and an online strategy based on an optimized fuzzy logic controller.
Abstract: This paper describes two different control strategies for a fuel-cell-based hybrid electric vehicle (FCHEV). The offline strategy is based on dynamic programming, and the online strategy is based on an optimized fuzzy logic controller. These two strategies are then compared. Finally, the fuzzy logic controller is validated using a real FCHEV.

Posted Content
TL;DR: In this article, the authors consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline, and extend the Independent Cascade (IC) model and the Linear Threshold (LT) model to incorporate the time delay aspect of influence diffusion among individuals in social networks.
Abstract: Influence maximization is a problem of finding a small set of highly influential users, also known as seeds, in a social network such that the spread of influence under certain propagation models is maximized. In this paper, we consider time-critical influence maximization, in which one wants to maximize influence spread within a given deadline. Since timing is considered in the optimization, we also extend the Independent Cascade (IC) model and the Linear Threshold (LT) model to incorporate the time delay aspect of influence diffusion among individuals in social networks. We show that time-critical influence maximization under the time-delayed IC and LT models maintains desired properties such as submodularity, which allows a greedy approximation algorithm to achieve an approximation ratio of $1-1/e$. To overcome the inefficiency of the greedy algorithm, we design two heuristic algorithms: the first one is based on a dynamic programming procedure that computes exact influence in tree structures and directed acyclic subgraphs, while the second one converts the problem to one in the original models and then applies existing fast heuristic algorithms to it. Our simulation results demonstrate that our algorithms achieve the same level of influence spread as the greedy algorithm while running a few orders of magnitude faster, and they also outperform existing fast heuristics that disregard the deadline constraint and delays in diffusion.

Journal ArticleDOI
TL;DR: A delay-aware distributed solution with the BS-DTX control at the BS controller (BSC) and the user scheduling at each cluster manager (CM) using approximate dynamic programming and distributed stochastic learning is obtained and the proposed distributed two-timescale algorithm converges almost surely.
Abstract: In this paper, we propose a two-timescale delay-optimal base station discontinuous transmission (BS-DTX) control and user scheduling for downlink coordinated MIMO systems with energy harvesting capability. To reduce the complexity and signaling overhead in practical systems, the BS-DTX control is adaptive to both the energy state information (ESI) and the data queue state information (QSI) over a longer timescale. The user scheduling is adaptive to the ESI, the QSI and the channel state information (CSI) over a shorter timescale. We show that the two-timescale delay-optimal control problem can be modeled as an infinite horizon average cost partially observed Markov decision problem (POMDP), which is well known to be a difficult problem in general. By using sample-path analysis and exploiting specific problem structure, we first obtain some structural results on the optimal control policy and derive an equivalent Bellman equation with reduced state space. To reduce the complexity and facilitate distributed implementation, we obtain a delay-aware distributed solution with the BS-DTX control at the BS controller (BSC) and the user scheduling at each cluster manager (CM) using approximate dynamic programming and distributed stochastic learning. We show that the proposed distributed two-timescale algorithm converges almost surely. Furthermore, using queueing theory, stochastic geometry, and optimization techniques, we derive sufficient conditions for the data queues to be stable in the coordinated MIMO network and discuss various design insights.

Posted Content
TL;DR: At each step of the dynamic programming, the state space is dynamically partitioned into regions where the value function is the same throughout the region, using piecewise constant and piecewise linear representations.
Abstract: We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming, the state space is dynamically partitioned into regions where the value function is the same throughout the region. We first describe the algorithm for piecewise constant representations. We then extend it to piecewise linear representations, using techniques from POMDPs to represent and reason about linear surfaces efficiently. We show that for complex, structured problems, our approach exploits the natural structure so that optimal solutions can be computed efficiently.
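The compression idea behind the piecewise constant representation can be sketched as follows: after a backup, adjacent states whose values agree are merged into a single region. This is a tabular simplification of the paper's symbolic partitioning, and the grid and values below are hypothetical:

```python
def compress(values, xs, eps=1e-9):
    """Merge adjacent grid states with (numerically) equal values into
    regions, yielding a piecewise constant value-function representation.
    Returns (region_start, region_end, value) triples."""
    regions = []
    start = 0
    for i in range(1, len(values) + 1):
        if i == len(values) or abs(values[i] - values[start]) > eps:
            regions.append((xs[start], xs[i - 1], values[start]))
            start = i
    return regions

# A value function over five grid points collapses to two regions.
print(compress([0.0, 0.0, 0.0, 1.0, 1.0], [0.0, 0.2, 0.4, 0.6, 0.8]))
# → [(0.0, 0.4, 0.0), (0.6, 0.8, 1.0)]
```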

Journal ArticleDOI
TL;DR: A new ϵ-optimal control algorithm based on the iterative ADP approach is proposed that makes the performance index function iteratively converge to the greatest lower bound of all performance indices within an error ϵ in finite time.

Journal ArticleDOI
TL;DR: A dynamic programming algorithm is presented to compute optimal intermodal freight routes for international container logistics, together with a Weighted Constrained Shortest Path Problem (WCSPP) model.
Abstract: This paper presents a dynamic programming algorithm to compute optimal intermodal freight routes for the international logistics of container cargo for export and import. This study looks into the characteristics of intermodal transport using multiple modes, and presents a Weighted Constrained Shortest Path Problem (WCSPP) model. This study derives Pareto-optimal solutions that can simultaneously meet two objective functions by applying the label-setting algorithm, a type of dynamic programming algorithm, after setting the feasible area. To improve the algorithm performance, pruning rules have also been presented. The algorithm is applied to real transport paths from Busan to Rotterdam, as well as to large-scale cases. This study quantitatively measures the savings in both transport cost and time by comparing single transport modes with intermodal transport paths. Finally, a mathematical model and an MADM model are applied to the multiple Pareto-optimal solutions to evaluate them.
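A label-setting algorithm with dominance pruning can be sketched for a bicriteria (cost, time) shortest path. The network below is illustrative, not the paper's Busan-Rotterdam instance, and the pruning rule shown (pairwise dominance) is only the simplest of those the paper presents:

```python
import heapq

def label_setting(n, edges, src, dst, time_limit=float("inf")):
    """Label-setting DP for a (cost, time) constrained shortest path:
    labels dominated in both criteria are pruned, and the surviving
    labels at the destination form the Pareto frontier."""
    adj = [[] for _ in range(n)]
    for u, v, cost, time in edges:
        adj[u].append((v, cost, time))
    labels = [[] for _ in range(n)]   # non-dominated (cost, time) per node
    heap = [(0, 0, src)]
    while heap:
        cost, time, u = heapq.heappop(heap)
        # Pruning rule: discard labels dominated by one already kept at u.
        if any(c <= cost and t <= time for c, t in labels[u]):
            continue
        labels[u].append((cost, time))
        for v, dc, dt in adj[u]:
            nc, nt = cost + dc, time + dt
            if nt <= time_limit and not any(c <= nc and t <= nt
                                            for c, t in labels[v]):
                heapq.heappush(heap, (nc, nt, v))
    return sorted(labels[dst])

net = [(0, 1, 1, 10), (0, 1, 5, 2), (1, 2, 1, 1)]
print(label_setting(3, net, 0, 2))                # → [(2, 11), (6, 3)]
print(label_setting(3, net, 0, 2, time_limit=5))  # → [(6, 3)]
```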

Journal ArticleDOI
TL;DR: The authors model a water resources system under the assumption of diminishing marginal utility (i.e., concavity) of reservoir utility functions, an important characteristic of water resources systems.
Abstract: Diminishing marginal utility is an important characteristic of water resources systems. With the assumption of diminishing marginal utility (i.e., concavity) of reservoir utility functions,...