Showing papers by "Swaroop Darbha" published in 2011


Journal ArticleDOI
TL;DR: A reward‐based aggregation method to construct suboptimal policies for a perimeter surveillance control problem which gives rise to a large scale Markov chain by circumventing the need for value iteration over the entire state space.
Abstract: One encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for large-scale controlled Markov chains. In this paper, we provide a reward-based aggregation method to construct suboptimal policies for a perimeter surveillance control problem which gives rise to a large scale Markov chain. The novelty of this approach lies in circumventing the need for value iteration over the entire state space. Instead, the state space is partitioned and the value function is approximated by a constant over each partition. We associate a meta-state with each partition, where the transition probabilities between these meta-states are known. The state aggregation approach results in a significant reduction in the computational burden and lends itself to value iteration over the aggregated state-space. We provide bounds to assess the quality of the approximation and give numerical results that support the proposed methodology. Published 2011. This article is a US Government work and is in the public domain in the USA.
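The aggregation idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's reward-based construction or its error bounds: the uniform averaging inside `aggregate`, the discount factor, and the toy four-state chain are all assumptions made here for illustration. The sketch partitions the states, builds a meta-state model, runs value iteration on the meta-states only, and lifts the result back as a piecewise-constant value function.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Discounted value iteration. P[a][s][t]: transition prob, R[a][s]: reward."""
    n_actions, n = len(P), len(R[0])
    V = [0.0] * n
    for _ in range(max_iter):
        Q = [[R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
              for a in range(n_actions)] for s in range(n)]
        V_new = [max(q) for q in Q]
        done = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    policy = [max(range(n_actions), key=lambda a, s=s: Q[s][a]) for s in range(n)]
    return V, policy


def aggregate(P, R, part, k):
    """Build a k-meta-state model; part[s] is the block index of state s.

    Assumption: states inside a block are weighted uniformly.
    """
    blocks = [[s for s, b in enumerate(part) if b == m] for m in range(k)]
    P_agg, R_agg = [], []
    for a in range(len(P)):
        P_agg.append([[sum(P[a][s][t] for s in blocks[m] for t in blocks[q])
                       / len(blocks[m]) for q in range(k)] for m in range(k)])
        R_agg.append([sum(R[a][s] for s in blocks[m]) / len(blocks[m])
                      for m in range(k)])
    return P_agg, R_agg


# Hypothetical toy chain: 4 states, 2 actions, blocks {0, 1} and {2, 3}.
P = [
    [[0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5], [0, 0, 1, 0], [0, 0, 0, 1]],  # action 0
    [[1, 0, 0, 0], [0, 1, 0, 0], [0.5, 0, 0.5, 0], [0, 0.5, 0, 0.5]],  # action 1
]
R = [[0, 0, 1, 1], [0, 0, 1, 1]]
part = [0, 0, 1, 1]

P_agg, R_agg = aggregate(P, R, part, 2)
V_agg, pi_agg = value_iteration(P_agg, R_agg)   # iterate over 2 meta-states only
V_lifted = [V_agg[b] for b in part]             # constant over each partition
policy_lifted = [pi_agg[b] for b in part]       # suboptimal policy in general
```

In this toy chain the rewards and block-to-block transition probabilities are identical within each block, so the lifted value function happens to coincide with the full value function. In general the lifted policy is only suboptimal, which is why the paper supplies bounds on the approximation quality.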

25 citations


Proceedings ArticleDOI
01 Dec 2011
TL;DR: A method to restrict the stabilizing set for LTI systems further by using Widder's theorem and the Markov-Lukács representation for polynomials that are non-negative on the positive real axis, together with a method to arbitrarily tighten the resulting set of controllers.
Abstract: This paper presents a new method for approximating the set of PID controllers satisfying a class of transient specifications. The problem of designing a controller to satisfy transient specifications, such as the maximum allowable overshoot to a given input or the response being required to be within an envelope, can be cast as a problem of guaranteeing the impulse response of an appropriate closed loop error transfer function to be non-negative. Stabilizing PID controllers for Linear Time Invariant (LTI) systems can be synthesized as a union of convex polygons in k_i-k_d space for values of k_p lying in a specific range. In this paper, we provide a method to restrict the stabilizing set for LTI systems further by using Widder's theorem and the Markov-Lukács representation for polynomials that are non-negative on the positive real axis. Widder's theorem provides necessary and sufficient conditions for the error response to be non-negative and, upon an application of Widder's theorem, we obtain a sequence of polynomials, whose coefficients are polynomial functions of k_p, k_i and k_d, that are required to be non-negative. For every polynomial in the sequence and for a specified k_p, using the Markov-Lukács theorem and Minkowski's projection, we obtain a polynomial inequality in k_i and k_d that must be satisfied by every controller satisfying the desired transient specification. We also provide a method to arbitrarily tighten this set of desired controllers.
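For orientation, the two results named in the abstract are commonly stated along the following lines. This is a sketch of the standard textbook forms, not necessarily the exact statements used in the paper; the symbols E(s), e(t), p, \sigma_0 and \sigma_1 are notation introduced here for illustration, and convergence of the Laplace integral for real s > 0 is assumed.

```latex
% Bernstein-Widder form: non-negativity of a signal e(t) in terms of its
% Laplace transform E(s) along the positive real axis.
E(s) = \int_0^\infty e^{-st}\, e(t)\, dt, \qquad
e(t) \ge 0 \ \ \forall t \ge 0
\iff
(-1)^k \frac{d^k E}{ds^k}(s) \ge 0 \ \ \forall k \ge 0,\ \forall s > 0 .

% Markov-Lukacs form: polynomials non-negative on the non-negative real axis.
p(x) \ge 0 \ \ \forall x \ge 0
\iff
p(x) = \sigma_0(x) + x\, \sigma_1(x),
\quad \sigma_0,\ \sigma_1 \ \text{sums of squares of real polynomials}.
```

When E(s) is rational, each derivative condition reduces to a sign condition on a numerator polynomial in s, which is where a sequence of polynomial conditions of the kind described in the abstract comes from.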

11 citations


Proceedings ArticleDOI
18 Aug 2011
TL;DR: The reduced order DP has been shown analytically to give the exact same solution that one would obtain via performing DP on the original full state space Markov chain.
Abstract: A reduced order Dynamic Programming (DP) method that efficiently computes the optimal policy and value function for a class of controlled Markov chains is developed. We assume that the Markov chains exhibit the property that a subset of the states have a single (default) control action associated with them. Furthermore, we assume that the transition probabilities between the remaining (decision) states can be derived from the original Markov chain specification. Under these assumptions, the suggested reduced order DP method yields significant savings in computation time and also leads to faster convergence to the optimal solution. Most importantly, the reduced order DP has been shown analytically to give the exact same solution that one would obtain via performing DP on the original full state space Markov chain. The method is illustrated via a multi UAV perimeter patrol stochastic optimal control problem.
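A minimal sketch of the reduction idea described above, under simplifying assumptions that are not taken from the paper: rewards are discounted, every default state has one action-independent transition row, and default states move directly to decision states. Under those assumptions the default-state values can be substituted out, leaving a Bellman recursion over the decision states alone. The function names and the toy chain are hypothetical.

```python
def full_vi(P, R, gamma, iters=500):
    """Plain value iteration on the full chain. P[a][s][t], R[a][s]."""
    n = len(R[0])
    V = [0.0] * n
    for _ in range(iters):
        V = [max(R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
                 for a in range(len(P))) for s in range(n)]
    return V


def reduce_model(P, R, decision, default, gamma):
    """Fold default states into the decision states.

    Returns (r_red, K) such that, on decision states only,
    V(i) = max_a r_red[a][i] + sum_j K[a][i][j] * V(j).
    Assumes default states jump straight to decision states and that their
    rows and rewards are the same for every action (row 0 is used).
    """
    r_red, K = [], []
    for a in range(len(P)):
        ra, Ka = [], []
        for i in decision:
            # Reward now plus discounted reward collected in a default state.
            ra.append(R[a][i] + gamma * sum(P[a][i][f] * R[0][f] for f in default))
            # One-step moves to decision states plus two-step moves via defaults.
            Ka.append([gamma * P[a][i][j]
                       + gamma ** 2 * sum(P[a][i][f] * P[0][f][j] for f in default)
                       for j in decision])
        r_red.append(ra)
        K.append(Ka)
    return r_red, K


def reduced_vi(r_red, K, iters=500):
    """Value iteration over decision states only."""
    m = len(r_red[0])
    V = [0.0] * m
    for _ in range(iters):
        V = [max(r_red[a][i] + sum(K[a][i][j] * V[j] for j in range(m))
                 for a in range(len(K))) for i in range(m)]
    return V


# Hypothetical chain: states 0, 1 are decision states; 2, 3 are default states.
gamma = 0.9
decision, default = [0, 1], [2, 3]
P = [
    [[0, 0, 1, 0], [0.5, 0, 0, 0.5], [0.3, 0.7, 0, 0], [1, 0, 0, 0]],  # action 0
    [[0, 1, 0, 0], [0, 1, 0, 0], [0.3, 0.7, 0, 0], [1, 0, 0, 0]],      # action 1
]
R = [[1, 0, 2, 0.5], [0, 1, 2, 0.5]]

r_red, K = reduce_model(P, R, decision, default, gamma)
V_red = reduced_vi(r_red, K)    # 2 unknowns instead of 4
V_full = full_vi(P, R, gamma)   # reference solution on the full chain
```

Comparing `V_red` with the decision-state entries of `V_full` illustrates the exactness claim in the abstract for this toy case: both recursions share the same fixed point, while the reduced one carries fewer unknowns per sweep.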

10 citations


Posted Content
TL;DR: A linear programming approach is proposed to construct sub-optimal policies for controlled Markov chains, along with a bound on the deviation of such a policy from the optimum.
Abstract: One often encounters the curse of dimensionality in the application of dynamic programming to determine optimal policies for controlled Markov chains. In this paper, we provide a method to construct sub-optimal policies along with a bound for the deviation of such a policy from the optimum via a linear programming approach. The state-space is partitioned and the optimal cost-to-go or value function is approximated by a constant over each partition. By minimizing a non-negative cost function defined on the partitions, one can construct an approximate value function which also happens to be an upper bound for the optimal value function of the original Markov Decision Process (MDP). As a key result, we show that this approximate value function is independent of the non-negative cost function (or state dependent weights as it is referred to in the literature) and moreover, this is the least upper bound that one can obtain once the partitions are specified. Furthermore, we show that the restricted system of linear inequalities also embeds a family of MDPs of lower dimension, one of which can be used to construct a lower bound on the optimal value function. The construction of the lower bound requires the solution to a combinatorial problem. We apply the linear programming approach to a perimeter surveillance stochastic optimal control problem and obtain numerical results that corroborate the efficacy of the proposed methodology.

9 citations


Journal ArticleDOI
TL;DR: An approximation algorithm and heuristics are developed to solve an important routing problem that arises in surveillance applications involving two heterogeneous vehicles; computational results show that the algorithms based on the second LP model, on average, provided better (closer to the optimum) solutions than those based on the first LP model.
Abstract: This article addresses an important routing problem that arises in surveillance applications involving two heterogeneous vehicles. As the addressed routing problem is NP-Hard, we develop an approximation algorithm and heuristics to solve the problem. Our approach involves solving the routing problem in two main steps: Partitioning and Sequencing. Partitioning involves finding a distinct set of targets to be visited by each vehicle. Sequencing provides the order in which each vehicle must visit the subset of targets assigned to it. The problem of partitioning is tackled by solving a linear program (LP) obtained by relaxing some of the constraints of an integer programming model for the problem. We consider two LP models for partitioning. The first LP model is obtained by mainly relaxing both the integrality and degree constraints, whereas the second model relaxes mainly the integrality constraints. Once the targets are partitioned, the sequencing problem can be solved either by Hoogeveen's algorithm or by the Lin–Kernighan heuristic to yield an approximately optimal solution. Computational results show that the algorithms based on the second LP model, on average, provided better (closer to the optimum) solutions than those based on the first LP model. We also observed that for both the LP models, the average quality of solutions given by the heuristics was within 4% of the optimum, whereas the average quality of solutions obtained from the approximation algorithms was within 8–20% of the optimum depending on the problem size. Copyright © 2011 John Wiley & Sons, Ltd.

9 citations


Proceedings ArticleDOI
01 Jun 2011
TL;DR: In this article, the authors present the first approximation algorithm for a routing problem that is frequently encountered in the motion planning of Unmanned Vehicles (UVs). The problem is a variant of a multiple depot-terminal Hamiltonian path problem in which a collection of m UVs equipped with different on-board sensors must collectively visit n targets.
Abstract: In this article, we present the first approximation algorithm for a routing problem that is frequently encountered in the motion planning of Unmanned Vehicles (UVs). The considered problem is a variant of a Multiple Depot-Terminal Hamiltonian Path Problem and is stated as follows: There is a collection of m UVs equipped with different sensors on-board and there are n targets to be visited by them collectively. There are restrictions on the targets of the following type: (1) A target may be visited by any UV, (2) a target must be visited only by a subset of UVs (with appropriate on-board sensor) and (3) a target may not be visited by a subset of UVs (as the set of on-board sensors on the UV may not be suitable for viewing the targets). The UVs are otherwise identical from the viewpoint of dynamic constraints on their motion and hence, the cost of traveling from a target A to a target B is the same for all vehicles. We will assume that triangle inequality is satisfied by the cost associated with travel, i.e., it is cheaper to travel from a target A to a target B directly than to go via an intermediate target C. The UVs may possibly start from different locations (referred to as depots) and are not required to return to the depot. While there are different objectives that can be considered for this problem, we consider the total cost of travel of all the UVs as an objective to be minimized. The problem considered in this article is a generalized version of single depot-terminal Hamiltonian Path Problem and is NP-hard.

7 citations


Proceedings ArticleDOI
01 Jan 2011
TL;DR: In this article, a new method of synthesizing digital PID controllers for discrete-time, linear time-invariant (LTI) systems satisfying a class of transient response specifications is presented.
Abstract: In this paper, we present a new method of synthesizing digital PID controllers for discrete-time, Linear Time Invariant (LTI) Systems satisfying a class of transient response specifications. The problem of synthesizing a controller to achieve desirable transient specifications, such as requiring the transient response to be within an allowable range of overshoot, can be carried out as a problem of guaranteeing the impulse response of an appropriate closed loop error transfer function to be non-negative. An earlier result by the authors provides necessary and sufficient conditions for the impulse response of a discrete-time transfer function to be non-negative in terms of the requirement of a sequence of polynomials to be sign-invariant on the interval [1, ∞). An application of this result to the error transfer function yields a sequence of polynomials which are required to be sign-invariant on [1, ∞) but whose coefficients are polynomial functions of the controller gains k1, k2 and k3. Copyright © 2011 by ASME

7 citations


Journal ArticleDOI
TL;DR: In this paper, a new method for computing stabilizing fixed structure/order controllers using Groebner bases and sign-definite decomposition is presented, which simplifies the construction of the set of stabilizing controllers.

5 citations


Journal ArticleDOI
TL;DR: In this paper, the authors present a mathematical model for an air brake system in the presence of leaks, with a view towards developing a diagnostic system for the air brake systems based on the models.
Abstract: Brake systems in trucks are crucial for ensuring the safety of vehicles and passengers on the roadways. Most trucks in the USA are equipped with S-cam drum brake systems and they are sensitive to maintenance. Brake deficiencies such as leaks and out-of-adjustment of the pushrod are a major cause of accidents involving trucks. Leaks in the air brake systems drastically affect braking performance by decreasing the maximum attainable braking pressure and also increasing the time required to attain the same, thereby resulting in longer stopping distances. Out-of-adjustment of the pushrod leads to loss of braking torque even if no leaks are present in the air brake system. In this paper, we present a mathematical model for an air brake system in the presence of leaks, with a view towards developing a diagnostic system for the air brake system based on the models. Additionally, we present a scheme that estimates the severity of leak in terms of the mass flow rate of air leaking from the air brake system to the ...

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of parameter estimation in an air brake system, where the clearance between the brake pads and the drum can vary due to a variety of factors, such as brake pad wear or brake fade.

1 citation