Showing papers by "Robert Babuska" published in 2011

Journal Article•DOI•

Cross-Entropy Optimization of Control Policies With Adaptive Basis Functions

Lucian Busoniu¹, Damien Ernst, Bart De Schutter¹, Robert Babuska¹•Institutions (1)

01 Feb 2011

TL;DR: An algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes, which requires vastly fewer BFs than value-function techniques with equidistant BFs, and outperforms policy search with a competing optimization algorithm called DIRECT.

Abstract: This paper introduces an algorithm for direct search of control policies in continuous-state discrete-action Markov decision processes. The algorithm looks for the best closed-loop policy that can be represented using a given number of basis functions (BFs), where a discrete action is assigned to each BF. The type of the BFs and their number are specified in advance and determine the complexity of the representation. Considerable flexibility is achieved by optimizing the locations and shapes of the BFs, together with the action assignments. The optimization is carried out with the cross-entropy method and evaluates the policies by their empirical return from a representative set of initial states. The return for each representative state is estimated using Monte Carlo simulations. The resulting algorithm for cross-entropy policy search with adaptive BFs is extensively evaluated in problems with two to six state variables, for which it reliably obtains good policies with only a small number of BFs. In these experiments, cross-entropy policy search requires vastly fewer BFs than value-function techniques with equidistant BFs, and outperforms policy search with a competing optimization algorithm called DIRECT.
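
The optimization loop described in this abstract can be illustrated with a minimal cross-entropy sketch. It is not the paper's implementation: the Gaussian sampling distribution, the function names, and the toy score function are assumptions made for illustration, and the basis-function policy parametrization with discrete action assignments is replaced by a plain real-valued parameter vector. In the paper's setting, the score would be the empirical return of a policy estimated by Monte Carlo simulation from a representative set of initial states.

```python
import numpy as np

def cross_entropy_search(score, dim, n_samples=100, elite_frac=0.1,
                         n_iters=50, seed=0):
    """Maximize `score` over R^dim with a Gaussian cross-entropy method."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)          # assumed initial sampling mean
    std = np.full(dim, 5.0)       # assumed (wide) initial sampling spread
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iters):
        # Draw candidate parameter vectors from the current distribution.
        samples = rng.normal(mean, std, size=(n_samples, dim))
        scores = np.array([score(s) for s in samples])
        # Keep the best-scoring samples and refit the distribution to them.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6   # small floor avoids a zero spread
    return mean

def toy_return(params):
    """Hypothetical stand-in for a policy's empirical return."""
    target = np.array([1.0, -2.0])
    return -np.sum((params - target) ** 2)

best = cross_entropy_search(toy_return, dim=2)
print(best)
```

Refitting the sampling distribution to the elite samples is what concentrates the search on promising parameter regions without requiring gradients of the score.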

81 citations

Proceedings Article•DOI•

Observers for linear distributed-parameter systems: A survey

Z. Hidayat¹, Robert Babuska¹, B. De Schutter¹, Alfredo Núñez¹•Institutions (1)

Delft University of Technology¹

24 Oct 2011

TL;DR: This paper surveys observers for first-order and second-order linear distributed-parameter systems based on their infinite-dimensional and finite-dimensional descriptions.

Abstract: This paper reviews different observer design methods for linear dynamic distributed-parameter systems. In such systems, the states, inputs, and outputs depend on some spatial variable. This dependence, along with additional aspects such as the boundary conditions, increases the complexity of the state estimation problem and of the design methods. The paper in particular surveys observers for first-order and second-order linear distributed-parameter systems based on their infinite-dimensional and finite-dimensional descriptions.

74 citations

Proceedings Article•DOI•

Approximate reinforcement learning: An overview

Lucian Busoniu¹, Damien Ernst², Bart De Schutter¹, Robert Babuska¹•Institutions (2)

Delft University of Technology¹, University of Liège²

11 Apr 2011

TL;DR: An overview of methods for approximate RL, starting from their dynamic programming roots and organized into three major classes: approximate value iteration, policy iteration, and policy search. The overview also compares the different categories of methods and outlines possible ways to enhance the reviewed algorithms.

Abstract: Reinforcement learning (RL) allows agents to learn how to optimally interact with complex environments. Fueled by recent advances in approximation-based algorithms, RL has obtained impressive successes in robotics, artificial intelligence, control, operations research, etc. However, the scarcity of survey papers about approximate RL makes it difficult for newcomers to grasp this intricate field. With the present overview, we take a step toward alleviating this situation. We review methods for approximate RL, starting from their dynamic programming roots and organizing them into three major classes: approximate value iteration, policy iteration, and policy search. Each class is subdivided into representative categories, highlighting among others offline and online algorithms, policy gradient methods, and simulation-based techniques. We also compare the different categories of methods, and outline possible ways to enhance the reviewed algorithms.

59 citations

Journal Article•DOI•

Performance improvement of a drop-on-demand inkjet printhead using an optimization-based feedforward control method

Amol A. Khalate¹, Xavier Bombois¹, Robert Babuska¹, Herman Wijshoff, R. Waarsing•Institutions (1)

Delft University of Technology¹

01 Aug 2011-Control Engineering Practice

TL;DR: In this article, the authors proposed an optimization-based method to design the input actuation waveform for the piezo actuator in order to improve the damping of the residual oscillations.

58 citations

Proceedings Article•DOI•

On distributed maximization of algebraic connectivity in robotic networks

Andrea Simonetto¹, Tamas Keviczky¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

18 Aug 2011

TL;DR: In this article, the authors consider the problem of maximizing the algebraic connectivity of the communication graph in a network of mobile robots by moving them into appropriate positions and formulate an approximate problem as a Semi-Definite Program (SDP).

Abstract: We consider the problem of maximizing the algebraic connectivity of the communication graph in a network of mobile robots by moving them into appropriate positions. We describe the Laplacian of the graph as dependent on the pairwise distance between the robots and formulate an approximate problem as a Semi-Definite Program (SDP). We propose a consistent, non-iterative distributed solution by solving local SDPs which use information only from nearby neighboring robots. Numerical simulations show the performance of the algorithm with respect to the centralized solution.
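
As a rough illustration of the quantity being maximized, the sketch below builds a distance-dependent graph Laplacian from robot positions and returns its second-smallest eigenvalue (the algebraic connectivity). The Gaussian weight function and the `radius` cutoff are assumptions chosen for illustration, not the weighting used in the paper, and the SDP formulation itself is not reproduced here.

```python
import numpy as np

def algebraic_connectivity(positions, radius=1.5):
    """Second-smallest Laplacian eigenvalue of a distance-weighted graph."""
    positions = np.asarray(positions, dtype=float)
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Assumed weight: Gaussian decay with distance, zero beyond `radius`.
    weights = np.where(dist <= radius, np.exp(-dist ** 2), 0.0)
    np.fill_diagonal(weights, 0.0)
    # Graph Laplacian: degree matrix minus weighted adjacency matrix.
    laplacian = np.diag(weights.sum(axis=1)) - weights
    eigenvalues = np.linalg.eigvalsh(laplacian)  # sorted in ascending order
    return eigenvalues[1]

# Three robots on a line, each within range of its neighbor: connected graph.
line = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
# Third robot moved out of range: the graph splits into two components.
split = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]

print(algebraic_connectivity(line))
print(algebraic_connectivity(split))
```

The value is strictly positive only when the graph is connected, which is why moving robots to increase it also preserves connectivity of the communication network.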

25 citations

Proceedings Article•DOI•

Optimistic planning for sparsely stochastic systems

Lucian Busoniu¹, Rémi Munos², Bart De Schutter¹, Robert Babuska¹•Institutions (2)

Delft University of Technology¹, French Institute for Research in Computer Science and Automation²

11 Apr 2011

TL;DR: An online planning algorithm is proposed for finite-action, sparsely stochastic Markov decision processes, in which the random state transitions can only end up in a small number of possible next states. The algorithm obtains promising numerical results, including the successful online control of a simulated HIV infection with stochastic drug effectiveness.

Abstract: We propose an online planning algorithm for finite-action, sparsely stochastic Markov decision processes, in which the random state transitions can only end up in a small number of possible next states. The algorithm builds a planning tree by iteratively expanding states, where each expansion exploits sparsity to add all possible successor states. Each state to expand is actively chosen to improve the knowledge about action quality, and this allows the algorithm to return a good action after a strictly limited number of expansions. More specifically, the active selection method is optimistic in that it chooses the most promising states first, so the novel algorithm is called optimistic planning for sparsely stochastic systems. We note that the new algorithm can also be seen as model-predictive (receding-horizon) control. The algorithm obtains promising numerical results, including the successful online control of a simulated HIV infection with stochastic drug effectiveness.

23 citations

Proceedings Article•DOI•

Decentralized Kalman filter comparison for distributed-parameter systems: A case study for a 1D heat conduction process

Z. Hidayat¹, Robert Babuska¹, B. De Schutter¹, Alfredo Núñez¹•Institutions (1)

Delft University of Technology¹

01 Sep 2011

TL;DR: Four methods for decentralized Kalman filtering are compared for distributed-parameter systems, which, after spatial and temporal discretization, result in large-scale linear discrete-time systems.

Abstract: In this paper we compare four methods for decentralized Kalman filtering for distributed-parameter systems, which, after spatial and temporal discretization, result in large-scale linear discrete-time systems. These methods are: parallel information filter, distributed information filter, distributed Kalman filter with consensus filter, and distributed Kalman filter with weighted averaging. These filters are suitable for sensor networks, where the sensor nodes perform not only sensing and computations, but also communicate estimates among each other. We consider an application of sensor networks to a heat conduction process. The performance of the decentralized filters is evaluated and compared to the centralized Kalman filter.
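
The sketch below illustrates one standard fact behind information-form filters: when sensor noises are mutually independent, the measurement update in information form is a sum of per-sensor contributions and gives the same result as a centralized Kalman update on the stacked measurements. It is a generic textbook construction under that independence assumption, not an implementation of any of the four filters compared in the paper; all function names and numbers are illustrative.

```python
import numpy as np

def information_update(x_prior, P_prior, sensors):
    """Fuse independent sensors by adding their information contributions."""
    Y = np.linalg.inv(P_prior)   # information matrix
    y = Y @ x_prior              # information vector
    for H, R, z in sensors:
        R_inv = np.linalg.inv(R)
        Y = Y + H.T @ R_inv @ H  # each sensor adds its own term
        y = y + H.T @ R_inv @ z
    P_post = np.linalg.inv(Y)
    return P_post @ y, P_post

def centralized_update(x_prior, P_prior, H, R, z):
    """Standard Kalman measurement update on the stacked measurements."""
    S = H @ P_prior @ H.T + R
    K = P_prior @ H.T @ np.linalg.inv(S)
    x_post = x_prior + K @ (z - H @ x_prior)
    P_post = (np.eye(len(x_prior)) - K @ H) @ P_prior
    return x_post, P_post

# Hypothetical two-state example with two scalar sensors.
x0 = np.zeros(2)
P0 = np.eye(2)
sensors = [
    (np.array([[1.0, 0.0]]), np.array([[0.5]]), np.array([1.0])),
    (np.array([[0.0, 1.0]]), np.array([[2.0]]), np.array([-1.0])),
]
x_info, P_info = information_update(x0, P0, sensors)

# Stack the same sensors for the centralized reference update.
H_all = np.vstack([H for H, _, _ in sensors])
R_all = np.diag([0.5, 2.0])
z_all = np.concatenate([z for _, _, z in sensors])
x_cent, P_cent = centralized_update(x0, P0, H_all, R_all, z_all)

print(np.allclose(x_info, x_cent), np.allclose(P_info, P_cent))
```

The additive structure is what makes the information form attractive for sensor networks: each node only needs to compute and communicate its own contribution.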

21 citations

Journal Article•DOI•

Sequential stability analysis and observer design for distributed TS fuzzy systems

Zs. Lendek¹, Robert Babuska², B. De Schutter²•Institutions (2)

Technical University of Cluj-Napoca¹, Delft University of Technology²

01 Jul 2011-Fuzzy Sets and Systems

TL;DR: This paper proposes sequential stability analysis and observer design for distributed systems where the subsystems are represented by Takagi-Sugeno (TS) fuzzy models, allowing for the online addition of new subsystems.

16 citations

Proceedings Article•DOI•

A new ant colony routing approach with a trade-off between system and user optimum

Zhe Cong¹, Bart De Schutter¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

18 Nov 2011

TL;DR: This paper considers the DTR problem for a traffic network defined as a directed graph, and deals with the mathematical aspects of the resulting optimization problem from the viewpoint of network flow theory.

Abstract: Dynamic traffic routing (DTR) refers to the process of (re)directing traffic at junctions in a traffic network corresponding to the evolving traffic conditions as time progresses. This paper considers the DTR problem for a traffic network defined as a directed graph, and deals with the mathematical aspects of the resulting optimization problem from the viewpoint of network flow theory. Traffic networks may have thousands of links and nodes, resulting in a sizable and computationally complex nonlinear, non-convex DTR optimization problem. To solve this problem Ant Colony Optimization (ACO) is chosen as the optimization method in this paper because of its powerful optimization heuristic for combinatorial optimization problems. However, the standard ACO algorithm is not capable of solving the routing optimization problem aimed at the system optimum, and therefore a new ACO algorithm is developed to achieve the goal of finding the optimal distribution of traffic flows in the network.

16 citations

Proceedings Article•DOI•

Adaptive fuzzy and sliding-mode control of a robot manipulator with varying payload

Selami Beyhan¹, Zsofia Lendek², Robert Babuska², Martijn Wisse², Musa Alci¹•Institutions (2)

Ege University¹, Delft University of Technology²

01 Dec 2011

TL;DR: This paper compares indirect adaptive fuzzy control and sliding-mode control in a robot manipulator application that performs pick-and-place tasks with unknown and variable payloads. It finds that the sliding-mode controller obtains very good steady performance, while the adaptive fuzzy controller eventually yields a smaller steady-state error.

Abstract: In this paper, we compare indirect adaptive fuzzy control and sliding-mode control in a robot manipulator application. The manipulator performs pick-and-place tasks with unknown and variable payloads. The change of payload causes large variations in the dynamics of the robot. The sliding-mode controller deals with the payload change through its inherent robustness, while the adaptive fuzzy control algorithm adjusts the controller's parameters on-line. The control methods are compared both in numerical simulations and in real-time experiments. The sliding-mode controller obtains very good steady performance. However, thanks to the continuing adaptation, the adaptive fuzzy controller eventually yields a smaller steady-state error.

13 citations

Journal Article•DOI•

Actor-Critic Control with Reference Model Learning

I. Grondman¹, Maarten Vaandrager, Lucian Busoniu¹, Robert Babuska¹, Erik Schuitema¹•Institutions (1)

Delft University of Technology¹

01 Jan 2011-IFAC Proceedings Volumes

TL;DR: The novel method and a standard actor-critic algorithm are applied to the pendulum swingup problem, in which the novel method achieves faster learning than the standard algorithm.

Proceedings Article•DOI•

Drop-on-demand inkjet printhead performance improvement using robust feedforward control

Amol A. Khalate¹, Benoit Bayon², Xavier Bombois¹, Gérard Scorletti², Robert Babuska¹•Institutions (2)

Delft University of Technology¹, École centrale de Lyon²

12 Dec 2011

TL;DR: This paper proposes a robust optimization-based method to design the input actuation waveform for the piezo actuator in order to improve the damping of the residual oscillations in the presence of parametric uncertainties in the ink-channel model.

Abstract: The printing quality delivered by a Drop-on-Demand (DoD) inkjet printhead is mainly limited by the residual oscillations in the ink channel. The maximal jetting frequency of a DoD inkjet printhead can be increased by quickly damping the residual oscillations, thereby bringing the ink channel to rest after jetting the ink drop. The inkjet channel model obtained is generally subject to parametric uncertainty. This paper proposes a robust optimization-based method to design the input actuation waveform for the piezo actuator in order to improve the damping of the residual oscillations in the presence of parametric uncertainties in the ink-channel model. Simulation results are presented to show the efficacy of the proposed method.

Journal Article•DOI•

Robust Feedforward control for a Drop-on-Demand Inkjet Printhead

Amol A. Khalate¹, Xavier Bombois¹, Gérard Scorletti², Robert Babuska¹, R. Waarsing, Wim de Zeeuw•Institutions (2)

Delft University of Technology¹, École centrale de Lyon²

01 Jan 2011-IFAC Proceedings Volumes

TL;DR: In this article, a robust optimization-based method is proposed to design the input actuation waveform for the piezo actuator in order to improve the damping of the residual oscillations in the presence of parametric uncertainties in the ink-channel model.

Proceedings Article•DOI•

Saturated particle filter

Pawel M. Stano¹, Zsofia Lendek¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

18 Aug 2011

TL;DR: This paper proposes the Saturated Particle Filter algorithm which incorporates the measurements into the importance sampling procedure through the detection function, and achieves better performance than the standard Constrained SIR filter, while it preserves low computational complexity.

Abstract: In many practical applications the state variables are defined on a compact set of the state space. For estimating such variables, constrained particle filters have been successfully applied to nonlinear systems. For a saturated system, the measurement information can be used during the sampling procedure to obtain particles that approximate the true state of the system. This can be achieved by using a detection function, which detects the saturation as it occurs. In this paper we propose the Saturated Particle Filter algorithm, which incorporates the measurements into the importance sampling procedure through the detection function. The new filter is applied to a Lindley-type stochastic process that depends on an exogenous parameter. This parameter changes during the simulation. Furthermore, the system is corrupted with high measurement noise. The simulations show that our new filter achieves better performance than the standard Constrained SIR filter, while it preserves low computational complexity.

Proceedings Article•DOI•

Optimal gait switching for legged locomotion

Bart Kersbergen¹, Gabriel A. D. Lopes¹, T.J.J. van den Boom¹, B. De Schutter¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

05 Dec 2011

TL;DR: This paper presents an intrinsically safe gait switching generator that minimizes the velocity variance of all the legs in stance, allowing for smooth acceleration in legged robots.

Abstract: Switching gaits in many-legged robots can present challenges due to the combinatorial nature of the gait space. In this paper we present an intrinsically safe gait switching generator that minimizes the velocity variance of all the legs in stance, allowing for smooth acceleration in legged robots. The gait switching generator is modeled as a max-plus linear discrete event system which is translated to continuous time via a reference trajectory generator.

Proceedings Article•DOI•

On the eigenstructure of a class of max-plus linear systems

Gabriel A. D. Lopes¹, Bart Kersbergen¹, T.J.J. van den Boom¹, B. De Schutter¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

01 Dec 2011

TL;DR: For a class of concurrent two-state cyclic systems, with direct application to legged locomotion, closed-form expressions for the max-plus eigenvalue and eigenvector of the system matrix are presented.

Abstract: Various applications in scheduling, such as train timetables and multi-legged locomotion, can be modeled using systems of max-plus linear equations. In this framework, the eigenvalue of the system matrix represents the total cycle time, whereas the eigenvector dictates the steady-state behavior. For a class of concurrent two-state cyclic systems, with direct application to legged locomotion, we present closed-form expressions for the max-plus eigenvalue and eigenvector of the system matrix. Additionally, we probe into the transient properties of this class of max-plus linear systems by computing the coupling time.
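
For readers unfamiliar with max-plus algebra, the sketch below computes the max-plus eigenvalue of a small matrix numerically as its maximum cycle mean, and checks a candidate eigenvector. This is a generic brute-force computation that assumes an irreducible matrix; it is not the closed-form expression derived in the paper, and the example matrix and function names are hypothetical.

```python
import numpy as np

def maxplus_matmul(a, b):
    """Max-plus product: (a ⊗ b)[i, j] = max_k (a[i, k] + b[k, j])."""
    return np.max(a[:, :, None] + b[None, :, :], axis=1)

def maxplus_eigenvalue(a):
    """Maximum cycle mean of `a`, i.e. its max-plus eigenvalue
    when the matrix is irreducible."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    power = a.copy()
    best = -np.inf
    for k in range(1, n + 1):
        # diag(a^{⊗k}) holds the heaviest closed walks of length k.
        best = max(best, np.max(np.diag(power)) / k)
        power = maxplus_matmul(power, a)
    return best

# Hypothetical 2x2 system matrix (entries play the role of durations).
A = np.array([[2.0, 5.0],
              [3.0, 3.0]])
lam = maxplus_eigenvalue(A)

# Check the eigen-equation A ⊗ v = lam + v for a candidate eigenvector.
v = np.array([1.0, 0.0])
lhs = np.max(A + v[None, :], axis=1)
print(lam, np.allclose(lhs, lam + v))
```

In the scheduling interpretation given in the abstract, `lam` corresponds to the cycle time and `v` to the steady-state offsets between events.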

Journal Article•DOI•

Stability analysis and observer design for string-connected TS systems

Zs. Lendek¹,², Robert Babuska¹, B. De Schutter¹•Institutions (2)

Delft University of Technology¹, Technical University of Cluj-Napoca²

01 Jan 2011-IFAC Proceedings Volumes

TL;DR: Conditions for the distributed stability analysis of Takagi-Sugeno fuzzy systems connected in a string are proposed, extended to observer design, and illustrated on a simulation example.

Journal Article•DOI•

Erratum to “Adaptive observers for TS fuzzy systems with unknown polynomial inputs” [Fuzzy Sets and Systems 161 (2010) 2043–2065]

Zs. Lendek¹, Jimmy Lauber², Thierry Marie Guerra², Robert Babuska¹, B. De Schutter¹•Institutions (2)

Delft University of Technology¹, University of Valenciennes and Hainaut-Cambresis²

01 May 2011-Fuzzy Sets and Systems

TL;DR: The description of the error dynamics in the original paper contains an omission that causes some bounds used in the conditions of Theorem 8 and Corollary 2 to be incorrectly defined.

Journal Article•DOI•

Efficient Knowledge Transfer in Shaping Reinforcement Learning

Sholeh Norouzzadeh¹, Lucian Busoniu¹, Robert Babuska¹•Institutions (1)

Delft University of Technology¹

01 Jan 2011-IFAC Proceedings Volumes

TL;DR: This paper considers the essential decision of when to transfer learning from an easier task to a more difficult one so that the total learning time is reduced, and proposes two transfer criteria based on the agent's performance.

Journal Article•DOI•

Convergence Analysis of Ant Colony Learning

Jelmer van Ast¹, Robert Babuska¹, Bart De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jan 2011-IFAC Proceedings Volumes

TL;DR: Upper and lower bounds for the pheromone levels are derived and related to the learning parameters and the number of ants used in the algorithm. Bounds on the expected value of the pheromone levels are also derived.
