Showing papers by "Robert Babuska" published in 2008

Journal Article•DOI•

A Comprehensive Survey of Multiagent Reinforcement Learning

Lucian Busoniu¹, Robert Babuska¹, B. De Schutter•Institutions (1)

01 Mar 2008

TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.

Abstract: Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied. Finally, an outlook for the field is provided.

1,878 citations

Journal Article•DOI•

Distributed Controller Design Approach to Dynamic Speed Limit Control Against Shockwaves on Freeways

Andrey P. Popov¹, Andreas Hegyi², Robert Babuska², Herbert Werner¹•Institutions (2)

Hamburg University of Technology¹, Delft University of Technology²

01 Jan 2008-Transportation Research Record

TL;DR: In this article, a distributed speed limit control approach based on a distributed controller design technique was developed to eliminate short traffic jams that emerge at bottlenecks and travel in the upstream direction on the freeway.

Abstract: Dynamic speed limits can be used to eliminate shockwaves on freeways. Shockwaves are typically short traffic jams that emerge at bottlenecks and travel in the upstream direction on the freeway. These shockwaves lead to increased travel times and possibly to unsafe situations. A speed limit control approach to resolving shockwaves was developed based on a distributed controller design technique. The controller is distributed in the sense that each speed limit sign has its own controller. The controller parameters are optimized by numerical optimization, assuming that the controller structure and parameters are the same for each controller. The resulting performances are compared for several designs, differing in the controller order and the extent that the upstream and downstream traffic states are used as inputs for the controller. Other controllers known from the literature are based on switching schemes using local information only or are centralized model-based controllers with high computational loads...

41 citations

Journal Article•DOI•

Adaptive fuzzy control of a non-linear servo-drive: Theory and experimental results

Domenico Bellomo¹, David Naso², Robert Babuska¹•Institutions (2)

Delft University of Technology¹, Instituto Politécnico Nacional²

01 Sep 2008-Engineering Applications of Artificial Intelligence

TL;DR: Experimental results demonstrate that the proposed method improves the controller's tracking performance. Parametric and structural changes are introduced to the controlled plant to emphasize the advantages and limitations of the considered adaptive controllers.

40 citations

Journal Article•DOI•

Control of the fluidised bed in the pellet softening process

Kim van Schagen¹, Luuk C. Rietveld¹, Robert Babuska¹, Eric Baars•Institutions (1)

Delft University of Technology¹

01 Mar 2008-Chemical Engineering Science

TL;DR: In this article, a particle filter, based on a first-principles model, estimates the state of the softening reactor and a nonlinear model-predictive controller determines the values of the manipulated variables.

39 citations

Journal Article•DOI•

Distributed Kalman filtering for cascaded systems

Zsofia Lendek¹, Robert Babuska¹, B. De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Apr 2008-Engineering Applications of Artificial Intelligence

TL;DR: This paper proposes decomposing a linear process model into a cascade of simpler subsystems and using Kalman filters to estimate the states of these subsystems individually. The performance achieved by the cascaded observers is comparable to, and in certain cases even better than, that of the centralized observer.

35 citations

Journal Article•DOI•

Particle Swarms in Optimization and Control

Jelmer van Ast, Robert Babuska, Bart De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: The flexibility, scalability, and robustness to errors on a local level are intrinsic properties of swarms that have attracted the interest of researchers in applying swarm technology to various problems.

27 citations

Journal Article•DOI•

Model-based operational constraints for fluidised bed crystallisation

K. M. van Schagen, Luuk C. Rietveld¹, Robert Babuska¹, O.J.I. Kramer•Institutions (1)

Delft University of Technology¹

01 Jan 2008-Water Research

TL;DR: The current operation of the treatment plant of Waternet violates the calculated constraints with consequences for effluent quality and corrective maintenance, and the softening process can thus be improved.

23 citations

Journal Article•DOI•

Distributed Controller Design for Dynamic Speed Limit Control Against Shock Waves on Freeways

Andrey P. Popov¹, Robert Babuska², Andreas Hegyi², Herbert Werner¹•Institutions (2)

Hamburg University of Technology¹, Delft University of Technology²

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: In this article, a decentralized feedback controller with a fixed structure was proposed to solve the problem of short traffic jams and reduce the total time spent by 20% compared to the uncontrolled case.

22 citations

Journal Article•DOI•

A semi-supervised method to detect seismic random noise with fuzzy GK clustering

Hosein Hashemi¹, Hosein Hashemi², Abdolrahim Javaherian¹, Robert Babuska²•Institutions (2)

University of Tehran¹, Delft University of Technology²

01 Dec 2008-Journal of Geophysics and Engineering

TL;DR: In this article, a new method to detect random noise in seismic data using fuzzy Gustafson-Kessel (GK) clustering is presented. The method combines unsupervised fuzzy clustering with the knowledge of a seismic specialist, which makes it semi-supervised.

Abstract: We present a new method to detect random noise in seismic data using fuzzy Gustafson–Kessel (GK) clustering. First, using an adaptive distance norm, a matrix is constructed from the observed seismic amplitudes. The next step is to find centres of ellipsoidal clusters and construct a partition matrix which determines the soft decision boundaries between seismic events and random noise. The GK algorithm updates the cluster centres in order to iteratively minimize the cluster variance. Multiplication of the fuzzy membership function with values of each sample yields new sections; we name them 'clustered sections'. The seismic amplitude values of the clustered sections are given in a way to decrease the level of noise in the original noisy seismic input. In pre-stack data, it is essential to study the clustered sections in a f–k domain; finding the quantitative index for weighting the post-stack data needs a similar approach. Using the knowledge of a human specialist together with the fuzzy unsupervised clustering, the method is a semi-supervised random noise detection. The efficiency of this method is investigated on synthetic and real seismic data for both pre- and post-stack data. The results show a significant improvement of the input noisy sections without harming the important amplitude and phase information of the original data. The procedure for finding the final weights of each clustered section should be carefully done in order to keep almost all the evident seismic amplitudes in the output section. The method interactively uses the knowledge of the seismic specialist in detecting the noise.
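
The clustering step described in this abstract can be illustrated with a minimal sketch of the standard Gustafson-Kessel iteration. It is not the authors' code: it assumes 2-D points, fuzziness exponent m = 2, unit cluster volumes, and initial centres supplied by the caller. The function names (`partition_matrix`, `gk_cluster_2d`) and the toy data are hypothetical and chosen only for illustration.

```python
def partition_matrix(d2, m):
    """Fuzzy memberships u[i][k] from squared distances d2[i][k]."""
    p = 1.0 / (m - 1.0)
    u = [[0.0] * len(d2[0]) for _ in d2]
    for k in range(len(d2[0])):
        inv = [row[k] ** (-p) for row in d2]
        total = sum(inv)
        for i in range(len(d2)):
            u[i][k] = inv[i] / total  # each column sums to 1
    return u


def gk_cluster_2d(points, init_centers, m=2.0, n_iter=30, eps=1e-12):
    """Gustafson-Kessel clustering of 2-D points (illustrative sketch)."""
    # Start from plain Euclidean distances to the initial centres.
    d2 = [[max((x - vx) ** 2 + (y - vy) ** 2, eps) for x, y in points]
          for vx, vy in init_centers]
    u = partition_matrix(d2, m)
    centers = list(init_centers)
    for _ in range(n_iter):
        centers, d2 = [], []
        for row in u:
            w = [uk ** m for uk in row]
            sw = sum(w)
            # Cluster centre: membership-weighted mean.
            vx = sum(wk * x for wk, (x, y) in zip(w, points)) / sw
            vy = sum(wk * y for wk, (x, y) in zip(w, points)) / sw
            # Fuzzy covariance matrix [[fxx, fxy], [fxy, fyy]].
            fxx = sum(wk * (x - vx) ** 2 for wk, (x, y) in zip(w, points)) / sw
            fyy = sum(wk * (y - vy) ** 2 for wk, (x, y) in zip(w, points)) / sw
            fxy = sum(wk * (x - vx) * (y - vy)
                      for wk, (x, y) in zip(w, points)) / sw
            det = max(fxx * fyy - fxy * fxy, eps)
            # Adaptive norm A = det(F)^(1/2) * F^-1, which gives
            # ellipsoidal clusters of unit volume in two dimensions.
            s = det ** -0.5
            axx, axy, ayy = s * fyy, -s * fxy, s * fxx
            d2.append([max(axx * (x - vx) ** 2
                           + 2.0 * axy * (x - vx) * (y - vy)
                           + ayy * (y - vy) ** 2, eps) for x, y in points])
            centers.append((vx, vy))
        u = partition_matrix(d2, m)
    return centers, u


# Hypothetical toy data: one cluster elongated along x, one along y.
cluster_a = [(0.5 * i, 0.1 * ((i % 3) - 1)) for i in range(10)]
cluster_b = [(10.0 + 0.1 * ((i % 3) - 1), 10.0 + 0.5 * i) for i in range(10)]
points = cluster_a + cluster_b
centers, u = gk_cluster_2d(points, [points[0], points[-1]])
# Hard labels, taken as the cluster with the largest membership per point.
labels = [max(range(len(u)), key=lambda i: u[i][k]) for k in range(len(points))]
```

Multiplying each row of `u` by the sample values would give one weighted copy of the data per cluster, which is one plausible reading of the "clustered sections" mentioned in the abstract.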

22 citations

Journal Article•DOI•

Adaptive Friction Compensation: Application to a Robotic Manipulator

Witono Susanto¹, Robert Babuska¹, Freek Liefhebber, Ton van der Weiden¹•Institutions (1)

Delft University of Technology¹

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: In this article, a feed-forward model-based friction compensation technique using the LuGre friction model is presented, where an off-line method is given to estimate the model's parameters based on simple ramp-response experiments.

19 citations

Proceedings Article•DOI•

Ant Colony Optimization for optimal control

J. van Ast¹, Robert Babuska¹, B. De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jun 2008

TL;DR: An ACO approach to optimal control is proposed, which requires that a continuous-time, continuous-state model of the system, together with a finite action set, is formulated as a discrete, non-deterministic automaton.

Abstract: Ant Colony Optimization (ACO) has proven to be a very powerful optimization heuristic for Combinatorial Optimization Problems (COPs). It has been demonstrated to work well when applied to various NP-complete problems, such as the traveling salesman problem. In this paper, an ACO approach to optimal control is proposed. This approach requires that a continuous-time, continuous-state model of the system, together with a finite action set, is formulated as a discrete, non-deterministic automaton. The control problem is then translated into a stochastic COP. This method is applied to the time-optimal swing-up and stabilization of a pendulum.

Journal Article•DOI•

Fuzzy Partition Optimization for Approximate Fuzzy Q-Iteration

Lucian Busoniu¹, Damien Ernst², Bart De Schutter¹, Robert Babuska¹•Institutions (2)

Delft University of Technology¹, University of Liège²

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: This paper proposes a technique to optimize the shape of a constant number of basis functions for the approximate, fuzzy Q-iteration algorithm, and measures the actual performance of the computed policies in the task, using simulation from a representative set of initial states.

Proceedings Article•DOI•

Stability analysis and observer design for decentralized TS fuzzy systems

Zsofia Lendek¹, Robert Babuska¹, B. De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jun 2008

TL;DR: This work analyzes the stability of the overall TS system based on the stability of the subsystems and the strength of the interconnection terms, and proposes a decentralized approach to observer design.

Abstract: A large class of nonlinear systems can be well approximated by Takagi-Sugeno (TS) fuzzy models, with linear or affine consequents. It is well-known that the stability of these consequent models does not ensure the stability of the overall fuzzy system. Stability conditions developed for TS fuzzy systems in general rely on the feasibility of an associated system of linear matrix inequalities, whose complexity may grow exponentially with the number of rules. We study distributed systems, where the subsystems are represented as TS fuzzy models. For such systems, a centralized analysis is often unfeasible. We analyze the stability of the overall TS system based on the stability of the subsystems and the strength of the interconnection terms. For naturally distributed applications, such as multi-agent systems, when adding new subsystems "on-line", the construction and tuning of a centralized observer is often intractable. Therefore, we also propose a decentralized approach to observer design. Applications of such systems include distributed process control, traffic networks, and economic systems.

Journal Article•DOI•

Reinforcement Learning for Elevator Control

Xu Yuan¹, Lucian Busoniu², Robert Babuska²•Institutions (2)

ASML Holding¹, Delft University of Technology²

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: The mathematical model of the elevator system is described in detail, making the system easy to re-implement and re-use, and an experimental comparison is made between the performance of the Q-value iteration and Q-learning RL algorithms, when applied to the elevator system.

Journal Article•DOI•

Decentralized Estimation of Overflow Losses in a Hopper-Dredger

Zs. Lendek¹, Robert Babuska¹, J. Braaksma¹, C. de Keizer•Institutions (1)

Delft University of Technology¹

01 Apr 2008-Control Engineering Practice

TL;DR: In this paper, a decomposition of the nonlinear process model into two simpler subsystems is proposed, and a different type of observer is considered for each subsystem, i.e., a particle filter and an unscented Kalman filter.

Proceedings Article•DOI•

A general modeling framework for swarms

J. van Ast¹, Robert Babuska¹, B. De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jun 2008

TL;DR: A general comprehensive swarm framework is introduced and related to the established state of the art, which is a first and important step in the development and analysis of more complex and intelligent swarms.

Abstract: Swarms are characterized by the ability to generate complex behavior from the coupling of simple individuals. While the swarm approach to distributed systems of moving agents is gradually finding a way to engineering applications, a true successful demonstration of an engineered swarm is still missing. One of the reasons for this is the gap between the complexity of the swarms studied in fundamental research and the complexity needed for the application to interesting control problems. In the majority of the research on swarm intelligent systems, the moving agents in the swarm are modeled as simple reactive agents. This model comprises too little intelligence to fully exploit the potential of swarms. In this paper, a general comprehensive swarm framework is introduced and related to the established state of the art. Such a framework is novel and it is a first and important step in the development and analysis of more complex and intelligent swarms.

Proceedings Article•DOI•

Consistency of fuzzy model-based reinforcement learning

Lucian Busoniu¹, Damien Ernst², B. De Schutter¹, Robert Babuska¹•Institutions (2)

Delft University of Technology¹, University of Liège²

01 Jun 2008

TL;DR: This paper proposes an approximate, model-based Q-iteration algorithm that relies on a fuzzy partition of the state space and on a discretization of the action space, and shows that the resulting algorithm is consistent, i.e., that the optimal solution is obtained asymptotically as the approximation accuracy increases.

Abstract: Reinforcement learning (RL) is a widely used paradigm for learning control. Computing exact RL solutions is generally only possible when process states and control actions take values in a small discrete set. In practice, approximate algorithms are necessary. In this paper, we propose an approximate, model-based Q-iteration algorithm that relies on a fuzzy partition of the state space, and on a discretization of the action space. Using assumptions on the continuity of the dynamics and of the reward function, we show that the resulting algorithm is consistent, i.e., that the optimal solution is obtained asymptotically as the approximation accuracy increases. An experimental study indicates that a continuous reward function is also important for a predictable improvement in performance as the approximation accuracy increases.
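
The idea in this abstract can be illustrated with a minimal sketch of Q-iteration over a fuzzy partition. It is not the authors' implementation: it assumes triangular membership functions on an equidistant grid, deterministic dynamics, and a synchronous parameter update. The toy problem (`f`, `rho`, `ACTIONS`, the grid of 11 cores) and all function names are hypothetical.

```python
def make_cores(lo, hi, n):
    """Equidistant centres (cores) of the triangular fuzzy sets."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def memberships(x, cores):
    """Triangular membership degrees of state x; they sum to 1 on the grid."""
    h = cores[1] - cores[0]
    return [max(0.0, 1.0 - abs(x - c) / h) for c in cores]


def q_value(x, j, theta, cores):
    """Approximate Q(x, u_j) as a membership-weighted sum of parameters."""
    return sum(p * row[j] for p, row in zip(memberships(x, cores), theta))


def fuzzy_q_iteration(f, rho, cores, actions, gamma=0.9, tol=1e-8, max_iter=1000):
    """Model-based Q-iteration on the parameters theta[i][j]."""
    theta = [[0.0] * len(actions) for _ in cores]
    for _ in range(max_iter):
        new = []
        for xi in cores:
            row = []
            for u in actions:
                x_next = f(xi, u)  # deterministic model of the process
                best = max(q_value(x_next, j, theta, cores)
                           for j in range(len(actions)))
                row.append(rho(xi, u) + gamma * best)
            new.append(row)
        delta = max(abs(a - b)
                    for r_new, r_old in zip(new, theta)
                    for a, b in zip(r_new, r_old))
        theta = new
        if delta < tol:
            break
    return theta


def greedy_action(x, theta, cores, actions):
    """Discrete action with the largest approximate Q-value at state x."""
    j = max(range(len(actions)), key=lambda k: q_value(x, k, theta, cores))
    return actions[j]


# Hypothetical toy problem: drive a scalar state in [-1, 1] to the origin.
ACTIONS = [-0.2, 0.0, 0.2]


def f(x, u):
    return min(1.0, max(-1.0, x + u))


def rho(x, u):
    return -(f(x, u) ** 2)  # continuous reward, largest at the origin


cores = make_cores(-1.0, 1.0, 11)
theta = fuzzy_q_iteration(f, rho, cores, ACTIONS)
```

In this toy setting, `greedy_action(0.6, theta, cores, ACTIONS)` selects the action that moves the state toward the origin. Refining the grid (more cores, more discrete actions) is the sense in which the approximation accuracy of the abstract would increase.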

Journal Article•DOI•

Cascaded parameter estimation for a water treatment plant using particle filters

Zs. Lendek¹, K. M. van Schagen¹, Robert Babuska¹, A.M.J. Veersma, B. De Schutter¹•Institutions (1)

Delft University of Technology¹

01 Jan 2008-IFAC Proceedings Volumes

TL;DR: In this paper, a particle filter is used to improve the accuracy of pH quality measurements in a water treatment plant. The performance of the particle filter was evaluated on both simulated and real-world data.
