Author

Robert Babuska

Bio: Robert Babuska is an academic researcher from Delft University of Technology. The author has contributed to research on topics including Fuzzy logic and Reinforcement learning. The author has an h-index of 56 and has co-authored 371 publications receiving 15,388 citations. Previous affiliations of Robert Babuska include Carnegie Mellon University and Czech Technical University in Prague.


Papers
Journal ArticleDOI
TL;DR: This work shows that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases, and proves that the asynchronous algorithm converges at least as fast as the synchronous one.
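The synchronous/asynchronous distinction can be sketched in a few lines. Below is a minimal illustration of fuzzy Q-iteration on a toy 1-D deterministic MDP; the dynamics, triangular membership functions, and constants are illustrative assumptions, not the paper's benchmarks.

```python
import numpy as np

# Minimal sketch of fuzzy Q-iteration on a toy 1-D deterministic MDP.
# Q(x, u) is approximated as sum_i mu_i(x) * theta[i, u], with triangular
# membership functions centered on a grid of states. The MDP, membership
# widths, and constants are illustrative assumptions, not the paper's setup.

centers = np.linspace(0.0, 1.0, 11)      # membership-function centers
actions = (-0.1, 0.1)                    # discrete action set
gamma = 0.9

def memberships(x):
    """Normalized triangular memberships over `centers`."""
    mu = np.maximum(0.0, 1.0 - np.abs(x - centers) / 0.1)
    return mu / mu.sum()

def step(x, u):
    """Toy dynamics: move by u, clipped to [0, 1]; reward peaks at x = 1."""
    x2 = float(np.clip(x + u, 0.0, 1.0))
    return x2, -abs(1.0 - x2)

def q_value(theta, x, a):
    return memberships(x) @ theta[:, a]

def fuzzy_q_iteration(sweeps=200, asynchronous=True):
    theta = np.zeros((len(centers), len(actions)))
    for _ in range(sweeps):
        # Synchronous sweeps back up from a frozen copy of the parameters;
        # asynchronous sweeps reuse freshly updated entries immediately.
        source = theta if asynchronous else theta.copy()
        for i, xi in enumerate(centers):
            for a, u in enumerate(actions):
                x2, r = step(xi, u)
                theta[i, a] = r + gamma * max(q_value(source, x2, b)
                                              for b in range(len(actions)))
    return theta
```

Because the normalized memberships sum to one, each sweep is a contraction with factor gamma, so both variants reach the same fixed point; the asynchronous sweep tends to get there in fewer sweeps because later backups within a sweep already see earlier ones.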

62 citations

Proceedings ArticleDOI
03 Dec 2010
TL;DR: This work presents two novel temporal difference learning algorithms for problems with control delay that improve learning performance by taking the control delay into account and outperform classical TD learning algorithms while maintaining low computational complexity.
Abstract: Robots controlled by Reinforcement Learning (RL) are still rare. A core challenge to the application of RL to robotic systems is to learn despite the existence of control delay - the delay between measuring a system's state and acting upon it. Control delay is always present in real systems. In this work, we present two novel temporal difference (TD) learning algorithms for problems with control delay. These algorithms improve learning performance by taking the control delay into account. We test our algorithms in a gridworld, where the delay is an integer multiple of the time step, as well as in the simulation of a robotic system, where the delay can have any value. In both tests, our proposed algorithms outperform classical TD learning algorithms, while maintaining low computational complexity.
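One standard way to take control delay into account - not necessarily the paper's two algorithms - is to augment the learner's state with the action still "in flight", which restores the Markov property when the delay equals one time step. A minimal tabular sketch under that assumption (the gridworld, rewards, and constants are illustrative):

```python
import random

# Sketch of tabular Q-learning on a 1-D gridworld with a one-step control
# delay: the action chosen now only takes effect at the next step. The state
# is augmented with the pending action so the process stays Markov.
# Gridworld layout, rewards, and constants are illustrative assumptions.

N, GOAL = 7, 6                     # positions 0..6; goal at the right end
ACTIONS = (0, 1)                   # 0 = move left, 1 = move right
alpha, gamma, eps = 0.2, 0.95, 0.1
rng = random.Random(0)

Q = {}                             # (position, pending_action) -> {action: value}

def q(s, a):
    return Q.setdefault(s, {b: 0.0 for b in ACTIONS})[a]

def env_step(pos, effective_action):
    pos = max(0, min(N - 1, pos + (1 if effective_action else -1)))
    done = pos == GOAL
    return pos, (1.0 if done else -0.01), done

for _ in range(2000):
    pos, pending = 0, 1            # start on the left with a dummy pending action
    s = (pos, pending)
    for _ in range(50):
        a = (rng.choice(ACTIONS) if rng.random() < eps
             else max(ACTIONS, key=lambda b: q(s, b)))
        pos, r, done = env_step(pos, pending)   # the delayed action lands now
        s2 = (pos, a)                           # chosen action becomes pending
        target = r if done else r + gamma * max(q(s2, b) for b in ACTIONS)
        Q[s][a] = (1 - alpha) * q(s, a) + alpha * target
        s, pending = s2, a
        if done:
            break
```

With the delay folded into the state, the standard TD update applies unchanged; an integer delay of d steps would extend the augmentation to a buffer of the d pending actions.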

62 citations

Journal ArticleDOI
TL;DR: A comprehensive review of the current learning and adaptive control methodologies that have been adapted specifically to PH systems; for each method, the changes from the general setting due to the PH model are highlighted, followed by a detailed presentation of the respective control algorithm.
Abstract: Port-Hamiltonian (PH) theory is a novel but well-established modeling framework for nonlinear physical systems. Due to the emphasis on the physical structure and modular framework, PH modeling has become a prime focus in system theory. This has led to a considerable research interest in the control of PH systems, resulting in numerous nonlinear control techniques. General nonlinear control methodologies are classified in a spectrum from model-based to model-free, where adaptation and learning typically lie close to the end of the range. Various articles and monographs have provided a detailed overview of model-based control techniques on PH models, but no survey is specifically dedicated to the learning and adaptive control methods that can benefit from the PH structure. To this end, we provide a comprehensive review of the current learning and adaptive control methodologies that have been adapted specifically to PH systems. After establishing the required theoretical background, we elaborate on various general machine learning, iterative learning, and adaptive control techniques and their application to PH systems. For each method, we highlight the changes from the general setting due to the PH model, followed by a detailed presentation of the respective control algorithm. In general, the advantages of using PH models in learning and adaptive controllers are: i) Prior knowledge in the form of a PH model speeds up the learning. ii) In some instances new stability or convergence guarantees are obtained by having a PH model. iii) The resulting control laws can be interpreted in the context of physical systems. We conclude the paper with notes on open research issues.
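The input-state-output PH form such surveys build on, x_dot = (J - R) dH/dx + g u with y = g^T dH/dx, can be illustrated on a mass-spring-damper; the parameter values below are arbitrary illustration choices.

```python
import numpy as np

# Sketch of an input-state-output port-Hamiltonian (PH) model,
#   x_dot = (J - R) dH/dx + g u,   y = g^T dH/dx,
# for a mass-spring-damper with state x = [position q, momentum p].
# Parameter values are arbitrary illustration choices.

m, k, c = 1.0, 2.0, 0.5
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # skew-symmetric interconnection
R = np.array([[0.0, 0.0], [0.0, c]])      # positive semi-definite dissipation
g = np.array([0.0, 1.0])                  # input force enters the momentum

def H(x):
    """Stored energy: kinetic + potential."""
    q, p = x
    return p**2 / (2 * m) + k * q**2 / 2

def grad_H(x):
    q, p = x
    return np.array([k * q, p / m])

def simulate(x0, u_fn, dt=1e-3, steps=5000):
    """Explicit-Euler rollout of the PH dynamics."""
    x = np.array(x0, dtype=float)
    for n in range(steps):
        x = x + dt * ((J - R) @ grad_H(x) + g * u_fn(n * dt))
    return x

x_end = simulate([1.0, 0.0], u_fn=lambda t: 0.0)
```

With u = 0, the energy balance gives H_dot = -dH/dx^T R dH/dx <= 0, so the stored energy can only decrease - the passivity property that learning and adaptive PH controllers typically exploit.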

60 citations

01 Jan 1998
TL;DR: Fuzzy modeling of dynamic systems is addressed, as well as methods to construct fuzzy models from knowledge and data (measurements); some engineering applications of fuzzy modeling are also reviewed.
Abstract: This text provides an introduction to the use of fuzzy sets and fuzzy logic for the approximation of functions and modeling of static and dynamic systems. The concept of a fuzzy system is first explained. Afterwards, the motivation and practical relevance of fuzzy modeling are highlighted. Two types of rule-based fuzzy models are described: the linguistic (Mamdani) model and the Takagi–Sugeno model. For each model, the structure of the rules, the inference and defuzzification methods are presented. Fuzzy modeling of dynamic systems is addressed, as well as the methods to construct fuzzy models from knowledge and data (measurements). Illustrative examples are given throughout the text. At the end, homework problems are included. MATLAB programs implementing some of the examples are available from the author. The reader is encouraged to study and possibly modify these examples in order to get a better insight into the methods presented. Preface. Prerequisites: This text provides an introduction to the use of fuzzy sets and fuzzy logic for the approximation of functions and modeling of static and dynamic systems. It is assumed that the reader has basic knowledge of set and fuzzy set theory (membership functions, operations on fuzzy sets – union, intersection and complement, fuzzy relations, max-min composition, extension principle), mathematical analysis (univariate and multivariate functions, composition of functions), and linear algebra (systems of linear equations, least-squares solution). Organization. The material is organized in five sections: In the Introduction, different modeling paradigms are first presented. Then, the concept of a fuzzy system is explained and the motivation and practical relevance of fuzzy modeling are highlighted. Section 2 describes two types of rule-based fuzzy models: the linguistic (Mamdani) model and the Takagi–Sugeno model.
For each model, the structure of the rules, the inference and defuzzification methods are presented. At the end of this section, fuzzy modeling of dynamic systems is addressed. In Section 3, methods to construct fuzzy models from knowledge and numerical data are presented. Section 4 reviews some engineering applications of fuzzy modeling, and the concluding Section 5 gives a short summary. Illustrative examples are provided throughout the text, and at the end, homework problems are included. Some of the numerical examples given have been implemented in MATLAB. The code is available from the author on request. The reader is encouraged to study and possibly modify these examples in order to get a better insight into the methods presented. A subject index …
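A two-rule Takagi–Sugeno model of the kind described above fits in a few lines. The memberships and local linear consequents below are illustrative: they blend the tangents of y = x² taken at the ends of [0, 2].

```python
import numpy as np

# Sketch of a two-rule Takagi–Sugeno fuzzy model:
#   R1: if x is Small then y = 0            (tangent of x^2 at x = 0)
#   R2: if x is Large then y = 4*x - 4      (tangent of x^2 at x = 2)
# Memberships and consequents are illustrative choices for blending two
# local linear models of y = x^2 on [0, 2].

def mu_small(x):
    return np.clip((2.0 - x) / 2.0, 0.0, 1.0)   # linear membership on [0, 2]

def mu_large(x):
    return np.clip(x / 2.0, 0.0, 1.0)

def ts_output(x):
    """Weighted mean of the rule consequents (fuzzy-mean defuzzification)."""
    w1, w2 = mu_small(x), mu_large(x)
    y1 = 0.0                    # local model where "Small" fully fires
    y2 = 4.0 * x - 4.0          # local model where "Large" fully fires
    return (w1 * y1 + w2 * y2) / (w1 + w2)
```

The output interpolates smoothly between the two local models and is exact where a single rule fully fires (x = 0 and x = 2); in between, blending tangents deviates from x² by at most x(2 − x) ≤ 1, which is the usual price of coarse rule coverage.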

60 citations

Proceedings ArticleDOI
11 Apr 2011
TL;DR: An overview of methods for approximate RL, starting from their dynamic programming roots and organizing them into three major classes: approximate value iteration, policy iteration, and policy search; the different categories of methods are compared and possible ways to enhance the reviewed algorithms are outlined.
Abstract: Reinforcement learning (RL) allows agents to learn how to optimally interact with complex environments. Fueled by recent advances in approximation-based algorithms, RL has obtained impressive successes in robotics, artificial intelligence, control, operations research, etc. However, the scarcity of survey papers about approximate RL makes it difficult for newcomers to grasp this intricate field. With the present overview, we take a step toward alleviating this situation. We review methods for approximate RL, starting from their dynamic programming roots and organizing them into three major classes: approximate value iteration, policy iteration, and policy search. Each class is subdivided into representative categories, highlighting among others offline and online algorithms, policy gradient methods, and simulation-based techniques. We also compare the different categories of methods, and outline possible ways to enhance the reviewed algorithms.
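As one concrete instance of the approximate value iteration class, here is a fitted Q-iteration sketch on a toy 1-D regulation task. The dynamics, radial-basis features, and constants are illustrative assumptions; the method is batch: all transitions are sampled up front and each sweep is a regression onto the Bellman targets.

```python
import numpy as np

# Sketch of fitted Q-iteration (batch approximate value iteration) on a toy
# 1-D regulation task. Dynamics, features, and constants are illustrative.

rng = np.random.default_rng(0)
gamma, actions = 0.9, np.array([-1.0, 1.0])
centers = np.linspace(-1.0, 1.0, 9)

def phi(x):
    """Gaussian radial-basis features of the scalar state."""
    return np.exp(-((x - centers) ** 2) / (2 * 0.25 ** 2))

def step(x, u):
    x2 = float(np.clip(x + 0.2 * u, -1.0, 1.0))
    return x2, -x2 ** 2            # reward: stay near the origin

# Batch of transitions sampled uniformly over the state-action space.
X = rng.uniform(-1.0, 1.0, size=500)
A = rng.integers(0, 2, size=500)
X2, R = zip(*(step(x, actions[a]) for x, a in zip(X, A)))
X2, R = np.array(X2), np.array(R)

Phi = np.array([phi(x) for x in X])
Phi2 = np.array([phi(x) for x in X2])
W = np.zeros((2, len(centers)))          # one weight vector per action

for _ in range(60):                      # fitted Q-iteration sweeps
    targets = R + gamma * np.max(Phi2 @ W.T, axis=1)
    for a in range(2):
        mask = A == a
        W[a], *_ = np.linalg.lstsq(Phi[mask], targets[mask], rcond=None)

def greedy(x):
    """Read off the greedy policy from the fitted Q-function."""
    return actions[int(np.argmax([phi(x) @ W[a] for a in range(2)]))]
```

Each sweep solves one least-squares problem per discrete action; the resulting greedy policy pushes the state toward the origin, where the reward is highest.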

59 citations


Cited by
Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Apr 2003
TL;DR: The EnKF has a large user group, and numerous publications have discussed its applications and theoretical aspects; this paper reviews the important results from these studies and also presents new ideas and alternative interpretations which further explain the success of the EnKF.
Abstract: The purpose of this paper is to provide a comprehensive presentation and interpretation of the Ensemble Kalman Filter (EnKF) and its numerical implementation. The EnKF has a large user group, and numerous publications have discussed applications and theoretical aspects of it. This paper reviews the important results from these studies and also presents new ideas and alternative interpretations which further explain the success of the EnKF. In addition to providing the theoretical framework needed for using the EnKF, there is also a focus on the algorithmic formulation and optimal numerical implementation. A program listing is given for some of the key subroutines. The paper also touches upon specific issues such as the use of nonlinear measurements, in situ profiles of temperature and salinity, and data which are available with high frequency in time. An ensemble based optimal interpolation (EnOI) scheme is presented as a cost-effective approach which may serve as an alternative to the EnKF in some applications. A fairly extensive discussion is devoted to the use of time correlated model errors and the estimation of model bias.
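The analysis (measurement-update) step at the core of the EnKF can be sketched as follows, in the perturbed-observations form with a linear observation operator H; the dimensions and noise levels are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

# Sketch of the EnKF analysis step with perturbed observations and a linear
# observation operator H. Dimensions and noise levels are illustrative.

rng = np.random.default_rng(1)
n, m, N = 2, 1, 500                      # state dim, obs dim, ensemble size
H = np.array([[1.0, 0.0]])               # observe the first state component
R = np.array([[0.04]])                   # observation-error covariance

def enkf_analysis(Xf, y):
    """Update the forecast ensemble Xf (n x N) with observation y (m,)."""
    xbar = Xf.mean(axis=1, keepdims=True)
    Xp = Xf - xbar                                   # ensemble perturbations
    P = Xp @ Xp.T / (N - 1)                          # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    D = y[:, None] + rng.multivariate_normal(
        np.zeros(m), R, size=N).T                    # perturbed observations
    return Xf + K @ (D - H @ Xf)

# Forecast ensemble centered away from the truth; one observation of the
# first component pulls the analysis mean toward it and shrinks the spread.
truth = np.array([1.0, -0.5])
Xf = rng.multivariate_normal([0.0, 0.0], np.eye(n), size=N).T
y = H @ truth                                        # noise-free obs for the demo
Xa = enkf_analysis(Xf, y)
```

Each member is nudged toward its own perturbed copy of the observation by the gain K computed from the sample covariance; the reduced ensemble spread after the update is how the filter represents the smaller analysis uncertainty.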

2,975 citations

Journal ArticleDOI
TL;DR: This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.
Abstract: Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning and notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail we demonstrate how reinforcement learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.

2,391 citations