Author

Robert Babuska

Bio: Robert Babuska is an academic researcher at Delft University of Technology. He has contributed to research on fuzzy logic and reinforcement learning, has an h-index of 56, and has co-authored 371 publications receiving 15,388 citations. His previous affiliations include Carnegie Mellon University and Czech Technical University in Prague.


Papers
Proceedings ArticleDOI
26 May 2015
TL;DR: An approximate analytical solution to the otherwise non-integrable double-stance dynamics of the bipedal spring-loaded inverted pendulum (SLIP) is introduced and a control application based on this solution is presented.
Abstract: This paper introduces an approximate analytical solution to the otherwise non-integrable double-stance dynamics of the bipedal spring-loaded inverted pendulum (SLIP). Despite the apparent structural simplicity of the SLIP, the exact analytical solution to its stance dynamics cannot be found. Approximate maps have been proposed for the monoped SLIP runner (encompassing a single-stance phase). Still, even in an approximate form, a solution to the double-stance dynamics of the bipedal SLIP walker remained an open problem. We propose a double-stance map that can be readily utilized, especially in the design of control systems for active dynamic walking. The accuracy of the derived map over a feasible range of locomotion properties is analyzed numerically, and a control application based on this solution is presented. Simulations for an arbitrarily chosen energy level reveal that the devised controller considerably enlarges the stable walking domain of the standard SLIP.
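
Because the stance dynamics admit no closed-form solution, they are usually studied numerically. As an illustration only, the following is a hedged sketch of the generic textbook single-stance SLIP equations in polar coordinates (point mass on a massless spring leg, foot pinned at the origin) integrated numerically; the parameter values and the function names are illustrative assumptions, not the paper's approximate map.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameters (not from the paper): mass, leg stiffness,
# leg rest length, gravity.
m, k, r0, g = 80.0, 20000.0, 1.0, 9.81

def stance_dynamics(t, s):
    """State s = [r, r_dot, theta, theta_dot]; theta measured from vertical."""
    r, r_dot, th, th_dot = s
    r_ddot = r * th_dot**2 - g * np.cos(th) + (k / m) * (r0 - r)
    th_ddot = (g * np.sin(th) - 2.0 * r_dot * th_dot) / r
    return [r_dot, r_ddot, th_dot, th_ddot]

def energy(s):
    """Total mechanical energy; conserved during stance (no actuation, no losses)."""
    r, r_dot, th, th_dot = s
    kinetic = 0.5 * m * (r_dot**2 + (r * th_dot)**2)
    potential = m * g * r * np.cos(th) + 0.5 * k * (r0 - r)**2
    return kinetic + potential

s0 = [0.98, -0.5, -0.3, 1.5]   # touchdown state with the leg slightly compressed
sol = solve_ivp(stance_dynamics, (0.0, 0.3), s0, rtol=1e-10, atol=1e-12)
```

Since no analytical benchmark exists, energy conservation is the natural sanity check for such an integration: `energy(sol.y[:, -1])` should agree with `energy(s0)` to within the solver tolerance.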

11 citations

Proceedings ArticleDOI
01 Jun 2016
TL;DR: A novel multivariable control strategy based on PI auto-tuning is proposed by combining the aforementioned model with optimization of the desired (time-varying) equilibria and zone setpoint temperature, which can lead to important energy savings.
Abstract: The field of energy efficiency in buildings offers challenging opportunities from a control point of view. Heating, Ventilation and Air-Conditioning (HVAC) units in buildings must be accurately controlled so as to ensure the occupants' comfort while reducing energy consumption. While the existing HVAC models consist of only one or a few HVAC components, this work involves the development of a complete HVAC model for one thermal zone. Also, a novel multivariable control strategy based on PI auto-tuning is proposed by combining the aforementioned model with optimization of the desired (time-varying) equilibria. One of the advantages of the proposed PI strategy is the use of time-varying input equilibria and zone setpoint temperature, which can lead to important energy savings. A comparison with a baseline control strategy with constant setpoint temperature is presented: the comparison results show good tracking performance and improved energy efficiency in terms of HVAC energy consumption.
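
The idea of a PI controller that corrects around a time-varying input equilibrium can be sketched as follows. This is a minimal, hedged illustration, not the authors' controller: the class name, gains, and the anti-windup scheme are assumptions for demonstration only.

```python
# Minimal sketch: discrete-time PI control around a supplied input
# equilibrium u_eq, with output saturation and simple anti-windup.
class PIController:
    def __init__(self, kp, ki, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0

    def step(self, setpoint, measurement, u_eq=0.0):
        # u_eq is the (possibly time-varying) input equilibrium; the PI
        # terms only correct deviations around it.
        error = setpoint - measurement
        self.integral += error * self.dt
        u = u_eq + self.kp * error + self.ki * self.integral
        # Saturate; on saturation, undo the integral increment (anti-windup).
        u_sat = min(max(u, self.u_min), self.u_max)
        if u != u_sat:
            self.integral -= error * self.dt
        return u_sat
```

Driving a toy first-order zone model `T += -0.1*(T - T_out) + 1.5*u` with this controller and a setpoint of 21 °C brings the zone temperature to the setpoint while keeping the input within its saturation limits.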

11 citations

Proceedings ArticleDOI
01 Dec 2014
TL;DR: The results show that the learning module can rapidly augment the designed sequential composition with new control policies so that the supervisor can handle unpredicted situations online.
Abstract: Sequential composition is an effective approach to address the control of complex dynamical systems. However, it is not designed to cope with unforeseen situations that might occur during runtime. This paper extends sequential composition control by learning new policies. A learning module based on reinforcement learning is added to the traditional sequential composition, allowing the online creation of new control policies in a short amount of time, on an as-needed basis. During learning, the domain of attraction (DOA) of the new control policy is continuously monitored. Hence, the learning process executes only until the supervisor is able to compose the new control policy with the designed controllers via the overlap of DOAs. Estimating the DOAs of the learned controllers is achieved by solving an optimization problem. The proposed strategy has been simulated on a nonlinear system. The results show that the learning module can rapidly augment the designed sequential composition with new control policies so that the supervisor can handle unpredicted situations online.
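
The supervisor's switching logic in sequential composition can be sketched very compactly. This is a hedged illustration with hypothetical names (`Controller`, `supervisor`), not the paper's implementation: each controller carries a predicate for its domain of attraction, and the supervisor activates the highest-priority controller whose DOA contains the current state.

```python
# Hedged sketch of sequential composition: controllers are ordered by
# priority (closest to the overall goal first); the supervisor switches
# to the first one whose domain of attraction (DOA) contains the state.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Controller:
    name: str
    in_doa: Callable[[float], bool]   # does this controller's DOA contain x?
    policy: Callable[[float], float]  # control action u = policy(x)

def supervisor(controllers, x) -> Optional[Controller]:
    """Return the highest-priority controller whose DOA contains x.

    Returns None when no prepared controller covers x -- precisely the
    situation in which the paper's learning module would create a new
    policy online until its DOA overlaps an existing one.
    """
    for c in controllers:
        if c.in_doa(x):
            return c
    return None
```

For example, with a local "goal" controller valid for |x| < 1 and an "approach" controller valid for |x| < 5, a state x = 3 activates the approach controller, x = 0.5 activates the goal controller, and x = 10 is uncovered, triggering learning.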

11 citations

Proceedings ArticleDOI
23 Jul 2007
TL;DR: This paper proposes a fuzzy approximation structure for the Q-value iteration algorithm, and shows that the resulting algorithm is convergent, and proposes a modified, serial version of the algorithm that is guaranteed to converge at least as fast as the original algorithm.
Abstract: Reinforcement learning (RL) is a learning control paradigm that provides well-understood algorithms with good convergence and consistency properties. Unfortunately, these algorithms require that process states and control actions take only discrete values. Approximate solutions using fuzzy representations have been proposed in the literature for the case when the states and possibly the actions are continuous. However, the link between these mainly heuristic solutions and the larger body of work on approximate RL, including convergence results, has not been made explicit. In this paper, we propose a fuzzy approximation structure for the Q-value iteration algorithm, and show that the resulting algorithm is convergent. The proof is based on an extension of previous results in approximate RL. We then propose a modified, serial version of the algorithm that is guaranteed to converge at least as fast as the original algorithm. An illustrative simulation example is also provided.
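
The core of the approach, representing the Q-function over a continuous state space with fuzzy membership functions and iterating on the parameters at the membership-function cores, can be sketched as follows. This is a hedged toy illustration of fuzzy Q-value iteration on a 1-D problem with made-up dynamics and rewards, not the paper's exact algorithm or proof setting.

```python
import numpy as np

# Toy fuzzy Q-value iteration: triangular fuzzy sets over a 1-D state
# space, a discrete action set, and Q-parameters theta[i, a] attached to
# the fuzzy-set cores. All dynamics/reward choices here are illustrative.
centers = np.linspace(-1.0, 1.0, 11)     # cores of the triangular fuzzy sets
actions = np.array([-0.2, 0.0, 0.2])     # discrete action set
gamma = 0.9

def membership(x):
    """Triangular membership degrees of state x (normalized to sum to 1)."""
    mu = np.maximum(0.0, 1.0 - np.abs(x - centers) / 0.2)
    return mu / mu.sum()

def step(x, u):
    """Toy dynamics: the state moves by the action, clipped to [-1, 1];
    the reward favors staying near the origin."""
    x_next = np.clip(x + u, -1.0, 1.0)
    return x_next, -abs(x_next)

theta = np.zeros((len(centers), len(actions)))  # fuzzy Q parameters
for _ in range(200):                            # synchronous Q-iteration sweeps
    new_theta = np.empty_like(theta)
    for i, xc in enumerate(centers):
        for a, u in enumerate(actions):
            x_next, r = step(xc, u)
            # Value of the next state, interpolated through the fuzzy basis.
            new_theta[i, a] = r + gamma * np.max(membership(x_next) @ theta)
    theta = new_theta
```

After convergence, the greedy policy at an arbitrary continuous state x is `actions[np.argmax(membership(x) @ theta)]`; for this toy problem it pushes the state toward the origin from either side.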

11 citations

Book ChapterDOI
01 Jan 1999
TL;DR: Fuzzy sets, the foundation of fuzzy control, were introduced thirty years ago as a way of expressing non-probabilistic uncertainties and have since found applications in database management, operations analysis, decision support systems, signal processing, data classification, computer vision, etc.
Abstract: Fuzzy sets, the foundation of fuzzy control, were introduced thirty years ago (Zadeh, 1965) as a way of expressing non-probabilistic uncertainties. Since then, fuzzy set theory has developed and found applications in database management, operations analysis, decision support systems, signal processing, data classification, computer vision, etc. The application area that has attracted the most attention is, however, control. In 1974, the first successful application of fuzzy logic to control was reported (Mamdani, 1974). Control of cement kilns was an early industrial application (Holmblad and Ostergaard, 1982). Since the first consumer product using fuzzy logic was marketed in 1987, the use of fuzzy control has increased substantially. A number of CAD environments for fuzzy control design have emerged, together with VLSI hardware for fast execution. Fuzzy control is being applied industrially in an increasing number of cases, e.g., (Froese, 1993; Hellendoorn, 1993; Bonissone, 1994; Hirota, 1993; Terano et al., 1994).

11 citations


Cited by
Journal ArticleDOI
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning, and evolutionary computation, as well as indirect search for short programs encoding deep and large networks.

14,635 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Apr 2003
TL;DR: The EnKF has a large user group, and numerous publications have discussed its applications and theoretical aspects; this paper reviews those results and presents new ideas and alternative interpretations that further explain the success of the EnKF.
Abstract: The purpose of this paper is to provide a comprehensive presentation and interpretation of the Ensemble Kalman Filter (EnKF) and its numerical implementation. The EnKF has a large user group, and numerous publications have discussed applications and theoretical aspects of it. This paper reviews the important results from these studies and also presents new ideas and alternative interpretations which further explain the success of the EnKF. In addition to providing the theoretical framework needed for using the EnKF, there is also a focus on the algorithmic formulation and optimal numerical implementation. A program listing is given for some of the key subroutines. The paper also touches upon specific issues such as the use of nonlinear measurements, in situ profiles of temperature and salinity, and data which are available with high frequency in time. An ensemble based optimal interpolation (EnOI) scheme is presented as a cost-effective approach which may serve as an alternative to the EnKF in some applications. A fairly extensive discussion is devoted to the use of time correlated model errors and the estimation of model bias.
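
In the same spirit as the paper's program listings, the stochastic-EnKF analysis step for a linear observation operator can be sketched in a few lines. This is a hedged illustration following the common textbook formulation (perturbed observations, sample covariance from ensemble anomalies); the function name and the example numbers are assumptions, not Evensen's code.

```python
import numpy as np

def enkf_analysis(X, H, y, R, rng):
    """Stochastic EnKF analysis step.

    X : (n, N) ensemble of states, H : (m, n) linear observation operator,
    y : (m,) observation, R : (m, m) observation-error covariance.
    """
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)   # ensemble anomalies
    P = A @ A.T / (N - 1)                   # sample forecast covariance
    S = H @ P @ H.T + R                     # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)          # Kalman gain
    # Perturb the observation per member so the analysis ensemble
    # carries the correct posterior spread.
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X + K @ (Y - H @ X)

# Scalar example: prior ~ N(0, 1), observe y = 1.0 with error variance 0.25.
rng = np.random.default_rng(0)
X_prior = rng.standard_normal((1, 1000))
X_post = enkf_analysis(X_prior, H=np.array([[1.0]]),
                       y=np.array([1.0]), R=np.array([[0.25]]), rng=rng)
```

In the scalar example the gain is roughly P/(P + R) ≈ 0.8, so the analysis mean lands between the prior mean and the observation, and the ensemble spread shrinks.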

2,975 citations

Journal ArticleDOI
TL;DR: This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.
Abstract: Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide inspiration, impact, and validation for developments in reinforcement learning. The relationship between the disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning and notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail we demonstrate how reinforcement learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.

2,391 citations