Author

Howard M. Schwartz

Bio: Howard M. Schwartz is an academic researcher from Carleton University. The author has contributed to research in the topics Reinforcement learning and Adaptive control. The author has an h-index of 18 and has co-authored 121 publications receiving 1292 citations.


Papers
Book
11 Aug 2014
TL;DR: This book presents a framework for understanding a variety of methods and approaches in multi-agent machine learning, with an emphasis on reinforcement learning, including several forms of multi-agent Q-learning.
Abstract: The book begins with a chapter on traditional methods of supervised learning, covering recursive least squares learning, mean square error methods, and stochastic approximation. Chapter 2 covers single-agent reinforcement learning. Topics include learning value functions, Markov games, and TD learning with eligibility traces. Chapter 3 discusses two-player games, including two-player matrix games with both pure and mixed strategies. Numerous algorithms and examples are presented. Chapter 4 covers learning in multi-player games, stochastic games, and Markov games, focusing on learning in multi-player grid games: two-player grid games, Q-learning, and Nash Q-learning. Chapter 5 discusses differential games, including multi-player differential games, actor-critic structure, adaptive fuzzy control and fuzzy inference systems, the evader pursuit game, and the defending-a-territory game. Chapter 6 discusses new ideas on learning within robotic swarms and the innovative idea of the evolution of personality traits. The book provides a framework for understanding a variety of methods and approaches in multi-agent machine learning; discusses methods of reinforcement learning, including a number of forms of multi-agent Q-learning; and is applicable to research professors and graduate students studying electrical and computer engineering, computer science, and mechanical and aerospace engineering.

93 citations
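
The book entry above lists Q-learning among its topics. As a rough illustration only, the following is a minimal sketch of the standard single-agent tabular Q-learning update, not code taken from the book. The corridor environment, the function name `q_learning_corridor`, and all parameter values are assumptions made for this example, and the sketch sweeps over every state-action pair instead of using an exploration policy.

```python
def q_learning_corridor(n_states=5, alpha=0.5, gamma=0.9, sweeps=200):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; the rightmost state is terminal and
    entering it yields reward 1. Every other transition yields 0.
    """
    LEFT, RIGHT = 0, 1
    goal = n_states - 1
    Q = [[0.0, 0.0] for _ in range(n_states)]

    def step(s, a):
        # Deterministic move; the left wall keeps the agent in state 0.
        s_next = max(s - 1, 0) if a == LEFT else s + 1
        reward = 1.0 if s_next == goal else 0.0
        return s_next, reward

    for _ in range(sweeps):
        # Simplification: visit every non-terminal (state, action) pair
        # in turn instead of following an epsilon-greedy policy.
        for s in range(goal):
            for a in (LEFT, RIGHT):
                s_next, r = step(s, a)
                # Bootstrapped target; no future value at the terminal state.
                target = r if s_next == goal else r + gamma * max(Q[s_next])
                Q[s][a] += alpha * (target - Q[s][a])
    return Q


if __name__ == "__main__":
    for state, (q_left, q_right) in enumerate(q_learning_corridor()):
        print(state, round(q_left, 3), round(q_right, 3))
```

In this toy problem the learned values for moving right approach `gamma` raised to the number of remaining steps minus one, so the greedy policy moves right in every non-terminal state.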

01 Jan 2008
TL;DR: This paper reviews the fundamental physical properties of crystal oscillators and identifies all significant frequency-perturbing stimuli, as a first step toward determining the overall synchronization accuracy achievable by the system.
Abstract: Quartz crystal based oscillators are used as clock sources in the synchronization and syntonization of distributed systems to a common time or frequency scale. One such system is that of a cellular network in which base station transceivers are operated within a specified time or frequency accuracy with reference to a system reference. The accuracy of the entrainment of the distributed clocks to the reference clock is subject to the design of the servo control system. In the event the servo fails the slave clock accuracy is a function of the local environmental and electrical stimuli applied to the clock. As loss of the servo signal is a practical issue in a real system, this ultimate system entrainment accuracy is dependent on the accuracy with which the free running clocks can be corrected. It is the subject of the current paper to review the fundamental physical properties of crystal oscillators and in so doing determine all significant frequency perturbing stimuli. Identification and quantification of these stimuli in terms of analytical expressions is the first stage in the creation of an accurate clock model suitable for compensation of the clock in the absence of the servo signal from the reference. Thus a fundamental understanding of the parameters affecting the clock drift becomes paramount to determining the overall synchronization accuracy achievable by the system.

73 citations

Journal ArticleDOI
TL;DR: A novel algorithm for directional forgetting is proposed based on a matrix decomposition method, which performs exponential forgetting according to the direction of the data vector, thus preventing the problem known as estimator windup.

63 citations
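
The entry above refers to exponential forgetting and estimator windup. The following is a minimal sketch of ordinary recursive least squares with a uniform exponential forgetting factor, written to show the windup problem that directional forgetting is meant to prevent. It is not the paper's matrix-decomposition algorithm; the function names, the test signal, and the parameter values are assumptions chosen for illustration.

```python
import math


def rls_step(theta, P, phi, y, lam=0.98):
    """One RLS update with exponential forgetting factor lam (0 < lam <= 1)."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta = [theta[i] + gain[i] * err for i in range(n)]
    # Covariance update; P is symmetric, so phi^T P equals P_phi transposed.
    P = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
         for i in range(n)]
    return theta, P


def run_demo(lam=0.98):
    true_theta = [2.0, -1.0]  # hypothetical parameters to recover
    theta = [0.0, 0.0]
    P = [[1000.0, 0.0], [0.0, 1000.0]]

    # Phase 1: both directions of the regressor are excited.
    for k in range(1, 51):
        phi = [math.sin(0.5 * k), math.cos(0.3 * k)]
        y = true_theta[0] * phi[0] + true_theta[1] * phi[1]
        theta, P = rls_step(theta, P, phi, y, lam)
    p_before = P[1][1]

    # Phase 2: only the first direction is excited, so uniform forgetting
    # keeps discounting information about the second parameter.
    for _ in range(300):
        theta, P = rls_step(theta, P, [1.0, 0.0], true_theta[0], lam)
    return theta, p_before, P[1][1]


if __name__ == "__main__":
    theta, p_before, p_after = run_demo()
    print("estimate:", theta)
    print("P[1][1] before and after poor excitation:", p_before, p_after)
```

With uniform forgetting, the covariance entry for the unexcited direction keeps growing during the second phase, which is the windup effect. A directional scheme would instead forget only along the direction of the incoming data vector.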

Journal ArticleDOI
TL;DR: A new systematic approach to characterize and compute the stability bound of a singularly perturbed linear system based on the feedback system representation of an additional matrix perturbation problem is provided.
Abstract: In this note, we will provide a new systematic approach to characterize and compute the stability bound of a singularly perturbed linear system. The approach is based on the feedback system representation of an additional matrix perturbation problem. The idea is to change the stability bound problem to the stability problem of an underlying feedback system. This approach allows multiple choices in formulating the underlying feedback system and thus has the potential of characterizing and computing the stability bound in a number of different ways. By formulating two kinds of different feedback systems, some existing and new results on the stability bounds are derived based on the feedback system approach. The new results complement the existing frequency-domain based stability criteria and make the frequency-domain technique more applicable and useful to the stability bound problem. An example is provided to show the new stability criterion is effective and useful in determining the stability bound.

58 citations

Journal ArticleDOI
TL;DR: New analysis on some fundamental properties of the Kalman filter based parameter estimation algorithms using an orthogonal decomposition approach based on the excited subspace and two kinds of directional tracking algorithms are proposed.

51 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Book
01 Jan 2001
TL;DR: This chapter discusses Decision-Theoretic Foundations, Game Theory, Rationality, and Intelligence, and the Decision-Analytic Approach to Games, which aims to clarify the role of rationality in decision-making.
Abstract: Preface 1. Decision-Theoretic Foundations 1.1 Game Theory, Rationality, and Intelligence 1.2 Basic Concepts of Decision Theory 1.3 Axioms 1.4 The Expected-Utility Maximization Theorem 1.5 Equivalent Representations 1.6 Bayesian Conditional-Probability Systems 1.7 Limitations of the Bayesian Model 1.8 Domination 1.9 Proofs of the Domination Theorems Exercises 2. Basic Models 2.1 Games in Extensive Form 2.2 Strategic Form and the Normal Representation 2.3 Equivalence of Strategic-Form Games 2.4 Reduced Normal Representations 2.5 Elimination of Dominated Strategies 2.6 Multiagent Representations 2.7 Common Knowledge 2.8 Bayesian Games 2.9 Modeling Games with Incomplete Information Exercises 3. Equilibria of Strategic-Form Games 3.1 Domination and Rationalizability 3.2 Nash Equilibrium 3.3 Computing Nash Equilibria 3.4 Significance of Nash Equilibria 3.5 The Focal-Point Effect 3.6 The Decision-Analytic Approach to Games 3.7 Evolution, Resistance, and Risk Dominance 3.8 Two-Person Zero-Sum Games 3.9 Bayesian Equilibria 3.10 Purification of Randomized Strategies in Equilibria 3.11 Auctions 3.12 Proof of Existence of Equilibrium 3.13 Infinite Strategy Sets Exercises 4. Sequential Equilibria of Extensive-Form Games 4.1 Mixed Strategies and Behavioral Strategies 4.2 Equilibria in Behavioral Strategies 4.3 Sequential Rationality at Information States with Positive Probability 4.4 Consistent Beliefs and Sequential Rationality at All Information States 4.5 Computing Sequential Equilibria 4.6 Subgame-Perfect Equilibria 4.7 Games with Perfect Information 4.8 Adding Chance Events with Small Probability 4.9 Forward Induction 4.10 Voting and Binary Agendas 4.11 Technical Proofs Exercises 5. Refinements of Equilibrium in Strategic Form 5.1 Introduction 5.2 Perfect Equilibria 5.3 Existence of Perfect and Sequential Equilibria 5.4 Proper Equilibria 5.5 Persistent Equilibria 5.6 Stable Sets of Equilibria 5.7 Generic Properties 5.8 Conclusions Exercises 6. 
Games with Communication 6.1 Contracts and Correlated Strategies 6.2 Correlated Equilibria 6.3 Bayesian Games with Communication 6.4 Bayesian Collective-Choice Problems and Bayesian Bargaining Problems 6.5 Trading Problems with Linear Utility 6.6 General Participation Constraints for Bayesian Games with Contracts 6.7 Sender-Receiver Games 6.8 Acceptable and Predominant Correlated Equilibria 6.9 Communication in Extensive-Form and Multistage Games Exercises Bibliographic Note 7. Repeated Games 7.1 The Repeated Prisoners' Dilemma 7.2 A General Model of Repeated Games 7.3 Stationary Equilibria of Repeated Games with Complete State Information and Discounting 7.4 Repeated Games with Standard Information: Examples 7.5 General Feasibility Theorems for Standard Repeated Games 7.6 Finitely Repeated Games and the Role of Initial Doubt 7.7 Imperfect Observability of Moves 7.8 Repeated Games in Large Decentralized Groups 7.9 Repeated Games with Incomplete Information 7.10 Continuous Time 7.11 Evolutionary Simulation of Repeated Games Exercises 8. Bargaining and Cooperation in Two-Person Games 8.1 Noncooperative Foundations of Cooperative Game Theory 8.2 Two-Person Bargaining Problems and the Nash Bargaining Solution 8.3 Interpersonal Comparisons of Weighted Utility 8.4 Transferable Utility 8.5 Rational Threats 8.6 Other Bargaining Solutions 8.7 An Alternating-Offer Bargaining Game 8.8 An Alternating-Offer Game with Incomplete Information 8.9 A Discrete Alternating-Offer Game 8.10 Renegotiation Exercises 9. Coalitions in Cooperative Games 9.1 Introduction to Coalitional Analysis 9.2 Characteristic Functions with Transferable Utility 9.3 The Core 9.4 The Shapley Value 9.5 Values with Cooperation Structures 9.6 Other Solution Concepts 9.7 Coalitional Games with Nontransferable Utility 9.8 Cores without Transferable Utility 9.9 Values without Transferable Utility Exercises Bibliographic Note 10. 
Cooperation under Uncertainty 10.1 Introduction 10.2 Concepts of Efficiency 10.3 An Example 10.4 Ex Post Inefficiency and Subsequent Offers 10.5 Computing Incentive-Efficient Mechanisms 10.6 Inscrutability and Durability 10.7 Mechanism Selection by an Informed Principal 10.8 Neutral Bargaining Solutions 10.9 Dynamic Matching Processes with Incomplete Information Exercises Bibliography Index

3,569 citations

Book ChapterDOI
01 Jan 1977
TL;DR: In the Hamadryas baboon, males are substantially larger than females, and a troop of baboons is subdivided into a number of ‘one-male groups’, consisting of one adult male and one or more females with their young.
Abstract: In the Hamadryas baboon, males are substantially larger than females. A troop of baboons is subdivided into a number of ‘one-male groups’, consisting of one adult male and one or more females with their young. The male prevents any of ‘his’ females from moving too far from him. Kummer (1971) performed the following experiment. Two males, A and B, previously unknown to each other, were placed in a large enclosure. Male A was free to move about the enclosure, but male B was shut in a small cage, from which he could observe A but not interfere. A female, unknown to both males, was then placed in the enclosure. Within 20 minutes male A had persuaded the female to accept his ownership. Male B was then released into the open enclosure. Instead of challenging male A, B avoided any contact, accepting A’s ownership.

2,364 citations

Journal Article
TL;DR: This book, written by two major figures in adaptive control, provides a wealth of material for researchers, practitioners, and students; control system designers can use it to enhance their work with many new theoretical developments, and mathematical control theory specialists can use it to adapt their research to practical needs.
Abstract: This book, written by two major figures in adaptive control, provides a wealth of material for researchers, practitioners, and students. While some researchers in adaptive control may note the absence of a particular topic, the book's scope represents a high-gain instrument. It can be used by designers of control systems to enhance their work through the information on many new theoretical developments, and can be used by mathematical control theory specialists to adapt their research to practical needs. The book is strongly recommended to anyone interested in adaptive control.

1,814 citations