Author

P. S. Sastry

Bio: P. S. Sastry is an academic researcher from the Indian Institute of Science. He has contributed to research in topics including learning automata and automata theory, has an h-index of 33, and has co-authored 93 publications receiving 4,900 citations.


Papers
Proceedings Article
27 Dec 2017
TL;DR: This paper provides some sufficient conditions on a loss function so that risk minimization under that loss function would be inherently tolerant to label noise for multiclass classification problems, and generalizes the existing results on noise-tolerant loss functions for binary classification.
Abstract: In many applications of classifier learning, training data suffers from label noise. Deep networks are learned using huge training data where the problem of noisy labels is particularly relevant. The current techniques proposed for learning deep networks under label noise focus on modifying the network architecture and on algorithms for estimating true labels from noisy labels. An alternate approach would be to look for loss functions that are inherently noise-tolerant. For binary classification there exist theoretical results on loss functions that are robust to label noise. In this paper, we provide some sufficient conditions on a loss function so that risk minimization under that loss function would be inherently tolerant to label noise for multiclass classification problems. These results generalize the existing results on noise-tolerant loss functions for binary classification. We study some of the widely used loss functions in deep networks and show that the loss function based on mean absolute value of error is inherently robust to label noise. Thus standard back propagation is enough to learn the true classifier even under label noise. Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks.
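
The key condition in the paper is symmetry of the loss: if the losses over all K classes sum to a constant for every classifier output, risk minimization under that loss is tolerant to symmetric label noise. Below is a minimal NumPy sketch (an illustration of the condition, not code from the paper) checking this for the mean-absolute-error loss on softmax outputs, where the per-class losses are 2 - 2p_k and always sum to 2K - 2; cross entropy is included for contrast:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mae_loss(logits, labels, num_classes):
    """Mean absolute error between softmax output and one-hot label.

    For class k the loss is |1 - p_k| + sum_{j != k} p_j = 2 - 2*p_k,
    so the losses over all classes sum to 2*K - 2 regardless of the
    network output: the symmetry condition for noise tolerance.
    """
    p = softmax(logits)
    one_hot = np.eye(num_classes)[labels]
    return np.abs(one_hot - p).sum(axis=-1).mean()

def cross_entropy_loss(logits, labels):
    """Cross entropy, for contrast: -log p_k is unbounded, and the
    per-class losses do not sum to a constant."""
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

# Numerically check the symmetry condition for a random score vector.
logits = np.random.randn(1, 5)
total = sum(mae_loss(logits, np.array([k]), 5) for k in range(5))
print(total)  # ~= 2*5 - 2 = 8.0, independent of the logits
```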

627 citations

Journal ArticleDOI
01 Dec 2002
TL;DR: This survey brings together the main ideas of learning automata in a unified framework and provides pointers to relevant references.
Abstract: Automata models of learning systems introduced in the 1960s were popularized as learning automata (LA) in a survey paper by Narendra and Thathachar (1974). Since then, there have been many fundamental advances in the theory as well as applications of these learning models. In the past few years, the structure of LA has been modified in several directions to suit different applications. Concepts such as parameterized learning automata (PLA), generalized learning automata (GLA), and continuous action-set learning automata (CALA) have been proposed, analyzed, and applied to solve many significant learning problems. Furthermore, groups of LA forming teams and feedforward networks have been shown to converge to desired solutions under appropriate learning algorithms. Modules of LA have been used for parallel operation with a consequent increase in speed of convergence. All of these concepts and results are relatively new and are scattered across the technical literature. An attempt has been made in this paper to bring together the main ideas involved in a unified framework and to provide pointers to relevant references.
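
As a concrete example of the classical finite-action model covered by the survey, the sketch below implements the linear reward-inaction (L_R-I) update for a finite action learning automaton interacting with a stationary random environment. The environment, step count, and learning rate are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_reward_inaction(success_probs, lam=0.05, steps=5000):
    """Finite-action learning automaton with the L_R-I update.

    The automaton keeps a probability vector over actions, samples an
    action, and on a favourable response moves probability mass toward
    the chosen action; on an unfavourable response it does nothing
    (hence reward-inaction). `success_probs` plays the role of the
    random environment.
    """
    n = len(success_probs)
    p = np.full(n, 1.0 / n)          # start from the uniform action distribution
    for _ in range(steps):
        a = rng.choice(n, p=p)       # sample an action
        reward = rng.random() < success_probs[a]
        if reward:                   # update only on reward
            p = (1.0 - lam) * p      # shrink all action probabilities...
            p[a] += lam              # ...and move the freed mass to action a
    return p

# Environment where action 2 is best; p should concentrate near it.
print(linear_reward_inaction([0.2, 0.5, 0.8]))
```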

379 citations

Journal ArticleDOI
TL;DR: It is argued that for satisfactory modeling of dynamical systems, neural networks should be endowed with internal memory, which allows them to identify systems whose order or delay is unknown.
Abstract: This paper discusses memory neuron networks as models for identification and adaptive control of nonlinear dynamical systems. These are a class of recurrent networks obtained by adding trainable temporal elements to feedforward networks, which makes the output history-sensitive. By virtue of this capability, these networks can identify dynamical systems without having to be explicitly fed with past inputs and outputs. Thus, they can identify systems whose order is unknown or systems with unknown delay. It is argued that for satisfactory modeling of dynamical systems, neural networks should be endowed with such internal memory. The paper presents a preliminary analysis of the learning algorithm, providing theoretical justification for the identification method. Methods for adaptive control of nonlinear systems using these networks are presented. Through extensive simulations, these models are shown to be effective both for identification and for model reference adaptive control of nonlinear systems.
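
A minimal sketch of the kind of trainable temporal element involved, assuming a common exponential-memory recursion v(t) = alpha*x(t) + (1 - alpha)*v(t-1); the exact indexing in the paper may differ, and alpha is a trainable parameter there but fixed here:

```python
import numpy as np

def memory_unit(x_seq, alpha):
    """Exponential memory element of the kind used in memory neuron
    networks: each network neuron feeds a memory neuron whose state is
    a convex combination of the neuron's current output and the
    memory's own past state, making the unit history-sensitive.

        v(t) = alpha * x(t) + (1 - alpha) * v(t-1)
    """
    v = 0.0
    out = []
    for x in x_seq:
        v = alpha * x + (1.0 - alpha) * v
        out.append(v)
    return np.array(out)

# The memory state summarizes the input history, so a feedforward map
# of (x(t), v(t)) can model a dynamical system without being explicitly
# fed past inputs and outputs.
print(memory_unit([1.0, 0.0, 0.0, 1.0], alpha=0.3))
```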

355 citations

Journal ArticleDOI
TL;DR: An overview of techniques of temporal data mining is presented, concentrating mainly on algorithms for pattern discovery in sequential data streams; some recent results on the statistical analysis of pattern discovery methods are also described.
Abstract: Data mining is concerned with analysing large volumes of (often unstructured) data to automatically discover interesting regularities or relationships which in turn lead to better understanding of the underlying processes. The field of temporal data mining is concerned with such analysis in the case of ordered data streams with temporal interdependencies. Over the last decade many interesting techniques of temporal data mining were proposed and shown to be useful in many applications. Since temporal data mining brings together techniques from different fields such as statistics, machine learning and databases, the literature is scattered among many different sources. In this article, we present an overview of techniques of temporal data mining. We mainly concentrate on algorithms for pattern discovery in sequential data streams. We also describe some recent results regarding statistical analysis of pattern discovery methods.
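
As one concrete example of the pattern discovery algorithms such surveys cover, the sketch below counts non-overlapped occurrences of a serial episode in an event stream, a frequency measure used in the frequent episode discovery literature; the toy data and function name are illustrative assumptions:

```python
def count_nonoverlapped(sequence, episode):
    """Count non-overlapped occurrences of a serial episode (an ordered
    tuple of event types) in an event sequence.

    A new occurrence may only begin after the previous one completes,
    which keeps the count well defined and efficiently computable in a
    single left-to-right scan.
    """
    count, i = 0, 0
    for event in sequence:
        if event == episode[i]:
            i += 1
            if i == len(episode):   # a full occurrence just completed
                count += 1
                i = 0               # start looking for the next one
    return count

# The serial episode (A, B, C) occurs twice without overlap here.
print(count_nonoverlapped("AXBBYCAABC", ("A", "B", "C")))  # -> 2
```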

346 citations

Book
31 Oct 2003
TL;DR: This book develops learning automata models (FALA, CALA, GLA, and PLA), their games, feedforward networks, and parallel operation, with applications in pattern classification and decision tree classifiers.
Abstract: 1 Introduction.- 1.1 Machine Intelligence and Learning.- 1.2 Learning Automata.- 1.3 The Finite Action Learning Automaton (FALA).- 1.3.1 The Automaton.- 1.3.2 The Random Environment.- 1.3.3 Operation of FALA.- 1.4 Some Classical Learning Algorithms.- 1.4.1 Linear Reward-Inaction (L_R-I) Algorithm.- 1.4.2 Other Linear Algorithms.- 1.4.3 Estimator Algorithms.- 1.4.4 Simulation Results.- 1.5 The Discretized Probability FALA.- 1.5.1 DL_R-I Algorithm.- 1.5.2 Discretized Pursuit Algorithm.- 1.6 The Continuous Action Learning Automaton (CALA).- 1.6.1 Analysis of the Algorithm.- 1.6.2 Simulation Results.- 1.6.3 Another Continuous Action Automaton.- 1.7 The Generalized Learning Automaton (GLA).- 1.7.1 Learning Algorithm.- 1.7.2 An Example.- 1.8 The Parameterized Learning Automaton (PLA).- 1.8.1 Learning Algorithm.- 1.9 Multiautomata Systems.- 1.10 Supplementary Remarks.- 2 Games of Learning Automata.- 2.1 Introduction.- 2.2 A Multiple Payoff Stochastic Game of Automata.- 2.2.1 The Learning Algorithm.- 2.3 Analysis of the Automata Game Algorithm.- 2.3.1 Analysis of the Approximating ODE.- 2.4 Game with Common Payoff.- 2.5 Games of FALA.- 2.5.1 Common Payoff Games of FALA.- 2.5.2 Pursuit Algorithm for a Team of FALA.- 2.5.3 Other Types of Games.- 2.6 Common Payoff Games of CALA.- 2.6.1 Stochastic Approximation Algorithms and CALA.- 2.7 Applications.- 2.7.1 System Identification.- 2.7.2 Learning Conjunctive Concepts.- 2.8 Discussion.- 2.9 Supplementary Remarks.- 3 Feedforward Networks.- 3.1 Introduction.- 3.2 Networks of FALA.- 3.3 The Learning Model.- 3.3.1 G-Environment.- 3.3.2 The Network.- 3.3.3 Network Operation.- 3.4 The Learning Algorithm.- 3.5 Analysis.- 3.6 Extensions.- 3.6.1 Other Network Structures.- 3.6.2 Other Learning Algorithms.- 3.7 Convergence to the Global Maximum.- 3.7.1 The Network.- 3.7.2 The Global Learning Algorithm.- 3.7.3 Analysis of the Global Algorithm.- 3.8 Networks of GLA.- 3.9 Discussion.- 3.10 Supplementary Remarks.- 4 Learning Automata for Pattern Classification.- 4.1 Introduction.- 4.2 Pattern Recognition.- 4.3 Common Payoff Game of Automata for PR.- 4.3.1 Pattern Classification with FALA.- 4.3.2 Pattern Classification with CALA.- 4.3.3 Simulations.- 4.4 Automata Network for Pattern Recognition.- 4.4.1 Simulations.- 4.4.2 Network of Automata for Learning Global Maximum.- 4.5 Decision Tree Classifiers.- 4.5.1 Learning Decision Trees using GLA and CALA.- 4.5.2 Learning Piece-wise Linear Functions.- 4.6 Discussion.- 4.7 Supplementary Remarks.- 5 Parallel Operation of Learning Automata.- 5.1 Introduction.- 5.2 Parallel Operation of FALA.- 5.2.1 Analysis.- 5.2.2 ε-optimality.- 5.2.3 Speed of Convergence and Module Size.- 5.2.4 Simulation Studies.- 5.3 Parallel Operation of CALA.- 5.4 Parallel Pursuit Algorithm.- 5.4.1 Simulation Studies.- 5.5 General Procedure.- 5.6 Parallel Operation of Games of FALA.- 5.6.1 Analysis.- 5.6.2 Common Payoff Game.- 5.7 Parallel Operation of Networks of FALA.- 5.7.1 Analysis.- 5.7.2 Modules of Parameterized Learning Automata (PLA).- 5.7.3 Modules of Generalized Learning Automata (GLA).- 5.7.4 Pattern Classification Example.- 5.8 Discussion.- 5.9 Supplementary Remarks.- 6 Some Recent Applications.- 6.1 Introduction.- 6.2 Supervised Learning of Perceptual Organization in Computer Vision.- 6.3 Distributed Control of Broadcast Communication Networks.- 6.4 Other Applications.- 6.5 Discussion.- Epilogue.- Appendices.- A The ODE Approach to Analysis of Learning Algorithms.- A.1 Introduction.- A.2 Derivation of the ODE Approximation.- A.2.1 Assumptions.- A.2.2 Analysis.- A.3 Approximating ODEs for Some Automata Algorithms.- A.3.2 The CALA Algorithm.- A.3.3 Automata Team Algorithms.- A.4 Relaxing the Assumptions.- B Proofs of Convergence for Pursuit Algorithm.- B.1 Proof of Theorem 1.1.- B.2 Proof of Theorem 5.7.- C Weak Convergence and SDE Approximations.- C.1 Introduction.- C.2 Weak Convergence.- C.3 Convergence to SDE.- C.3.1 Application to Global Algorithms.- C.4 Convergence to ODE.- References

341 citations


Cited by
Book
01 Jan 1998
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.
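
For a flavor of the solution methods covered in Part II, here is a minimal tabular Q-learning sketch on a toy chain MDP; the environment, hyperparameters, and function name are illustrative assumptions, not code from the book:

```python
import numpy as np

rng = np.random.default_rng(0)

def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a chain MDP: states 0..n-1, actions
    left (0) and right (1), reward 1 for reaching the right end.
    Illustrates the temporal-difference update at the core of the
    methods in Part II."""
    Q = np.zeros((n_states, 2))
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action choice, with random tie-breaking
            if rng.random() < eps or Q[s, 0] == Q[s, 1]:
                a = int(rng.integers(2))
            else:
                a = int(Q[s].argmax())
            s2 = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # TD update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

print(q_learning())  # right-action values should dominate in each state
```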

37,989 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: This textbook covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, latent variable models, sequential data, and combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units; these algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement, in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, without explicitly computing gradient estimates.
Abstract: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms, called REINFORCE algorithms, are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates or even storing information from which such estimates could be computed. Specific examples of such algorithms are presented, some of which bear a close relationship to certain existing algorithms while others are novel but potentially interesting in their own right. Also given are results that show how such algorithms can be naturally integrated with backpropagation. We close with a brief discussion of a number of additional issues surrounding the use of such algorithms, including what is known about their limiting behaviors as well as further considerations that might be used to help develop similar but potentially more powerful reinforcement learning algorithms.
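
A minimal sketch of the REINFORCE update for a single softmax stochastic unit on a bandit task (an illustration under assumed parameter values and environment, not code from the article): the parameters move by lr * r * grad log pi(a), whose expectation lies along the gradient of expected reinforcement.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

def reinforce_bandit(reward_probs, lr=0.1, steps=3000):
    """REINFORCE with one softmax stochastic unit on a bandit task.

    Update: theta += lr * r * d/dtheta log pi(a). For a softmax unit
    the score d/dtheta log pi(a) is (one_hot(a) - pi), so no explicit
    gradient estimate of the expected reward is ever formed.
    """
    theta = np.zeros(len(reward_probs))
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(len(theta), p=pi)
        r = float(rng.random() < reward_probs[a])   # stochastic 0/1 reward
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0                       # one_hot(a) - pi
        theta += lr * r * grad_log_pi               # REINFORCE update
    return softmax(theta)

print(reinforce_bandit([0.2, 0.8]))  # probability mass shifts to arm 1
```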

7,930 citations

Journal ArticleDOI
TL;DR: Billingsley's classic monograph on weak convergence of probability measures on metric spaces, a standard reference for the theory underlying functional limit theorems.
Abstract: Convergence of Probability Measures. By P. Billingsley. Chichester, Sussex, Wiley, 1968. xii, 253 p. 9 1/4". 117s.

5,689 citations