scispace - formally typeset
Search or ask a question
Author

Richard E. Korf

Bio: Richard E. Korf is an academic researcher from University of California, Los Angeles. The author has contributed to research in topics: Search algorithm & Beam search. The author has an hindex of 49, co-authored 133 publications receiving 9123 citations. Previous affiliations of Richard E. Korf include Columbia University & Carnegie Mellon University.


Papers
More filters
Journal ArticleDOI
TL;DR: This heuristic depth-first iterative-deepening algorithm is the only known algorithm that is capable of finding optimal solutions to randomly generated instances of the Fifteen Puzzle within practical resource limits.

1,698 citations

Journal ArticleDOI
TL;DR: A variation of minimax lookahead search, and an analog to alpha-beta pruning that significantly improves the efficiency of the algorithm, and a new algorithm, called Real-Time-A∗, for interleaving planning and execution, which proves that the algorithm makes locally optimal decisions and is guaranteed to find a solution.

989 citations

Journal ArticleDOI
TL;DR: It is presented that planning can be viewed as problem-solving search using subgoals, macro-operators, and abstraction as knowledge sources and an analysis of abstraction concludes that abstraction hierarchies can reduce exponential problems to linear complexity.

424 citations

Journal ArticleDOI
TL;DR: The macro technique is a new kind of weak method, a method for learning as opposed to problem solving, and introduces a new type of problem structure called operator decomposability.

327 citations

Journal ArticleDOI
TL;DR: This work presents a linear-space best-first search algorithm (RBFS) that always explores new nodes in best- first order, regardless of the cost function, and expands fewer nodes than iterative deepening with a nondecreasing cost function.

326 citations


Cited by
More filters
Book
01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations

Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly of highto low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated muddominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debrisflow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations

Journal ArticleDOI
TL;DR: This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior – and proves their convergence and optimality for special cases and relation to supervised-learning methods.
Abstract: This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems of the sort to which temporal-difference methods can be applied to advantage.

4,803 citations

Journal ArticleDOI
TL;DR: In this article, a Bayesian approach for learning Bayesian networks from a combination of prior knowledge and statistical data is presented, which is derived from a set of assumptions made previously as well as the assumption of likelihood equivalence, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence.
Abstract: We describe a Bayesian approach for learning Bayesian networks from a combination of prior knowledge and statistical data. First and foremost, we develop a methodology for assessing informative priors needed for learning. Our approach is derived from a set of assumptions made previously as well as the assumption of likelihood equivalence, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence. We show that likelihood equivalence when combined with previously made assumptions implies that the user's priors for network parameters can be encoded in a single Bayesian network for the next case to be seen—a prior network—and a single measure of confidence for that network. Second, using these priors, we show how to compute the relative posterior probabilities of network structures given data. Third, we describe search methods for identifying network structures with high posterior probabilities. We describe polynomial algorithms for finding the highest-scoring network structures in the special case where every node has at most k e 1 parent. For the general case (k > 1), which is NP-hard, we review heuristic search algorithms including local search, iterative local search, and simulated annealing. Finally, we describe a methodology for evaluating Bayesian-network learning algorithms, and apply this approach to a comparison of various approaches.

4,124 citations

Journal ArticleDOI
TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning frame- work in a natural and general way and may be used interchangeably with primitive actions in planning methods such as dynamic pro- gramming and in learning methodssuch as Q-learning.

3,233 citations