# Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding

##### Citations

37,989 citations

### Cites methods from "Generalization in Reinforcement Lea..."

...The two left panels are applications to simple continuous-state control tasks using the Sarsa(λ) algorithm and tile coding, with either replacing or accumulating traces (Sutton, 1996)....
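The difference between replacing and accumulating traces mentioned in this excerpt can be illustrated with a minimal sketch. It assumes binary features (as produced by tile coding) and a linear value function; the function name `update_traces` and the parameter values are hypothetical and are not taken from the cited work.

```python
import numpy as np

def update_traces(traces, active, gamma, lam, replacing=True):
    """Decay all eligibility traces, then mark the currently active features.

    traces : 1-D array, one trace per binary feature
    active : indices of the features active in the current state (no duplicates)
    """
    traces = gamma * lam * traces      # every trace decays by gamma * lambda
    if replacing:
        traces[active] = 1.0           # replacing trace: reset to 1
    else:
        traces[active] += 1.0          # accumulating trace: increment by 1
    return traces

# Revisit the same feature three times in a row (illustrative values only).
e_acc = np.zeros(4)
e_rep = np.zeros(4)
for _ in range(3):
    e_acc = update_traces(e_acc, [0], gamma=0.9, lam=0.5, replacing=False)
    e_rep = update_traces(e_rep, [0], gamma=0.9, lam=0.5, replacing=True)
```

With repeated visits, the accumulating trace grows beyond 1 while the replacing trace stays capped at 1, which is the only behavioral difference this sketch is meant to show.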

[...]

...Tile coding has been used in many reinforcement learning systems (e.g., Shewchuk and Dean, 1990; Lin and Kim, 1991; Miller, Scalera, and Kim, 1994; Sofge and White, 1992; Tham, 1994; Sutton, 1996; Watkins, 1989) as well as in other types of learning control systems (e....

[...]

6,895 citations

### Cites background from "Generalization in Reinforcement Lea..."

...Sutton (1996) shows how modified versions of Boyan and Moore's examples can converge successfully....

[...]

...…more efficient (Cichosz & Mulawka, 1995) and on changing the definition to make TD(λ) more consistent with the certainty-equivalent method (Singh & Sutton, 1996), which is discussed in Section 5.1. 4.2 Q-learning The work of the two components of AHC can be accomplished in a unified manner…...

[...]

5,970 citations


1,405 citations

### Additional excerpts

...Traditional reinforcement-learning algorithms for control, such as SARSA learning (Rummery and Niranjan, 1994; Sutton, 1996) and Q-learning (Watkins, 1989), lack any stability or convergence guarantees when combined with most forms of value-function approximation....

[...]

1,175 citations

### Cites background from "Generalization in Reinforcement Lea..."

...In a discrete-time SMDP [26] decisions can be made only at (positive) integer multiples of an underlying time step....

[...]

...Avoid the exhaustive sweeps of DP by restricting computation to states on, or in the neighborhood of, multiple sample trajectories, either real or simulated....

[...]

##### References

4,916 citations

### "Generalization in Reinforcement Lea..." refers methods in this paper

...Reinforcement learning is a broad class of optimal control methods based on estimating value functions from experience, simulation, or search (Barto, Bradtke & Singh, 1995; Sutton, 1988; Watkins, 1989)....

[...]

...CMACs have been widely used in conjunction with reinforcement learning systems (e.g., Watkins, 1989; Lin & Kim, 1991; Dean, Basye & Shewchuk, 1992; Tham, 1994). ... and Moore, we found robust good performance on all tasks....

[...]

...To apply the sarsa algorithm to tasks with a continuous state space, we combined it with a sparse, coarse-coded function approximator known as the CMAC (Albus, 1980; Miller, Gordon & Kraft, 1990; Watkins, 1989; Lin & Kim, 1991; Dean et al., 1992; Tham, 1994)....
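As a rough illustration of what a sparse, coarse-coded approximator of this kind computes, the sketch below maps a continuous state to one active tile per tiling and sums the corresponding weights. It assumes uniformly offset grid tilings with no hashing; the names `active_tiles` and `value`, the offsets, and the tiling sizes are hypothetical choices for illustration, not the configuration used in the paper.

```python
import numpy as np

def active_tiles(state, low, high, n_tilings=8, tiles_per_dim=8):
    """Return one active tile index per tiling for a continuous state."""
    state = np.asarray(state, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    # Rescale so that each tile has unit width along every dimension.
    scaled = (state - low) / (high - low) * tiles_per_dim
    side = tiles_per_dim + 1               # extra tile absorbs the offset
    tiles_per_tiling = side ** state.size
    indices = []
    for t in range(n_tilings):
        offset = t / n_tilings             # assumed: uniform diagonal offsets
        coords = np.clip(np.floor(scaled + offset).astype(int), 0, side - 1)
        flat = 0
        for c in coords:                   # row-major index within the tiling
            flat = flat * side + int(c)
        indices.append(t * tiles_per_tiling + flat)
    return indices

def value(weights, tiles):
    """Linear estimate: the sum of the weights of the active tiles."""
    return float(weights[tiles].sum())

# Two-dimensional unit square, 8 tilings of 9 x 9 tiles each.
weights = np.zeros(8 * 9 ** 2)
a = active_tiles([0.50, 0.50], [0, 0], [1, 1])
b = active_tiles([0.55, 0.50], [0, 0], [1, 1])
c = active_tiles([0.90, 0.90], [0, 0], [1, 1])
weights[a] += 0.1                          # one illustrative update at state a
```

Nearby states share some but not all of their active tiles, so an update at one state generalizes partially to its neighbors, while distant states share none.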

[...]

4,803 citations

3,736 citations

### "Generalization in Reinforcement Lea..." refers background in this paper

...The acrobot is a two-link under-actuated robot (Figure 5) roughly analogous to a gymnast swinging on a highbar (Dejong & Spong, 1994; Spong & Vidyasagar, 1989)....

[...]

3,240 citations

1,691 citations