## The Stability of General Discounted Reinforcement Learning with Linear Function Approximation

**Reynolds, Stuart**

*UKCI'02*

**Abstract**: This paper shows that general discounted-return-estimating
reinforcement learning algorithms cannot diverge to infinity when a
particular form of linear function approximator is used to approximate
the value-function or Q-function. The result is significant because
examples of value-function divergence exist in which similar linear
function approximators are trained with a similar incremental gradient
descent rule. Here, a different gradient descent error criterion yields
a training rule with a non-expansion property, and such a rule cannot
possibly diverge. This training rule turns out to be in common use for
reinforcement learning.
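One well-known family of linear function approximators whose updates are non-expansions is state aggregation, where each state maps to a group and all states in a group share one value parameter. The sketch below is a minimal illustration of this idea (not the paper's specific construction): TD(0) with state aggregation on a hypothetical deterministic chain MDP, where the shared-parameter update keeps the value estimates bounded.

```python
def td0_state_aggregation(num_states=6, num_groups=3, gamma=0.9,
                          alpha=0.1, episodes=200):
    """TD(0) with state aggregation on a deterministic chain MDP.

    State aggregation is a linear function approximator (each feature
    vector is a one-hot group indicator). Because each update moves a
    single shared parameter toward a bounded target, the update is a
    non-expansion and the value estimates cannot diverge to infinity.
    """
    # Map each state to a group; all states in a group share one parameter.
    group = [s * num_groups // num_states for s in range(num_states)]
    w = [0.0] * num_groups  # one value parameter per group

    for _ in range(episodes):
        s = 0
        while s < num_states - 1:  # last state is terminal
            s2 = s + 1
            # Reward 1 on entering the terminal state, 0 elsewhere.
            r = 1.0 if s2 == num_states - 1 else 0.0
            v_next = 0.0 if s2 == num_states - 1 else w[group[s2]]
            td_error = r + gamma * v_next - w[group[s]]
            w[group[s]] += alpha * td_error  # shared-parameter TD(0) update
            s = s2
    return w
```

With rewards bounded by 1 and discount `gamma = 0.9`, every parameter stays within `1 / (1 - gamma) = 10` in magnitude, consistent with the non-divergence claim.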