Reinforcement Learning Repository at MSU

Topics: Applications to Theoretical Analysis

A key question about reinforcement learning methods is whether they converge, in the limit, to the optimal value function. Several of the most popular discounted algorithms, such as Q-learning and TD(lambda), have convergence proofs under the assumption that the value function is represented as a lookup table. Theoretical analysis of these methods in the general case, where a nonlinear function approximator is used, remains an open problem, as does the analysis of undiscounted methods.
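To make the tabular, discounted setting concrete, here is a minimal Python sketch of Q-learning on a toy problem. The two-state MDP, the discount factor of 0.5, the uniformly random exploration, and the step-size schedule are all illustrative assumptions and are not taken from this page. The sketch compares the learned table with the optimal values obtained by value iteration. The toy problem is deterministic, so it only illustrates the kind of result the convergence proofs describe and is not evidence for them.

```python
import random

# Hypothetical two-state, two-action deterministic MDP (an assumption for
# illustration only). Indexing is [state][action].
NEXT_STATE = [[0, 1], [0, 1]]
REWARD = [[0.0, 1.0], [0.0, 2.0]]
GAMMA = 0.5  # discount factor, chosen arbitrarily for this sketch


def value_iteration(sweeps=200):
    """Compute the optimal action values Q* by repeated Bellman backups."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        q = [[REWARD[s][a] + GAMMA * max(q[NEXT_STATE[s][a]])
              for a in range(2)]
             for s in range(2)]
    return q


def q_learning(steps=20000, seed=0):
    """Tabular Q-learning with uniformly random action selection."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    visits = [[0, 0], [0, 0]]
    s = 0
    for _ in range(steps):
        a = rng.randrange(2)  # every state-action pair keeps being visited
        s_next = NEXT_STATE[s][a]
        visits[s][a] += 1
        # Decaying step size n^(-0.7): the step sizes sum to infinity
        # while their squares have a finite sum.
        alpha = visits[s][a] ** -0.7
        target = REWARD[s][a] + GAMMA * max(q[s_next])
        q[s][a] += alpha * (target - q[s][a])
        s = s_next
    return q


q_star = value_iteration()
q_hat = q_learning()
max_err = max(abs(q_hat[s][a] - q_star[s][a])
              for s in range(2) for a in range(2))
print("Q* =", q_star)
print("learned Q =", q_hat)
print("max abs error =", max_err)
```

The table `q` holds one independent entry per state-action pair, which is the lookup-table representation that the convergence proofs assume. If the table were replaced by a nonlinear function approximator, an update to one entry would also change the estimates for other pairs, and the guarantees would no longer apply directly.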