[Mike Rosenstein]


Supervised actor-critic reinforcement learning
M.T. Rosenstein and A.G. Barto. Supervised actor-critic reinforcement learning. In J. Si, A. Barto, W. Powell, and D. Wunsch, eds., Learning and Approximate Dynamic Programming: Scaling Up to the Real World. John Wiley & Sons, Inc., New York, 359-380, 2004.
Editor's Summary: Chapter 7 introduced policy gradients as a way to improve on stochastic search of the policy space when learning. This chapter presents supervised actor-critic reinforcement learning as another method for improving the effectiveness of learning. With this approach, a supervisor adds structure to a learning problem and supervised learning makes that structure part of an actor-critic framework for reinforcement learning. Theoretical background and a detailed algorithm description are provided, along with several examples that contain enough detail to make them easy to understand and possible to duplicate. These examples also illustrate the use of two kinds of supervisors: a feedback controller that is easily designed yet sub-optimal, and a human operator providing intermittent control of a simulated robotic arm.
Download: pdf (623KB), ps.gz (894KB)
See also:
  Combining reinforcement learning with a local control algorithm

updated 27-Aug-2004
mtr@cs.umass.edu