"Reinforcement Learning for Several Environments: Theory and Applications"

Andreas Matt, Georg Regensburger
A joint PhD thesis by Andreas Matt and Georg Regensburger

Abstract: Until now, reinforcement learning has been applied to learn the optimal behavior for a single environment. The main idea of our approach is to extend reinforcement learning to learn a good policy for several environments simultaneously. We link the theory of Markov decision processes (MDPs) with notions and algorithms from reinforcement learning to model this idea and to develop solution methods. We focus not only on rigorous models and mathematical analysis, but also on applications ranging from exact computations to real-world problems. The development of software forms a major part of our work. All experiments can be reproduced following the descriptions given, allowing the reader to experience reinforcement learning in an interactive way.

Contributions in detail:

* Mathematical treatment of MDPs and reinforcement learning for stochastic policies.
* Characterization of equivalent policies and geometrical interpretation of improving policies and policy improvement using the theory of polytopes.
* State action spaces and realizations to apply one policy to several environments.
* Policy improvement and the notion of balanced policies for a family of realizations.
* Geometrical interpretation and computation of improving policies for a family of realizations.
* Policy iteration and approximate policy iteration for finite families of realizations.
* MDP package for exact and symbolic computations with MDPs and two realizations.
* Grid world simulator SimRobo for MDPs and several realizations.
* Program RealRobo for the mobile robot Khepera.

For more details visit our website http://mathematik.uibk.ac.at/users/rl
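To illustrate the central idea of improving one shared policy against several environments, here is a minimal sketch in Python. The function names (`policy_evaluation`, `greedy_improvement`) and the heuristic of summing action values over the family of environments are our own assumptions for illustration; they are not the thesis's definition of balanced policies or its improvement algorithms.

```python
import numpy as np

def policy_evaluation(P, R, policy, gamma=0.9, tol=1e-8):
    """Iteratively evaluate a stochastic policy on a single MDP.

    P: transition probabilities, shape (S, A, S)
    R: expected immediate rewards, shape (S, A)
    policy: stochastic policy, shape (S, A), rows sum to 1
    """
    V = np.zeros(P.shape[0])
    while True:
        # Bellman expectation backup under the given policy
        Q = R + gamma * (P @ V)            # action values, shape (S, A)
        V_new = (policy * Q).sum(axis=1)   # average over the policy's actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def greedy_improvement(envs, policy, gamma=0.9):
    """Improve one shared policy for a family of environments.

    As a simple (assumed) criterion, act greedily with respect to the
    action values summed over all environments in the family.
    envs: list of (P, R) pairs sharing the same state action space
    """
    S, A = policy.shape
    Q_total = np.zeros((S, A))
    for P, R in envs:
        V = policy_evaluation(P, R, policy, gamma)
        Q_total += R + gamma * (P @ V)
    improved = np.zeros_like(policy)
    improved[np.arange(S), Q_total.argmax(axis=1)] = 1.0
    return improved
```

Repeating evaluation and improvement until the policy stops changing gives a policy-iteration-style loop over the whole family, in the spirit of the thesis's treatment of finite families of realizations.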