Publications on Function Approximation (RL)



Abe, Naoki , Alan Biermann and Philip M. Long( nabe@us.ibm.com)
Reinforcement learning with immediate rewards and linear hypotheses
Algorithmica (Postscript - 350KB) Abstract:
We perform theoretical analysis of algorithms for reinforcement learning with immediate rewards usi...

Ackley, David , Michael L. Littman( ackley@cs.unm.edu)
Generalization and scaling in reinforcement learning
D. S. Touretzky, editor, Advances in Neural Information Processing Systems, volume 2, pages 550--557, San Mateo, CA, 1990. Morgan Kaufmann (Postscript - 140KB) Abstract:
In associative reinforcement learning, an environment generates input vectors, a learning system ge...

Baird, Leemon , H. Klopf( leemon@cs.cmu.edu)
Reinforcement Learning with high-dimensional continuous actions
Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base, 1993 (HTML) Abstract:
Many reinforcement learning systems, such as Q-learning, or advantage updating, require that a func...

Baird, Leemon ( leemon@cs.cmu.edu)
Residual Algorithms: Reinforcement Learning with Function Approximation
Armand Prieditis & Stuart Russell, eds. Machine Learning: Proceedings of the Twelfth International Conference, 9-12 July, Morgan Kaufmann Publishers, San Francisco, CA (HTML) Abstract:
A number of reinforcement learning algorithms have been developed that are guaranteed to converge t...

Bhulai, Sandjai ( sbhulai@cs.vu.nl)
Markov Decision Processes: the control of high-dimensional systems
Ph.D. Thesis, Vrije Universiteit, 2002 (Postscript - ) Abstract:
We develop algorithms for the computation of (nearly) optimal decision rules in high-dimensional sys...

Boyan, Justin , Andrew Moore( Justin.Boyan@cs.cmu.edu)
Generalization in Reinforcement Learning: Safely Approximating the Value Function
Proceedings of Neural Information Processing Systems 7, Morgan Kaufmann, January 1995 (8 pages) (compressed Postscript - 743 KB) Abstract:
A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic pro...

Boyan, Justin , Andrew W. Moore( jab@cs.cmu.edu)
Learning Evaluation Functions for Large Acyclic Domains
ICML-96 (Postscript - 147KB) Abstract:
Some of the most successful recent applications of reinforcement learning have used neural netw...

Carreras, Marc ( marcc@eia.udg.es)
A Proposal of a Behavior-based Control Architecture with Reinforcement Learning for an Autonomous Underwater Robot
(pdf - 4 MB) Abstract:
The achievement of a mission with an autonomous robot in an unknown and unstructured environment is ...

Coulom, Rémi ( Remi.Coulom@imag.fr)
Reinforcement Learning Using Neural Networks, with Applications to Motor Control
PhD thesis (html - 1Mb) Abstract:
This thesis is a study of practical methods to estimate value functions with feedforward neural netw...

Coulom, Rémi ( Remi.Coulom@free.fr)
Feedforward Neural Networks in Reinforcement Learning Applied to High-dimensional Motor Control
Proceedings of ALT2002 (pdf - 139 Kb) Abstract:
Local linear function approximators are often preferred to feedforward neural networks to estimate v...

Dietterich, Thomas ( tgd@cs.orst.edu)
State abstraction in MAXQ hierarchical reinforcement learning
unpublished ( gzipped Postscript - 102Kb) Abstract:
Many researchers have explored methods for hierarchical reinforcement learning (RL) with tempora...

Dimitrakakis, Christos ( olethros@geocities.com)
Reinforcement Learning With Continuous Action Values
unpublished ( gzipped Postscript - 120KB) Abstract:
The problem of reinforcement learning in the case of a continuous action set remains largely unsolv...

Ernst, Damien , Pierre Geurts and Louis Wehenkel( dernst@ulg.ac.be)
Iteratively extending time horizon reinforcement learning
Proceedings of ECML 2003 (Postscript - 6 KB) Abstract:
Reinforcement learning aims to determine an (infinite time horizon) optimal control policy fro...

Ernst, Damien , Pierre Geurts, Louis Wehenkel( ernst@montefiore.ulg.ac.be)
Tree-based batch mode reinforcement learning
Journal of Machine Learning Research, April 2005, Volume 6, pp 503-556 (Pdf - 1290 KB) Abstract:
Reinforcement learning aims to determine an optimal control policy from interaction with a system or...

Ernst, Damien , Pierre Geurts, Mevludin Glavic, Louis Wehenkel( ernst@montefiore.ulg.ac.be)
Approximate value iteration in the reinforcement learning context. Application to electrical power system control
International Journal of Emerging Electric Power Systems (.pdf - 780) Abstract:
In this paper we explain how to design intelligent agents able to process the information acquired f...

Ernst, Damien ( ernst@montefiore.ulg.ac.be)
Selecting concise sets of samples for a reinforcement learning agent
Conference Proceedings of CIRAS 2005 (pdf - 1036 KB) Abstract:
We derive an algorithm for selecting from the set of samples gathered by a reinforcement learning a...

Ernst, Damien , Raphael Marée, Louis Wehenkel( ernst@montefiore.ulg.ac.be)
Reinforcement learning with raw image pixels as state input
International Workshop on Intelligent Computing in Pattern Analysis/Synthesis (IWICPAS). Proceedings series: Lecture Notes in Computer Science, Volume 4153, pages 446-454, August 2006 Abstract:
We report in this paper some positive simulation results obtained when image pixels are directly u...

Fernandez, Fernando , Daniel Borrajo( ffernand@grial.uc3m.es)
Vector Quantization Applied to Reinforcement Learning
Proceedings of the Fifth Workshop on RoboCup. Stockholm, Sweden. August, 1999. IJCAI'99 (Postscript - 202 KB) Abstract:
Reinforcement learning has proven to be a set of successful techniques for finding optimal policies ...

Fernandez, Fernando , Daniel Borrajo( ffernand@inf.uc3m.es)
On Determinism Handling while Learning Reduced State Space Representations
European Conference on Artificial Intelligence (Postscript - ) Abstract:
When applying a Reinforcement Learning technique to problems with continuous or very large state spa...

Rivest, Francois , Doina Precup( rivestfr@iro.umontreal.ca)
Combining TD-learning with Cascade-correlation Networks
ICML 2003 Abstract:
Using neural networks to represent value functions in reinforcement learning algorithms often invo...

Ghory, Imran ( imran@bits.bris.ac.uk)
Reinforcement Learning in Board Games
Technical Report CSTR-04-004, Department of Computer Science, University of Bristol, May 2004. (pdf - 1097439 bytes) Abstract:
This project investigates the application of the TD(lambda) reinforcement learning algorithm and neu...

Littman, Michael , Anthony Cassandra and Leslie Kaelbling( mlittman@cs.duke.edu)
Learning policies for partially observable environments: Scaling up
Proceedings of the Twelfth International Conference on Machine Learning (Postscript - 315K) Abstract:
Partially observable Markov decision processes (POMDPs) model decision problems in which an agent t...

Matt, Andreas , Georg Regensburger( andreas.matt@uibk.ac.at)
Reinforcement Learning for Several Environments: Theory and Applications
A joint PhD thesis by Andreas Matt and Georg Regensburger Abstract:
Until now reinforcement learning has been applied to learn the optimal behavior for a single environ...

Munos, Remi ( munos@cs.cmu.edu)
Reinforcement Learning for Continuous Stochastic Control Problems
Neural Information Processing Systems, 1997 (Postscript - 809KB) Abstract:
This paper is concerned with the problem of Reinforcement Learning for continuous state space and ...

Munos, Remi ( munos@cs.cmu.edu)
A convergent Reinforcement Learning algorithm in the continuous case based on a Finite Difference method
IJCAI'1997 (compressed Postscript - 225Kb) Abstract:
In this paper, we propose a convergent Reinforcement Learning algorithm for solving optimal contr...

Munos, Remi ( munos@cs.cmu.edu)
A Convergent Reinforcement Learning algorithm in the continuous case: the Finite-Element Reinforcement Learning
International Conference on Machine Learning, 1996 (Postscript - 197Kb) Abstract:
This paper presents a direct reinforcement learning algorithm, called Finite-Element Reinforcem...

Munos, Remi ( munos@cs.cmu.edu)
A general convergence method for Reinforcement Learning in the continuous case
European Conference on Machine Learning, 1998 (compressed Postscript - 230Kb) Abstract:
In this paper, we propose a general method for designing convergent Reinforcement Learning algorit...

Munos, Remi ( munos@cs.cmu.edu)
Finite-Element methods with local triangulation refinement for continuous Reinforcement Learning problems
European Conference on Machine Learning, 1997 (compressed Postscript - 283Kb) Abstract:
This paper presents a reinforcement learning algorithm for generating an adaptive control for a ...

Munos, Remi , Andrew Moore( munos@cs.cmu.edu)
Variable resolution discretization for high-accuracy solutions of optimal control problems
IJCAI'99 ( gzipped Postscript - 315KB) Abstract:
State abstraction is of central importance in reinforcement learning and Markov Decision Processes. ...

Munos, Remi , Leemon Baird, Andrew Moore( munos@cs.cmu.edu)
Gradient Descent Approaches to Neural-Net-Based Solutions of the Hamilton-Jacobi-Bellman Equation
IJCNN'99 ( gzipped Postscript - 128KB) Abstract:
In this paper we investigate new approaches to dynamic-programming-based optimal control of contin...

Munos, Remi ( remi.munos@polytechnique.fr)
Error Bounds for Approximate Policy Iteration
ICML 2003 ( gzipped Postscript - 80 KB) Abstract:
In Dynamic Programming, convergence of algorithms such as Value Iteration or Policy Iteration resul...

Ormoneit, Dirk , Saunak Sen( ormoneit@stat.stanford.edu)
Kernel-Based Reinforcement Learning
Department of Statistics, Stanford University, Technical Report No. 1999-8 (Postscript - 260 KB) Abstract:
Kernel-based methods have recently attracted increased attention in the machine learning literature...

Reynolds, Stuart ( sir@cs.bham.ac.uk)
The Stability of General Discounted Reinforcement Learning with Linear Function Approximation
UKCI'02 ( gzipped Postscript - 80) Abstract:
This paper shows that general discounted return estimating reinforcement learning algorithms ca...

Reynolds, Stuart ( sir@cs.bham.ac.uk)
Decision Boundary Partitioning: Variable Resolution Model-Free Reinforcement Learning
ICML-2k ( gzipped Postscript - 241 KB) Abstract:
This paper presents a method to refine the resolution of a continuous state Q-function. Q-functions ...

Reynolds, Stuart ( sir@cs.bham.ac.uk)
Reinforcement Learning with Exploration
PhD Thesis, School of Computer Science, The University of Birmingham, B15 2TT, UK ( gzipped Postscript - 1.1MB) Abstract:
Reinforcement Learning (RL) techniques may be used to find optimal controllers for multistep decisio...

Rivest, Francois , Yoshua Bengio, John Kalaska( rivestfr@iro.umontreal.ca)
Brain Inspired Reinforcement Learning
NIPS 2004 (NIPS 17) Abstract:
Successful application of reinforcement learning algorithms often involves considerable hand-craftin...

Siebel, Nils T , Yohannes Kassahun( nils-NOSPAM@siebel-research.de)
Learning neural networks for visual servoing using evolutionary methods
Proceedings of the 6th International Conference on Hybrid Intelligent Systems (HIS'06), Auckland, New Zealand (PDF - 168 KB) Abstract:
In this article we introduce a method to learn neural networks that solve a visual servoing task. O...

Singh, Satinder
Reinforcement Learning With Soft State Aggregation
NIPS 7 ( gzipped Postscript - ) Abstract:
It is widely accepted that the use of more compact representations than lookup tables is crucial to scaling...

Strens, Malcolm ( mjstrens@qinetiq.com)
Learning Multi-Agent Search Strategies
The Interdisciplinary Journal of Artificial Intelligence and the Simulation of Behaviour, 1(4), 2003. (pdf - 305KB) Abstract:
We identify a specialised class of reinforcement learning problem in which the agent(s) have the goa...

Sutton, Rich ( rich@cs.umass.edu)
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
Advances in Neural Information Processing Systems 8, pp. 1038-1044, MIT Press (compressed Postscript - 230 KB) Abstract:
On large problems, reinforcement learning systems must use parameterized function approximat...

Tadepalli, Prasad , DoKyeong Ok( tadepall@cs.orst.edu)
Scaling up average reward reinforcement learning by approximating the domain models and the value function
Proceedings of the Thirteenth International Conference on Machine Learning, pages 471-479. Morgan Kaufmann, 1996 (Postscript - ) Abstract:
Almost all the work in Average-reward Reinforcement Learning (ARL) so far has focused on table-base...

Tadepalli, Prasad , DoKyeong Ok( tadepall@cs.orst.edu)
Model-based Average Reward Reinforcement Learning
Artificial Intelligence (Postscript - 53 pages) Abstract:
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving r...

Tsitsiklis, John , Benjamin Van Roy( jnt@mit.edu)
Feature-Based Methods for Large Scale Dynamic Programming
Machine Learning, Vol. 22, 1996, pp. 59-94. ( PDF - 2.8 MB) Abstract:
We develop a methodological framework and present a few different ways in which dynamic programmin...

Van Roy, Benjamin ( bvr@stanford.edu)
Learning and Value Function Approximation in Complex Decision Processes
PhD Thesis (Postscript - 1691 KB) Abstract:
In principle, a wide variety of sequential decision problems -- ranging from dynamic resource alloc...

Wilson, Stewart ( wilson@smith.rowland.org)
Generalization in the XCS classifier system
Genetic Programming 1998: Proceedings of the Third Annual Conference. San Francisco, CA: Morgan Kaufmann. (gzipped Postscript - 61 KB) Abstract:
This paper studies two changes to XCS, a classifier system in which fitness is based on prediction...

Xu, Xin , Han-gen He and Dewen Hu( xuxin_mail@263.net)
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
Journal of Artificial Intelligence Research, Vol. 16, 2002, pp. 259-292 ( gzipped Postscript - 700) Abstract:
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptiv...

Yin, ChangMing ( cmyin@cs167.net)
Forgetting Algorithm for Q-learning
unpublished (Microsoft Word - 120) Abstract:
...