Publications on Function Approximation (RL)
Abe, Naoki, Alan Biermann and Philip M. Long (nabe@us.ibm.com)
Reinforcement learning with immediate rewards and linear hypotheses
Algorithmica
(Postscript - 350KB)
Abstract:
We perform theoretical analysis of algorithms for reinforcement
learning with immediate rewards usi...
Ackley, David, Michael L. Littman (ackley@cs.unm.edu)
Generalization and scaling in reinforcement learning
D. S. Touretzky, editor, Advances in Neural Information Processing Systems, volume 2, pages 550--557, San Mateo, CA, 1990. Morgan Kaufmann
(Postscript - 140KB)
Abstract:
In associative reinforcement learning, an environment generates input
vectors, a learning system ge...
Baird, Leemon, H. Klopf (leemon@cs.cmu.edu)
Reinforcement Learning with high-dimensional continuous actions
Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base, 1993
(HTML)
Abstract:
Many reinforcement learning systems, such as Q-learning, or advantage
updating, require that a func...
Baird, Leemon (leemon@cs.cmu.edu)
Residual Algorithms: Reinforcement Learning with Function Approximation
Armand Prieditis & Stuart Russell, eds. Machine Learning: Proceedings of the Twelfth International Conference, 9-12 July, Morgan Kaufmann Publishers, San Francisco, CA
(HTML)
Abstract:
A number of reinforcement learning algorithms have been developed that
are guaranteed to converge t...
Bhulai, Sandjai (sbhulai@cs.vu.nl)
Markov Decision Processes: the control of high-dimensional systems
Ph.D. Thesis, Vrije Universiteit, 2002
(Postscript - )
Abstract:
We develop algorithms for the computation of (nearly) optimal decision rules in high-dimensional sys...
Boyan, Justin, Andrew Moore (Justin.Boyan@cs.cmu.edu)
Generalization in Reinforcement Learning: Safely Approximating the Value Function
Proceedings of Neural Information Processing Systems 7, Morgan Kaufmann, January 1995 (8 pages)
(compressed Postscript - 743 KB)
Abstract:
A straightforward approach to the curse of dimensionality in
reinforcement learning and dynamic pro...
Boyan, Justin, Andrew W. Moore (jab@cs.cmu.edu)
Learning Evaluation Functions for Large Acyclic Domains
ICML-96
(Postscript - 147KB)
Abstract:
Some of the most successful recent applications of reinforcement
learning have used neural netw...
Carreras, Marc (marcc@eia.udg.es)
A Proposal of a Behavior-based Control Architecture with Reinforcement Learning for an Autonomous Underwater Robot
(pdf - 4 MB)
Abstract:
The achievement of a mission with an autonomous robot in an unknown and unstructured environment is ...
Coulom, Rémi (Remi.Coulom@imag.fr)
Reinforcement Learning Using Neural Networks, with Applications to Motor Control
PhD thesis
(html - 1Mb)
Abstract:
This thesis is a study of practical methods to estimate value functions with feedforward neural netw...
Coulom, Rémi (Remi.Coulom@free.fr)
Feedforward Neural Networks in Reinforcement Learning Applied to High-dimensional Motor Control
Proceedings of ALT2002
(pdf - 139 Kb)
Abstract:
Local linear function approximators are often preferred to feedforward neural networks to estimate v...
Dietterich, Thomas (tgd@cs.orst.edu)
State abstraction in MAXQ hierarchical reinforcement learning
unpublished
( gzipped Postscript - 102Kb)
Abstract:
Many researchers have explored methods for hierarchical reinforcement
learning (RL) with tempora...
Dimitrakakis, Christos (olethros@geocities.com)
Reinforcement Learning With Continuous Action Values
unpublished
( gzipped Postscript - 120KB)
Abstract:
The problem of reinforcement learning in the case of a continuous action set
remains largely unsolv...
Ernst, Damien, Pierre Geurts and Louis Wehenkel (dernst@ulg.ac.be)
Iteratively extending time horizon reinforcement learning
Proceedings of ECML 2003
(Postscript - 6 KB)
Abstract:
Reinforcement learning aims to determine an (infinite time horizon)
optimal control policy fro...
Ernst, Damien, Pierre Geurts, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Tree-based batch mode reinforcement learning
Journal of Machine Learning Research, April 2005, Volume 6, pp 503-556
(Pdf - 1290 KB)
Abstract:
Reinforcement learning aims to determine an optimal control policy from interaction with a system or...
Ernst, Damien, Pierre Geurts, Mevludin Glavic, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Approximate value iteration in the reinforcement learning context. Application to electrical power system control
International Journal of Emerging Electric Power Systems
(.pdf - 780)
Abstract:
In this paper we explain how to design intelligent agents able to process the information acquired f...
Ernst, Damien (ernst@montefiore.ulg.ac.be)
Selecting concise sets of samples for a reinforcement learning agent
Conference Proceedings of CIRAS 2005
(pdf - 1036 KB)
Abstract:
We derive an algorithm for selecting from the set of samples gathered by a reinforcement learning a...
Ernst, Damien, Raphael Marée, Louis Wehenkel (ernst@montefiore.ulg.ac.be)
Reinforcement learning with raw image pixels as state input
International Workshop on Intelligent Computing in Pattern Analysis/Synthesis (IWICPAS). Proceedings series: Lecture Notes in Computer Science, Volume 4153, pages 446-454, August 2006
Abstract:
We report in this paper some positive simulation results obtained when
image pixels are directly u...
Fernandez, Fernando, Daniel Borrajo (ffernand@grial.uc3m.es)
Vector Quantization Applied to Reinforcement Learning
Proceedings of the Fifth Workshop on RoboCup. Stockholm, Sweden. August, 1999. IJCAI'99
(Postscript - 202 KB)
Abstract:
Reinforcement learning has proven to be a set of successful techniques for finding optimal policies ...
Fernandez, Fernando, Daniel Borrajo (ffernand@inf.uc3m.es)
On Determinism Handling while Learning Reduced State Space Representations
European Conference on Artificial Intelligence
(Postscript - )
Abstract:
When applying a Reinforcement Learning technique to problems with continuous or very large state spa...
Rivest, Francois, Doina Precup (rivestfr@iro.umontreal.ca)
Combining TD-learning with Cascade-correlation Networks
ICML 2003
Abstract:
Using neural networks to represent value
functions in reinforcement learning algorithms
often invo...
Ghory, Imran (imran@bits.bris.ac.uk)
Reinforcement Learning in Board Games
Technical Report CSTR-04-004, Department of Computer Science, University of Bristol, May 2004.
(pdf - 1097439 bytes)
Abstract:
This project investigates the application of the TD(lambda) reinforcement learning algorithm and neu...
Littman, Michael, Anthony Cassandra and Leslie Kaelbling (mlittman@cs.duke.edu)
Learning policies for partially observable environments: Scaling up
Proceedings of the Twelfth International Conference on Machine Learning
(Postscript - 315K)
Abstract:
Partially observable Markov decision processes (POMDPs) model decision
problems in which an agent t...
Matt, Andreas, Georg Regensburger (andreas.matt@uibk.ac.at)
"Reinforcement Learning for Several Environments: Theory and Applications"
A joint PhD thesis by Andreas Matt and Georg Regensburger
Abstract:
Until now reinforcement learning has been applied to learn the optimal behavior for a single environ...
Munos, Remi (munos@cs.cmu.edu)
Reinforcement Learning for Continuous Stochastic Control Problems
Neural Information Processing Systems, 1997
(Postscript - 809KB)
Abstract:
This paper is concerned with the problem of Reinforcement Learning for continuous state
space and ...
Munos, Remi (munos@cs.cmu.edu)
A convergent Reinforcement Learning algorithm in the continuous case based on a Finite Difference method
IJCAI'1997
(compressed Postscript - 225Kb)
Abstract:
In this paper, we propose a convergent Reinforcement Learning algorithm for solving optimal
contr...
Munos, Remi (munos@cs.cmu.edu)
A Convergent Reinforcement Learning algorithm in the continuous case: the Finite-Element Reinforcement Learning
International Conference on Machine Learning, 1996
(Postscript - 197Kb)
Abstract:
This paper presents a direct reinforcement learning algorithm, called Finite-Element
Reinforcem...
Munos, Remi (munos@cs.cmu.edu)
A general convergence method for Reinforcement Learning in the continuous case
European Conference on Machine Learning, 1998
(compressed Postscript - 230Kb)
Abstract:
In this paper, we propose a general method for designing convergent Reinforcement Learning
algorit...
Munos, Remi (munos@cs.cmu.edu)
Finite-Element methods with local triangulation refinement for continuous Reinforcement Learning problems
European Conference on Machine Learning, 1997
(compressed Postscript - 283Kb)
Abstract:
This paper presents a reinforcement learning algorithm for generating an adaptive control for a
...
Munos, Remi, Andrew Moore (munos@cs.cmu.edu)
Variable resolution discretization for high-accuracy solutions of optimal control problems
IJCAI'99
( gzipped Postscript - 315KB)
Abstract:
State abstraction is of central importance in reinforcement learning and Markov Decision Processes. ...
Munos, Remi, Leemon Baird, Andrew Moore (munos@cs.cmu.edu)
Gradient Descent Approaches to Neural-Net-Based Solutions of the Hamilton-Jacobi-Bellman Equation.
IJCNN'99
( gzipped Postscript - 128KB)
Abstract:
In this paper we investigate new approaches to
dynamic-programming-based optimal control of contin...
Munos, Remi (remi.munos@polytechnique.fr)
Error Bounds for Approximate Policy Iteration
ICML 2003
( gzipped Postscript - 80 KB)
Abstract:
In Dynamic Programming, convergence of algorithms such as Value Iteration
or Policy Iteration resul...
Ormoneit, Dirk, Saunak Sen (ormoneit@stat.stanford.edu)
Kernel-Based Reinforcement Learning
Department of Statistics, Stanford University, Technical Report No. 1999-8
(Postscript - 260 KB)
Abstract:
Kernel-based methods have recently attracted increased attention in
the machine learning literature...
Reynolds, Stuart (sir@cs.bham.ac.uk)
The Stability of General Discounted Reinforcement Learning with Linear Function Approximation
UKCI'02
( gzipped Postscript - 80)
Abstract:
This paper shows that general discounted return estimating
reinforcement learning algorithms ca...
Reynolds, Stuart (sir@cs.bham.ac.uk)
Decision Boundary Partitioning: Variable Resolution Model-Free Reinforcement Learning
ICML-2k
( gzipped Postscript - 241 KB)
Abstract:
This paper presents a method to refine the resolution of a continuous state Q-function. Q-functions ...
Reynolds, Stuart (sir@cs.bham.ac.uk)
Reinforcement Learning with Exploration
PhD Thesis, School of Computer Science, The University of Birmingham, B15 2TT, UK
( gzipped Postscript - 1.1MB)
Abstract:
Reinforcement Learning (RL) techniques may be used to find optimal controllers for multistep decisio...
Rivest, Francois, Yoshua Bengio, John Kalaska (rivestfr@iro.umontreal.ca)
Brain Inspired Reinforcement Learning
NIPS 2004 (NIPS 17)
Abstract:
Successful application of reinforcement learning algorithms often involves considerable hand-craftin...
Siebel, Nils T, Yohannes Kassahun (nils-NOSPAM@siebel-research.de)
Learning neural networks for visual servoing using evolutionary methods
Proceedings of the 6th International Conference on Hybrid Intelligent Systems (HIS'06), Auckland, New Zealand
(PDF - 168 KB)
Abstract:
In this article we introduce a method to learn neural networks that solve a visual servoing task. O...
Singh, Satinder
Reinforcement Learning With Soft State Aggregation
NIPS 7
( gzipped Postscript - )
Abstract:
It is widely accepted that the use of more
compact representations than lookup tables is crucial to scaling...
Strens, Malcolm (mjstrens@qinetiq.com)
Learning Multi-Agent Search Strategies
The Interdisciplinary Journal of Artificial Intelligence and the Simulation of Behaviour, 1(4), 2003.
(pdf - 305KB)
Abstract:
We identify a specialised class of reinforcement learning problem in which the agent(s) have the goa...
Sutton, Rich (rich@cs.umass.edu)
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
Advances in Neural Information Processing Systems 8, pp. 1038-1044, MIT Press
(compressed Postscript - 230 KB)
Abstract:
On large problems, reinforcement learning
systems must use parameterized function approximat...
Tadepalli, Prasad, DoKyeong Ok (tadepall@cs.orst.edu)
Scaling up average reward reinforcement learning by approximating the domain models and the value function
Proceedings of the Thirteenth International Conference on Machine Learning, pages 471-479. Morgan Kaufmann, 1996
(Postscript - )
Abstract:
Almost all the work in Average-reward Reinforcement Learning (ARL) so
far has focused on table-base...
Tadepalli, Prasad, DoKyeong Ok (tadepall@cs.orst.edu)
Model-based Average Reward Reinforcement Learning
Artificial Intelligence
(Postscript - 53 pages)
Abstract:
Reinforcement Learning (RL) is the study of programs that improve their
performance by receiving r...
Tsitsiklis, John, Benjamin Van Roy (jnt@mit.edu)
Feature-Based Methods for Large Scale Dynamic Programming
Machine Learning, Vol. 22, 1996, pp. 59-94.
(PDF - 2.8 MB)
Abstract:
We develop a methodological framework and present a few
different ways in which dynamic programmin...
Van Roy, Benjamin (bvr@stanford.edu)
Learning and Value Function Approximation in Complex Decision Processes
PhD Thesis
(Postscript - 1691 KB)
Abstract:
In principle, a wide variety of sequential decision problems
-- ranging from dynamic resource alloc...
Wilson, Stewart (wilson@smith.rowland.org)
Generalization in the XCS classifier system
Genetic Programming 1998: Proceedings of the Third Annual Conference. San Francisco, CA: Morgan Kaufmann.
(gzipped Postscript - 61 KB)
Abstract:
This paper studies two changes to XCS, a classifier system in which
fitness is based on prediction...
Xu, Xin, Han-gen He and Dewen Hu (xuxin_mail@263.net)
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
Journal of Artificial Intelligence Research, Vol. 16, 2002, pp. 259-292
( gzipped Postscript - 700)
Abstract:
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptiv...
Yin, ChangMing (cmyin@cs167.net)
Forgetting Algorithm for Q-learning
unpublished
(Microsoft Word - 120)
Abstract:
...