Department of Computer Science
University of Massachusetts Amherst

Research

Abstraction

Introducing abstraction into reinforcement learning (RL) problems serves two purposes: it enhances an RL agent's ability to generalize over similar situations, and it lets the agent ignore irrelevant details. Generalization can significantly accelerate learning, since the agent can apply knowledge from prior experience when confronted with a new situation. Ignoring irrelevant details reduces the amount of information the agent needs to process, which simplifies the problem. At ALL, we conduct research in two main areas of abstraction, both in the context of RL: temporal abstraction and spatial abstraction.

Temporal abstraction applies when the actions available to a learning agent are temporally extended, i.e., when the duration of an action is variable. In this case, abstraction is performed over the time that elapses while an action executes, which lets the learning agent treat all actions uniformly, regardless of their duration. An option is an abstract model of an action that facilitates temporal abstraction. The option model was developed at ALL.
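As a rough illustration of how an option lets an agent treat a multi-step behavior as a single action, the sketch below follows the common formulation of an option as an initiation set, an internal policy, and a termination condition. The corridor environment, the `step` function, the `go_right` option, and the discount factor are all hypothetical and chosen only to keep the example small; this is not the lab's code.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Set, Tuple


@dataclass
class Option:
    """A temporally extended action (illustrative structure only)."""
    initiation_set: Set[int]                  # states where the option may start
    policy: Callable[[int], int]              # primitive action chosen in each state
    termination_prob: Callable[[int], float]  # chance of stopping in each state


def step(state: int, action: int) -> Tuple[int, float]:
    """Hypothetical corridor with states 0..5; reward 1.0 on first reaching state 5."""
    next_state = max(0, min(5, state + action))
    reward = 1.0 if next_state == 5 and state != 5 else 0.0
    return next_state, reward


def run_option(option: Option, state: int, gamma: float = 0.9,
               rng: Optional[random.Random] = None) -> Tuple[int, float, int]:
    """Execute the option until it terminates.

    Returns the final state, the discounted reward accumulated along the way,
    and the number of primitive steps taken. A learner can consume this triple
    the same way for every option, whatever its duration.
    """
    rng = rng or random.Random(0)
    if state not in option.initiation_set:
        raise ValueError("option is not available in this state")
    total, discount, duration = 0.0, 1.0, 0
    while True:
        state, reward = step(state, option.policy(state))
        total += discount * reward
        discount *= gamma
        duration += 1
        if rng.random() < option.termination_prob(state):
            return state, total, duration


# Hypothetical option: keep moving right until the end of the corridor.
go_right = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: 1,
    termination_prob=lambda s: 1.0 if s == 5 else 0.0,
)

final_state, discounted_return, duration = run_option(go_right, 2)
```

Starting from state 2, the option runs for three primitive steps and ends in state 5. The caller sees only the summary triple, which is what allows one-step and multi-step actions to be handled by the same update rule.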

In its simplest form, spatial abstraction is a partition of the set of possible states into subsets. An RL agent is indifferent to distinctions between states within a subset. Spatial abstraction has to be applied carefully, since removing the ability to distinguish between states may prevent the agent from behaving effectively. Thus, the aim of spatial abstraction is to provide the agent with the ability to generalize and to ignore irrelevant detail without reducing its ability to act. One area of research pursued at ALL is autonomous spatial abstraction, in which a learning agent generates a partition of the state space on its own, based on experience. Another component of ALL research is developing an algebraic theory of abstraction in which the concept of a homomorphism between Markov decision problems (MDPs) is used to provide a formal framework. Current research also includes dynamic abstractions, i.e., spatial abstractions that change during the execution of a task, depending on the needs at hand.
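The partition view of spatial abstraction can be sketched with a tabular learner that stores values per block of the partition instead of per state. Everything here is an assumption made for illustration: the state layout (a position plus an irrelevant colour), the hand-written `abstract` function, and the use of a Q-learning update. In autonomous spatial abstraction the partition would be generated from experience, not written by hand.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, Tuple

State = Tuple[int, str]  # hypothetical state: (position, colour)


def abstract(state: State) -> int:
    """Map a state to its block in the partition by dropping the colour."""
    position, _colour = state
    return position


class AggregatedQ:
    """Q-learning table indexed by partition block, not by raw state."""

    def __init__(self, abstraction: Callable[[State], Hashable],
                 alpha: float = 0.5, gamma: float = 0.9) -> None:
        self.abstraction = abstraction
        self.alpha = alpha
        self.gamma = gamma
        self.q = defaultdict(float)  # (block, action) -> estimated value

    def value(self, state: State, action: str) -> float:
        return self.q[(self.abstraction(state), action)]

    def update(self, state: State, action: str, reward: float,
               next_state: State, actions: Iterable[str]) -> None:
        """One Q-learning backup applied to the block containing `state`."""
        best_next = max(self.value(next_state, a) for a in actions)
        key = (self.abstraction(state), action)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])


actions = ["left", "right"]
learner = AggregatedQ(abstract)

# Experience gathered in a red state...
learner.update((3, "red"), "right", 1.0, (4, "red"), actions)

# ...is immediately available in the blue state at the same position.
generalized_value = learner.value((3, "blue"), "right")
```

Because both colours fall into the same block, a single update informs every state in that block. The same mechanism illustrates the risk noted above: if colour did affect rewards or transitions, this partition would merge states the agent needs to tell apart.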