Adapted from a presentation given at the
Fifteenth National Conference on Artificial Intelligence
Madison, WI July 29, 1998
Contents:
- Title
- Motivation
- Grounded Sensorimotor Ontology
- Outline
- Categories
- Concepts
- Cluster Analysis
- Delay Coordinates 1
- Delay Coordinates 2
- Delay Coordinates 3
- Pursuit/Avoidance Simulator
- Cluster Events
- Cluster Prototypes
- Pioneer Mobile Robot
- Interaction With Objects
- Cluster Hierarchy 1
- Cluster Hierarchy 2
- Cluster Hierarchy 3
- Summary
slide 1
(contents)
slide 2
(contents)
Motivation
Autonomous, sensorimotor agent
Minimal knowledge engineering
Learn from interaction, not supervision
The work I'm presenting was motivated by a
much larger project going on at UMass. What we'd like to do is take
an autonomous sensorimotor agent, like the Pioneer robot shown here,
let it loose in some environment, and have it build a conceptual
structure about its surroundings.
The project is ambitious because we work from as little built-in
knowledge as possible, and we require the agent to learn from
interaction, not supervision. It's also a challenging project because
we want the robot to learn not just a grounded sensorimotor ontology,
but also structured activities and natural language.
Obviously, the above diagram is a gross simplification of the project,
although it does show three areas of active research in our lab. At
the top, Tim Oates is working on
language learning and has some interesting results that show one route
from activities to verbs. In the middle, Matt Schmill is working on learning
activities and developed a control language that builds structured
behavior out of primitive actions like move and
turn.
This presentation describes work at the lowest level -- at the
interface between agent and environment -- where the robot needs to
learn a set of grounded categories that supports a larger conceptual
structure about activities and language.
slide 3
(contents)
Grounded Sensorimotor Ontology
This slide is an overview of the
problem. We have an agent that can take measurements of the
environment and perform primitive actions like turn or
raise-gripper. From its many sensor readings, we want the
robot to build a set of categories that carve up the world in some
meaningful way.
For example, our robot can interact with objects, so it's nice if it
can distinguish between experiences with movable and
immovable objects. It's also nice to have a category
hierarchy. With the contact activity, for instance, one
possibility is to have a more detailed breakdown by movement speed.
Now the goal of this presentation is to show how an agent can build a
grounded sensorimotor ontology. The first part of this goal is
satisfied because all of the results ground out in real-world
measurements of the agent's sensorimotor experience. To get the
second part -- to get a conceptual structure -- we certainly need
categories as shown here, but there's something missing from this
diagram. A concept is something more than a category and I'll make
this distinction a few slides from now.
One other point of clarification for this slide is that the labels are
my own interpretation, not the robot's. So some categories correspond
to objects and activities, but as far as the robot is concerned,
they're just categories of uninterpreted sensor readings.
slide 4
(contents)
Outline
Categories
Concepts
Implementation
Cluster analysis
Delay coordinates
Simulation results
Robot results
Summary
The next part of this presentation will
be a quick review of categories and I'll give a working definition of
concepts. I'll also describe our implementation using cluster
analysis, and I'll show the role of delay coordinates. The second
part of this presentation will give a sample of results from
simulations of a pursuit/avoidance game as well as from the Pioneer
robot.
slide 5
(contents)
Categories
Now for a quick review of some
terminology: First we have instances, or examples, of objects and
activities. This work is about concepts from time series, so the
instances I'll be dealing with are streams of sensor readings.
Next, a category is just a set of instances, although we usually
expect the members of a category to have something in common. So in
this case, what the members have in common is a similar temporal
pattern.
Sometimes we also deal with abstractions of categories called
prototypes. A prototype is a representative category member or
best example. Here, the prototype is an average of the individual
time series in the preceding category. The 3 inside the box is meant
to show that sometimes our data structure contains additional
information along with the prototype; in this case its the number of
curves that were averaged together to get the representative.
slide 6
(contents)
Concepts
As mentioned earlier, there's a
distinction between categories and concepts. Epistemically, a concept
is a category plus its entailments. Somewhere in our thought process
we have categories along with an understanding of the consequences of
category membership.
A similar but more functional definition is that a concept is a
prototype plus predictions. Once we have a prototype, we can use
matching algorithms to recognize that a new experience is similar to
an old one and then we can predict what will happen. It's these
predictions that give meaning to the prototype or category.
For example, the time series here are a prototype based on five
instances of a robot bumping into a wall. If we just look at the top
two curves, we see that the robot has a category of experience where
velocity is positive and roughly constant and sonar readings are
decreasing. The prediction then, is that the bump sensor will go off
in a moment and velocity will drop to zero and the sonar
readings will match the amount that the gripper extends from the body
of the robot.
Now we could label this prototype with a word like contact, and
when we hear the word, we immediately have some notion of what contact
looks like and what it means. But at this point, the robot has
arbitrary symbols for its category prototypes. So the meaning then
comes from predictions the robot can make about its sensor readings.
slide 7
(contents)
Cluster Analysis
Category = cluster of points in "instance space"
Unsupervised, data-driven
Method of Delays (delay coordinate embedding)
time series => point in "delay space"
cluster experiences with similar dynamics
Euclidean distance quantifies instance similarity
The hard part of building a sensorimotor
ontology is coming up with the categories. Cluster analysis is one
general technique to go from instances to categories, and what we do
is think of a category as a cluster of points in a suitable space.
The nice thing about cluster analysis is that it's unsupervised and
data-driven. We don't pre-label instances, and we don't build in
domain knowledge. The problem is how do we design a suitable space for
time series, and what's our measure of similarity?
To make clustering work with time series, I use a Nonlinear Dynamics tool
called the method of delays, or delay-coordinate embedding, or
sometimes just delay coordinates. The schematic shows what the
method of delays gives us. We get a way to map part of a time series
to a point in delay space. And then we have a way to cluster
experiences with similar dynamics. Now every clustering algorithm
needs a measure of similarity, and in delay space we simply use the
distance between points.
The objective of this work is that we get categories of sensorimotor
experience by clustering time series. The way I do it is to use delay
coordinates, but that's not the only possibility. In fact, Tim Oates in our lab has nice results
using a dynamic programming algorithm that quantifies the difference
between two curves. There's also related work by Iba & Langley (IJCAI
1991, 1993), Pierce & Kuipers (AI 1997), Ram & Santamaria (AI 1997),
and possibly others.
slide 8
(contents)
Delay Coordinates 1
Delay coordinates are a nice choice for
this problem for at least a few reasons. First, there's an easy
implementation. On the left is a time series and on the right is a
spatial representation called a delay portrait. The delay portrait is
just a plot of the time series versus a delayed version of itself.
The colored dots show how we go from the temporal domain to the
spatial domain. For example, the two time series measurements in
red are separated by some time delay -- five
seconds in this case -- and become the two coordinates of a point in
delay space. This approach extends easily to higher dimensional
spaces. If we were working with, say, a seven-dimensional
reconstruction, then we would take seven measurements along the time
series and map them to a single point in delay space.
slide 9
(contents)
Delay Coordinates 2
Easy implementation
Simple interpretation of instance similarity
Another nice thing about delay
coordinates is that there's a simple interpretation of instance
similarity. This example shows two instances of slightly different
sine waves. Now suppose we want to map these series to a
seven-dimensional delay space. (Notice the seven red dots on these curves.) If we think of one time
series as a target and the other as an estimate, then we can subtract
to get an error signal. For clustering, our measure of instance
similarity is distance in delay space -- which we compute from this
error signal. If we then normalize by the square root of the number
of delay coordinates, then we get an estimate of the root-mean-square
error between the two time series.
slide 10
(contents)
Delay Coordinates 3
Easy implementation
Simple interpretation of instance similarity
Theory about dynamics (Takens theorem)
A third benefit of delay coordinates is
that it's a technique based on theory about dynamics. I'm glossing
over some details, but the important thing is that Takens theorem
tells us that delay coordinates preserve a dynamical system's
topological invariants.
One way to understand what that means is we think of the system as a
set of trajectories drawn on a sheet of rubber in state space. That
state space may not be fully observable, but we can use delay
coordinates to reconstruct certain properties. The effect is a
stretching or twisting of the rubber sheet. The important thing for
cluster analysis is that nearest neighbors in state space remain near
one another under this sort of transformation. The conjecture, then,
is that we get the same clusters in delay space as we would get in
state space.
slide 11
(contents)
Pursuit/Avoidance Simulator
Outcomes: contact, chase, escape
Agent behavior: acceleration = intensity * f(position, velocity)
(A Java version of the simulator is available.)
The first set of clustering results come
from experiments with a pursuit/avoidance simulator. This screen shot
shows one agent chasing another. The possible outcomes of each game
are contact, chase, and escape, and each agent's
behavior is a function of relative position and relative velocity (and
there's also a scale factor to change the intensity of the player).
The nice thing about this simulator is that it's a very simple,
controlled environment where we can test ideas before moving on to the
more complicated robot domain.
slide 12
(contents)
Cluster Events
Now in the life of an agent, there can
be an awful lot of uninteresting activity, so to help sort through the
data we make use of cluster events. This graph shows the
distance between players versus time for three different runs of the
simulator. One way we define events in this domain is to fix the
distance between agents; in this graph the distance is 0.6. So an
event is triggered when the distance first drops below the (gray)
horizontal line, and then the algorithm grabs a window of measurements
that become a point for clustering in the corresponding delay space.
slide 13
(contents)
Cluster Prototypes
This graph shows the six prototypes
pulled from 100 games in the simulator. For each category, the
prototype shows the average time series for the distance between the
two players. This is how the learning algorithm carved up the world;
it found six qualitatively different categories.
One category is contact where the distance between players
drops and never recovers. Another prototype shows the case where the
pursuer is too agressive and overshoots it opponent before
making contact. The other prototypes show a perpetual chase
and escape with short, medium, and long escape times.
And just as a reminder, this is an unsupervised method. The labels on
this graph are for illustration only, and the learning algorithm never
sees them.
So this plot shows that we can get prototypes. To get concepts we
also need predictions. In this domain it's predictions of the game
outcome: contact, chase, or escape. The prediction results were not
nearly as interesting as those for cluster prototypes, and all the
details can be found in our proceedings paper. The important
thing is that the use of prototypes to recognize the situation and
then make the prediction was always better than random guessing.
slide 14
(contents)
Pioneer Mobile Robot
The next set of results come from
experiments with the Pioneer mobile robot. The configuration I work
with has around 40 sensors. There are seven sonars not visible in
this top view. The wheels have shaft encoders to get velocity and
stall status. The gripper paddles have two infrared break beams and a
bump switch. The vision system can detect objects with one of three
colors and report things like x,y location, height, width, and
distance to the object.
slide 15
(contents)
Interaction With Objects
For the results I'm about to show, I
instructed the robot to drive toward various objects. There was a
bucket, a wall, large and small cups, and the leg of a desk not shown
here. I also ran a pair of tests where the robot pushes either a cup or a
ball and then stops short. The difference is that the cup stays still
but the ball rolls away.
Just like the pursuit/avoidance simulator, for each trial I picked the
delay coordinates from a window of sensor readings surrounding an
event. An event can be the start of some action, like stopping in the
push examples, or it can be a sudden change in sensor readings, like
the bump switch going off or a red object suddenly disappearing from
view.
slide 16
(contents)
Cluster Hierarchy 1
This binary tree shows a cluster
hierarchy of the robot's experience with the various objects. The
arbitrary labels are meant to emphasize the robot's situation of
ignorance. The important thing is that experiences with the same
object clustered together into one basic level category.
slide 17
(contents)
Cluster Hierarchy 2
Here's the same cluster hierarchy with
the individual instances pruned away. We can see individual
clusters for the desk, the wall, the bucket, and so on. The icons are
meaningful labels for us, but the robot isn't privileged to see them.
What the robot stores at the nodes in the tree are prototypes; it
stores average time series.
slide 18
(contents)
Cluster Hierarchy 3
Here's the tree, one more time, with
words that describe the clusters. For example, there's a large
cluster for tall objects that show up in the sonar readings. There's
also a cluster for tall, grippable objects that cause the break beams
to go off and a cluster for tall, ungrippable objects that cause the
bump switch to go off instead.
Now this looks a lot like an object hierarchy, but our claim isn't
that strong (at least not yet). It's really an experience hierarchy,
where "experience" refers to uninterpreted sensor readings.
slide 19
(contents)
Summary
Categories of sensorimotor experience
cluster time series of real-world measurements
unsupervised
implementation with delay coordinates
Functional definition of concepts
prediction gives "meaning" to category prototypes
grounded in real-world measurements
Examples
pursuit/avoidance simulator
Pioneer mobile robot
To summarize this work, I showed a way
an agent can build categories of sensorimotor experience. The key
idea is to cluster time series of measurements out in the world, and
to do this in an unsupervised fashion. My implementation made use of
delay coordinates in the clustering procedure.
I also gave a functional definition of concepts, where prediction
gives meaning to categories and their prototypes. These predictions
ground out in real-world measurements, and so we have a way of
building a grounded sensorimotor ontology.
Finally, I showed examples from two very different experimental domains.
Copyright (c) 1998 Michael T. Rosenstein. All rights reserved.
Home Pages: UMass
| CS Dept.
| ANW
| MTR
Updated 26-May-99
mtr@cs.umass.edu