Isotropic Sequence Order Learning using a Novel Linear Algorithm in a Closed Loop Behavioural System
Bernd Porr & Florentin Wörgötter
In this article, we present an isotropic algorithm for sequence order
learning. Its central goal is to learn the causal relation between
two (or more) inputs in order to react to the earliest incoming
signal after successful learning (as in typical classical
conditioning situations). We implement this algorithm in a behaving
system (a robot), thereby creating a closed-loop situation in which the
learner's actions influence its own sensor inputs, with the aim of
creating an autonomous agent. Autonomous behaviour implies that
learning goals are internally defined within the organism's
capabilities. Standard learning models for sequence learning (e.g.,
TD-learning) need an externally defined reward. This, however, is in
conflict with the requirement of an implicitly defined internal goal
in autonomous behaviour. Therefore, in this study we present a
system in which the external reward is replaced by a reflex
loop. This loop explicitly includes the environment. Every reflex
loop has the inherent disadvantage that its
re-actions occur only after a reflex-eliciting sensor
event and thus 'too late'. However, a reflex can serve as the
internal reference for sequence order learning which has the task of
eliminating this disadvantage by creating earlier anticipatory
actions. In our system, learning is achieved by modifying the synaptic
weights of a linear neuron with a correlation-based learning rule
which involves the derivative of the neuron's output. All input
lines are entirely isotropic. The synaptic weight change curve of
this rule is strongly related to the temporal Hebb learning rule
which was found in spike-timing experiments. We find that, after
learning, the reflex loop is functionally replaced by an
earlier anticipatory action (and pathway). In addition, we observe
that the synaptic weights stabilise as soon as the reflex remains
silent.
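
The learning rule described above (a linear neuron whose weights change with the correlation between an input and the derivative of the neuron's output) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the first-order low-pass filter on each input, the unit-pulse inputs, the fixed reflex weight of 1, the learning rate `mu`, and the function names `low_pass`, `pulse` and `learn_weight`. The sketch only shows how the sign of the weight change depends on the order of the two inputs. It does not reproduce the robot experiment or the weight stabilisation result.

```python
def low_pass(signal, decay=0.9):
    """First-order low-pass filter (an assumed stand-in for the input filters)."""
    out, state = [], 0.0
    for x in signal:
        state = decay * state + (1.0 - decay) * x
        out.append(state)
    return out


def pulse(length, at):
    """Unit pulse at time step `at`, zero elsewhere."""
    return [1.0 if t == at else 0.0 for t in range(length)]


def learn_weight(t_input, t_reflex, steps=200, mu=1.0):
    """Return the weight of the non-reflex input after one pairing of pulses."""
    u_reflex = low_pass(pulse(steps, t_reflex))
    u_input = low_pass(pulse(steps, t_input))
    w_reflex, w_input = 1.0, 0.0  # assumption: reflex weight fixed, other weight learned
    v_prev = 0.0
    for t in range(steps):
        v = w_reflex * u_reflex[t] + w_input * u_input[t]  # linear neuron output
        dv = v - v_prev                                    # discrete-time derivative of the output
        w_input += mu * u_input[t] * dv                    # input correlated with output derivative
        v_prev = v
    return w_input


if __name__ == "__main__":
    print("input before reflex:", learn_weight(t_input=10, t_reflex=20))
    print("input after reflex: ", learn_weight(t_input=20, t_reflex=10))
```

In this sketch the weight grows when the other input precedes the reflex input and shrinks when the order is reversed. This order dependence is the qualitative property that relates such a rule to temporal Hebbian learning.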
Last modified: Tue Mar 9 23:23:22 GMT 2004