Mark Ring, Tom Schaul, Jürgen Schmidhuber. The Two-Dimensional
Organization of Behavior. In Proc. Joint IEEE International Conference
on Development and Learning (ICDL) and on Epigenetic Robotics
(ICDL-EpiRob 2011), Frankfurt, 2011.
Abstract
This paper addresses the problem of continual learning (Ring, 1994)
in a new way, combining multi-modular reinforcement learning with
inspiration from the motor cortex to produce a unique perspective on
hierarchical behavior. Most reinforcement-learning agents represent
policies monolithically using a single table or function
approximator. In those cases where the policies are split among a
few different modules, these modules are related to each other only
in that they work together to produce the agent's overall policy. In
contrast, the brain appears to organize motor behavior in a
two-dimensional map, where nearby locations represent similar
behaviors. This representation allows the brain to build hierarchies
of motor behavior that correspond not to hierarchies of subroutines
but to regions of the map such that larger regions correspond to
more general behaviors. Inspired by the benefits of the brain's
representation, the system presented here is the first attempt, and a
first step, toward the two-dimensional organization of learned
policies according to behavioral similarity. We demonstrate a fully
autonomous multi-modular system designed for the constant
accumulation of ever more sophisticated skills (the
continual-learning problem). The system can split up a complex task
among a large number of simple modules such that nearby modules
correspond to similar policies. The eventual goal is to develop and
use the resulting organization hierarchically, accessing behaviors
by their location and extent in the map.
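
To make the idea of a map in which nearby modules hold similar policies
more concrete, the following is a minimal illustrative sketch. It is not
the algorithm of the paper: it swaps in a generic self-organizing-map
style neighborhood update, and every name in it (make_map, best_module,
update, region, the learning rate lr, the neighborhood width sigma) is a
hypothetical chosen for illustration. Each module is reduced to a small
parameter vector standing in for a simple policy.

```python
import math
import random

def make_map(rows, cols, dim, rng):
    """Grid of modules; each module is a small parameter vector
    standing in for a simple policy (an assumption of this sketch)."""
    return [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_module(grid, target):
    """Grid coordinates of the module whose parameters are closest to target."""
    coords = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return min(coords, key=lambda rc: sq_dist(grid[rc[0]][rc[1]], target))

def update(grid, target, lr=0.5, sigma=1.0):
    """Pull the winning module and its map neighbors toward target.

    The Gaussian weight h decays with grid distance from the winner, so
    modules that are close on the map receive similar updates and drift
    toward similar parameters."""
    wr, wc = best_module(grid, target)
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            h = math.exp(-((r - wr) ** 2 + (c - wc) ** 2) / (2 * sigma ** 2))
            for i in range(len(w)):
                w[i] += lr * h * (target[i] - w[i])
    return wr, wc

def region(grid, center, radius):
    """Coordinates of all modules within `radius` grid steps of center.

    A larger radius selects a superset of modules, loosely mirroring the
    notion that a larger map region stands for a more general behavior."""
    cr, cc = center
    return [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))
            if max(abs(r - cr), abs(c - cc)) <= radius]
```

In this sketch a behavior would be addressed by a (center, radius) pair
passed to region, rather than by a subroutine name; how the actual
system selects and trains its modules is not specified here.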