With M discrete memory storage slots, the problem then arises as to how
a specific training example should be generalized. Training examples are
represented here as $(\theta, a, r)$, consisting of an angle
$\theta$, an action $a$, and a result $r$, where $\theta$ is the initial
position of the defender, $a$ is ``s'' or ``p'' for ``shoot'' or
``pass,'' and $r$ is $1$ or $-1$ for ``goal'' or ``miss'' respectively. For
instance, $(116.5, p, 1)$ represents a pass resulting in a goal for
which the defender started at position $\theta = 116.5$ on its circle.
The most straightforward technique would be to store the result at the
single memory slot whose index is closest to $\theta$, i.e., round
$\theta$ to the nearest $\theta'$ for which Mem[$\theta'$] is
defined, and then set Mem[$\theta'$] $= r$. However, this technique does
not provide for the case in which we have two training examples
$(\theta_1, a, r_1)$ and $(\theta_2, a, r_2)$, where $\theta_1 \neq \theta_2$ and
both round to the same $\theta'$. In particular, there is no way of
scaling how indicative a result at $\theta$ is of the result to be
expected at $\theta'$.
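As an illustration, this straightforward nearest-slot technique can be sketched in Python; the slot layout with $M = 18$ and all identifiers here are our own illustrative assumptions, not taken from the original system:

```python
M = 18                       # number of discrete memory storage slots
SPACING = 360 / M            # angular distance between slot indices

def nearest_slot(theta):
    """Round theta to the nearest theta' for which Mem[theta'] is defined."""
    return int(round(theta / SPACING) * SPACING) % 360

# Mem[0], Mem[20], ..., Mem[340], initially empty (0.0)
mem = {i * int(SPACING): 0.0 for i in range(M)}

def store_naive(theta, r):
    """Store the raw result at the single closest slot."""
    mem[nearest_slot(theta)] = r   # later examples simply overwrite earlier ones

store_naive(116.5, 1)    # rounds to slot 120
store_naive(123.0, -1)   # also rounds to 120, clobbering the first example
```

The second call illustrates the problem in the text: two examples with different $\theta$'s collapse onto the same slot, and the stored value carries no record of how close either $\theta$ was to the slot index.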
In order to combat this problem, we scale the value stored to
Mem[$\theta'$] by the inverse of the distance between $\theta$
and $\theta'$ relative to the distance between memory indices. A result
$r$ at a given $\theta$ is multiplied by
$1 - \frac{|\theta - \theta'|}{360/M}$ before being stored to Mem[$\theta'$]. In this
way, training examples with $\theta$'s that are closer to $\theta'$ can
affect Mem[$\theta'$] more strongly. For example, with $M = 18$ (so
Mem[$\theta'$] is defined for $\theta' \in \{0, 20, 40, \ldots, 340\}$),
$(116.5, p, 1)$ causes Mem[$120$] to be updated by a value of
$1 \times (1 - \frac{3.5}{20}) = .825$. Call this our basic memory storage
technique:
\[
\mbox{Mem}[\theta'] \leftarrow r\left(1 - \frac{|\theta - \theta'|}{360/M}\right)
\quad \mbox{if} \quad
\left|r\left(1 - \frac{|\theta - \theta'|}{360/M}\right)\right| > \left|\mbox{Mem}[\theta']\right|
\]
Using this generalization function, the ``update'' of Mem[$\theta'$] would
only have an effect at all if
$|\mbox{Mem}[\theta']| < |r|\left(1 - \frac{|\theta - \theta'|}{360/M}\right)$ prior to this
training example. Consequently, only the past training example for
action $a$ with $\theta$ closest to $\theta'$ is reflected in
Mem[$\theta'$]: presumably, this training example is most likely to
accurately predict the result of taking action $a$ when the defender
starts at $\theta'$. Notice that this basic memory
storage technique is appropriate when the defender's motion is
deterministic. In order to handle variations in the defender's speed,
we introduce later a more complex memory storage technique. The
method of scaling a result based on the difference between $\theta$ and $\theta'$
will remain unchanged.
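The storage rule above can be rendered as a short Python sketch; the identifiers are our own, a single action's memory table is shown, and wraparound at $0/360$ is ignored for brevity:

```python
M = 18
SPACING = 360 / M                       # 20 degrees between slot indices
mem = {i * 20: 0.0 for i in range(M)}   # Mem[0], Mem[20], ..., Mem[340]

def scaled(theta, slot, r):
    """Result r scaled by 1 - |theta - theta'|/(360/M); nonzero only
    for the two slots bracketing theta (wraparound ignored)."""
    return r * max(0.0, 1 - abs(theta - slot) / SPACING)

def store(theta, r):
    """Basic memory storage: the scaled result replaces Mem[theta']
    only if its magnitude exceeds the value already stored there."""
    for slot in mem:
        v = scaled(theta, slot, r)
        if abs(v) > abs(mem[slot]):
            mem[slot] = v

store(116.5, 1)   # Mem[120] -> 1 * (1 - 3.5/20) = .825, Mem[100] -> .175
```

Under these assumptions, a later example whose $\theta$ is closer to a slot overrides the stored value, while a more distant one leaves it untouched, matching the update semantics described above.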
In our example above, $(116.5, p, 1)$ would not only
affect Mem[$120$]: as long as $|\mbox{Mem}[100]|$ was not already larger,
its value would be set to $1 \times (1 - \frac{16.5}{20}) = .175$.
Notice that any training example $(\theta, p, r)$
with $83.5 < \theta < 116.5$ could override this value. Since 116.5 is so much closer
to 120 than it is to 100, it makes sense that $(116.5, p, 1)$ affects
Mem[$120$] more strongly than it affects Mem[$100$]. However, $\theta = 110$
would affect both memory values equally. This memory
storage technique is similar to the kNN and kernel regression function
approximation techniques which estimate $f(\theta)$ based on
stored values, possibly scaled by the distance from $\theta$
to $\theta'$, for the $k$
nearest values of $\theta'$. In our linear continuum of defender
position, our memory generalizes training examples to the 2 nearest
memory locations.
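As a concrete check of this two-slot generalization, note that the distances from $\theta$ to its two bracketing slots sum to $360/M$, so the two scaling factors always sum to 1. The short Python sketch below (identifiers are ours) verifies this for the examples above:

```python
M = 18
SPACING = 360 / M   # 20 degrees between adjacent memory indices

def weight(theta, slot):
    """Scaling factor 1 - |theta - theta'|/(360/M), floored at 0."""
    return max(0.0, 1 - abs(theta - slot) / SPACING)

# (116.5, p, 1): weights .825 at Mem[120] and .175 at Mem[100]
w_near, w_far = weight(116.5, 120), weight(116.5, 100)
assert abs((w_near + w_far) - 1.0) < 1e-9   # the two weights sum to 1

# theta = 110 is equidistant from 100 and 120, so both weights are .5
assert weight(110, 100) == weight(110, 120) == 0.5
```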