Storing every individual experience in memory would be inefficient
both in terms of amount of memory required and in terms of
generalization time. Therefore, we store both quantities only at
discrete, evenly-spaced sample points. That is, for a memory of
size M (with M dividing evenly into 360 for simplicity), we keep
one value of each quantity at each of M sample points spaced 360/M
apart. We store memory as an array ``Mem'' of size M such that
Mem[n] has values for both quantities at the n-th sample point.
Using a fixed memory size precludes using memory-based
techniques such as K-Nearest-Neighbors (kNN) and kernel regression
which require that every experience be stored, choosing the most
relevant only at decision time. Most of our experiments were conducted
with memories of size 360 (low generalization) or of size 18 (high
generalization), i.e. M = 360 or M = 18. As will be seen from our
results, the memory size had a large effect on the rate of learning.
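
As an illustration only, the fixed-size memory described above might be sketched in Python as follows. The sketch assumes that the stored values are indexed by an angle in degrees, that a query is mapped to the nearest sample point, and that a new value simply overwrites the old one; none of these details is specified in this section. The names FixedMemory, slot_index, store and lookup are hypothetical.

```python
import math


class FixedMemory:
    """Hypothetical fixed-size memory with M evenly spaced slots over 360 degrees."""

    def __init__(self, size):
        # M must divide 360 evenly, as the text assumes for simplicity.
        if size <= 0 or 360 % size != 0:
            raise ValueError("memory size must be a positive divisor of 360")
        self.size = size
        self.spacing = 360 // size  # degrees between adjacent sample points
        # Mem[n] holds a pair of values, one for each stored quantity.
        self.mem = [[None, None] for _ in range(size)]

    def slot_index(self, angle_deg):
        # Assumption: map an angle to the nearest sample point, wrapping at 360.
        wrapped = angle_deg % 360.0
        return int(math.floor(wrapped / self.spacing + 0.5)) % self.size

    def store(self, angle_deg, first, second):
        # Assumption: overwrite the slot; the actual update rule is not shown here.
        self.mem[self.slot_index(angle_deg)] = [first, second]

    def lookup(self, angle_deg):
        # Return both stored values for the slot nearest to the query angle.
        return tuple(self.mem[self.slot_index(angle_deg)])


coarse = FixedMemory(18)   # high generalization: 20 degrees per slot
fine = FixedMemory(360)    # low generalization: 1 degree per slot
coarse.store(29.0, 0.7, 0.2)
```

With M = 18, every angle within 10 degrees of a sample point shares a slot, so a single stored experience generalizes to a 20-degree range; with M = 360 it affects only a 1-degree range.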