The action selection method is designed to make use of memory to
select the action most likely to succeed, and to fill memory when no
useful memories are available. For example, when the defender is at
position , the agent begins by retrieving
and
as described in Section 2.3.2. Then, it acts
according to the following function:
An action is selected based on the memory values only if those values
indicate that one action is likely to succeed and that it is better
than the other. If, on the other hand, neither value nor
indicates a positive likelihood of success, then an action
is chosen randomly. The only exception to this rule is when
one of the values is zero,
suggesting that there have not
yet been any training examples for that action at that memory
location. In this case, the agent is biased towards exploring the
untried action in order to fill out memory.
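The selection rule described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and argument names are hypothetical, it assumes exactly two candidate actions, and it treats the bias towards an untried action as a deterministic preference rather than a weighted one.

```python
import random

def select_action(value_a: float, value_b: float) -> str:
    """Choose between two actions using their retrieved memory values.

    Hypothetical sketch of the rule in the text: exploit the better
    action when memory indicates likely success, explore an untried
    (zero-valued) action to fill memory, otherwise choose randomly.
    """
    # Bias: prefer an untried action (value exactly zero) so that
    # memory gets filled with a training example for it.
    if value_a == 0.0 and value_b != 0.0:
        return "a"
    if value_b == 0.0 and value_a != 0.0:
        return "b"
    # Exploit: pick an action only if its value indicates a positive
    # likelihood of success AND it is better than the alternative.
    if value_a > 0.0 and value_a > value_b:
        return "a"
    if value_b > 0.0 and value_b > value_a:
        return "b"
    # Otherwise (neither value positive, or a tie), act randomly.
    return random.choice(["a", "b"])
```

Note that when both values are zero (no training examples for either action), the rule falls through to the random choice, which still serves to fill memory.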