Characterizing-undefined-behavior
An undef field in the x86 state feeds unknown
values to processor components that are undefined
The undef field is used to feed unknown values to
processor components that are undefined, as per the Intel
specifications. For example, the SF, ZF, AF, and
PF rflags are undefined after a MUL instruction is
executed.
The principle behind the undef field is quite like that of the
oracle sub-field of the env field (see environment-field). We describe our use of the undef field by
comparing it to the oracle sub-field.
For reasoning about programs that involve commonly occurring
"undefined events" (like flag computations), using the oracle
sub-field can be quite tedious, since it has to be carefully
initialized; i.e., a list of appropriate (symbolic or concrete)
values has to be associated with the instruction pointer where any
such undefined event occurs. Imagine doing that for SF, ZF,
AF, and PF every time a MUL instruction is executed.
The reason why we need this initialization is because the only way to
access the oracle is through the functions pop-x86-oracle
and env-read; pop-x86-oracle expects the oracle to
contain information in a specific format, and env-read will give
us nothing unless we put something in to begin with.
On the other hand, the undef field doesn't require any such
initialization. The undef field contains a natural number that
is increased every time an undefined value is pulled for use (using
the constrained function create-undef) from a pool of undefined
values; thus, every access of the undef field causes it to
contain a new value which is used to seed a unique undefined
value. See undef-read for details.
There is a reason why we enforced that tediousness in the case of
the oracle sub-field: it provides a way of tracking any
computation that relies on the env field. Such computations don't
happen often, and when they do, it'd probably be better if we knew
exactly what we are expecting from the environment. Initializing the
oracle is a way of expressing these expectations. However, in the
case of undefined values, we aren't really expecting anything from the
environment. All we want is a sort of infinite pool of arbitary
values, seeded from undef in our case, that we don't know
anything about. As such, we wouldn't be able to prove that a value
obtained from undef is equal (or not) to any other value,
either obtained from undef or not. This is exactly what we need
when reasoning about undefined values --- an undefined value is
different from another undefined value, and also all the known
values.
Subtopics
- Undef-read
- Get a unique unknown to be used when reasoning about
undefined values in the processor