Canonical representation of the computation states for the symbolic execution.
Starting from a generic (symbolic) computation state, a symbolic execution starting with exec-fun will push a frame (for the function), possibly read variables, possibly create new variables, possibly write existing variables, possibly enter new scopes, possibly exit existing scopes, and eventually pop the frame. Just one frame is pushed and then popped because the symbolic execution is compositional w.r.t. functions, i.e. the symbolic execution rewrites function calls in function bodies using the theorems about the called functions.
The dynamic semantics functions that perform the above actions, namely push-frame, enter-scope, create-var, etc., go into the frame stack component of the computation state, via the compustate->frames accessor. That would lead to a complex symbolic term for the computation state.
Instead, we pull the ``additions'' to the computation state,
i.e. the added frames, scopes, and variables,
out of the computation state via the functions
add-frame, enter-scope, and add-var:
the first and third one are defined below,
while the second one is from the dynamic semantics of C.
These three functions are defined to
push the frames, scopes, and variables into the computation state,
but we leave these functions disabled during symbolic execution,
so that the symbolic computation states has these additions explicit.
Thus, the symbolic computation state is
a sequence of applications of those three functions
to an initial symbolic computation state
(add-var ... (enter-scope (add-var ... (add-frame ... <compst>)...)
The reason for introducing a new function add-var, instead of just using create-var (and leaving it disabled), is that add-var unconditionally satisfies certain properties that create-var satisfies when it does not return an error: add-var unconditionally returns a computation state, and always ensures that the added variable is in the computation state. During symbolic execution, create-var is replaced with add-var via a rule that requires create-var to not return an error: in a way, add-var ``caches'' the aforementioned properties. This leads to simpler rules on this kind of computation states: for example, when read-var appears applied to add-var during symbolic execution, if the two variables are equal, we can readily rewrite this term to the value in add-var, because that function guarantees the variable to exist with that value. In contrast, if read-var appeared applied to create-var, additional hypotheses would be needed (and would have to be discharged) saying that create-var succeeds.
For a similar reason, we introduce a function update-var that replaces write-var when the latter succeeds, and that is easier and more efficient to manipulate during symbolic execution.
Given these canonical representations of computation states, we prove theorems that describe the effect of push-frame and other functions on computation states of this form, where the effect is another state in that form. These theorems are enabled during symbolic execution, and manipulate the computation state.
In the presence of C loops, which are represented by ACL2 recursive functions, we generate theorems that describe the execution of the loops starting from generic (symbolic) computation states. The execution of a loop does not push a new frame, because the loop executes in the frame of the enclosing C function. In this case, the initial generic computation state includes part of the frame of the enclosing C function; the execution of the loop may add new scopes and variables, so in this case the symbolic computation state looks like
(add-var ... (add-var ... (enter-scope <compst>)...)
In fact, the innermost function there must be enter-scope (it cannot be add-var), because the loops we generate have compound statements as bodies, which create new scopes.
The initial symbolic computation state
(... (add-var (enter-scope ... (update-var ... (update-var ... <compst>)...)
Below we introduce rules to order these update-vars according to the variables, maintaining a canonical form.
Note that this form of the computation states serves to represent side effects performed by the loop on the initial computation state. The same approach is used to generate proofs for more general side effects, e.g. on the heap, as explained below.
A computation state has a heap, whose objects (particularly arrays and structures) may be updated during the symbolic execution. We represent these updates via the function update-object, which is similar to write-object but always satisfies additional properties: the relation between update-object and write-object is similar to the one between update-var and write-var and to the one between add-var and create-var, explained above.
Heap objects may be updated by C functions and C loops, so they need to be incorporated in both of the symbolic computation state representations described above. We push the update-object past all the other functions, leading to states of the form
(... (add-frame (update-object ... (update-object ... <compst>)...)
for C functions and of the form
(... (enter-scope (update-object ... (update-object ... <compst>)...)
for C loops.
We order the update-object calls
according to their first argument (i.e. the object designator).
Note that for a C function this first argument is
value-pointer->designator applied to an ACL2 variable,
while for a C loop it is
value-pointer->designator applied to
a
Objects in static storage are treated similarly to objects in the heap. Instead of write-object and update-object, we use write-static-var and update-static-var. In a canonical computation state, we order update-static-var calls before update-object calls.
After introducing the ACL2 functions that represent the canonical symbolic computation states, we provide theorems expressing how functions like push-frame transform those computation states, maintaining their canonical form.