Fgl-interpreter-overview
Outline of the way the FGL interpreter works.
The FGL interpreter is called by the FGL clause processor in order to try
and turn the theorem term into an equivalent Boolean formula, which then can
perhaps be solved by SAT. In order to do this, it recursively interprets terms
turning them into symbolic objects (see fgl-object) containing Boolean formula
objects. In this doc topic we outline the operation of the interpreter.
The interpreter consists of a 31-way mutual recursion. We won't detail each
of the functions in the mutual recursion, but we will try to cover in general
terms all of the things that happen in them.
Interpreting Terms -- Overview
The highest-level entry point into the interpreter when trying to compute a
Boolean formula from a term is fgl-interp-test. This first interprets the
term using another possible high-level entry point, fgl-interp-term-equivs,
which calls fgl-interp-term to produce a symbolic object (we'll cover its
operation shortly), then checks a database of equivalences that have been
assumed in the current path condition and replaces that symbolic object with an
equivalent if there are any that have been assumed (and which some heuristics
say are a good replacement). Finally, it calls fgl-interp-simplify-if-test
which is the subroutine for coercing a symbolic object into a Boolean
formula.
fGl-interp-term is very similar to a classical interpreter or rewriter.
It examines the term and treats different kinds of terms differently:
- When it is a constant (quote), it returns the quoted value, coerced to a g-concrete symbolic object.
- When it is a variable, it looks up the variable in the current
bindings (alist). There are actually two sets of bindings for our interpreter:
"minor" bindings, which come from lambdas, and "major" bindings, which come
from outermost contexts not inside any lambdas; these are checked in that
order.
- When it is a lambda application, it processes the term into an equivalent
pair (bindinglist body). The bindinglist is a list of pairs (formals
actuals) from the nest of lambdas, omitting formal/actual pairs that are the
same. This may include the bindings of more than one lambda: if the body of a
lambda is itself a lambda application, it recursively adds that to the
bindinglist, stopping when it finds a lambda body that is not itself a lambda.
The interpreter interprets the bindinglist by recurring through the list,
recursively interpreting each of the actuals of each pair with
fgl-interp-term-equivs. When a pair is done it adds the bindings formed by
pairing the formals with the symbolic object results from the actuals to the
minor bindings of the interpreter. When done with all the (formals
actuals) pairs, it then recursively interprets the body, then pops off the
bindings produced by the bindinglist, returning the symbolic object resulting
from interpreting the body.
- When it is a function call, it deals with a few special cases, described
next, and then the generic function call case. In the generic case, it first
recursively interprets the arguments of the function, then calls
fgl-interp-fncall, described below, on the function and symbolic objects
resulting from the arguments.
Interpreting Function Calls -- Special Cases
The special cases of function calls include if, return-last,
bind-var, abort-rewrite, fgl-sat-check, syntax-interp-fn,
assume, narrow-equiv, and fgl-interp-obj. Each of these requires
special treatment of the arguments, rather than just recursively interpreting
them:
- For if terms, the test is recursively interpreted and coerced to a
Boolean function using fgl-interp-test. Then, unless a syntactic analysis
shows that the path condition implies the test's negation, we recursively
interpret the "then" branch with fgl-interp-term-equivs and with the test
conjoined to the path condition, and unless the syntactic analysis shows the
path condition implies the test, we recursively interpret the "else" branch
with the negated test conjoined to the path condition. If both branches were
interpreted, we then attempt to merge the results from the two branches into a
single symbolic object using fgl-interp-merge-branches, described
below.
- For return-last, we provide special support for time$, allowing
it to time the symbolic interpretation of the enclosed term. Otherwise, we
simply interpret the last argument. Note this means that prog2$ might not
function as expected -- if you intend the first argument to prog2$ to be
interpreted in FGL for side effects, you should instead bind it to an ignored
variable or else use an equivalent two-argument function instead.
- For bind-var, we bind the first argument, which must be a variable
that is not yet bound in the current major or minor frame, to the result of
interpreting the second argument under the unequiv equivalence relation.
Under this equivalence relation, every object is equivalent to every other
object; this is sort of an "anything goes" mode which allows certain behavior
that would normally be unsound. This is allowable in this context because
bind-var logically just returns its first argument (the variable), so the
second argument is irrelevant. It is allowable to bind anything to the
variable if it is not yet bound ****
- For binder, the argument must be a function call whose first argument
is a variable, and that variable must be free in the current stack frame. The
rest of that function's arguments are recursively interpreted, and then binder
rewrite/meta rules are applied to choose a binding for the free variable based
on the rest of the arguments.
- For abort-rewrite, the interpreter returns nil and inserts a
special error keyword in the interpreter state's errmsg field, which will
get caught when returning from the current major stack frame.
- For fgl-sat-check, we use fgl-interp-test to coerce the second
argument (the condition to be tested) to a Boolean function, and
fgl-interp-term-equivs to interpret the first argument (params). We then
call interp-st-sat-check, an attachable function which calls SAT and
returns NIL if the input Boolean formula is unsat.
- For syntax-interp-fn, we check that the second argument is a quoted
term (this is true for calls generated by the syntax-interp and
syntax-bind macros) and that we're in an unequiv equivalence
context. If so, we evaluate the first argument using fancy-ev, with the
current variable bindings from the symbolic interpreter passed in as the
evaluation environment. For example, if a variable x is bound to a
symbolic integer in our current interpreter frame, then x will be bound to
the fgl-object representation of that symbolic integer when evaluating
the syntax-interp term. This is similar to ACL2's syntaxp
behavior, but the syntaxp term operates on FGL object syntax rather than ACL2
term syntax.
- For assume, we first check that we're in an unequiv equivalence
context. Then we interpret the first and second arguments as if they were the
test and then branches of an if; in particular, the Boolean
formula resulting from the test is conjoined onto the path condition (that
is, assumed) when symbolically executing the then branch. We then simply
return the result from the then branch.
- For narrow-equiv, we interpret the first argument. If its result is a
concrete object whose value satisfies equiv-contexts-p and those
equivalence contexts are a refinement (narrowing) of the interpreter state's
current equiv-contexts, then we set the interpreter state's equiv-contexts
field to that object while interpreting the second argument. This can be used
to (conditionally) exit an unequiv context.
- For fgl-time-fn, we interpret the first argument and use that as a
time$ format specifier for timing the interpretation of the second
argument, returning the result from the second argument.
- For fgl-prog2, we interpret the first argument under the
unequiv equivalence context, discarding the result; then interpret the
second argument under the current equivalence context and return its result.
- For fgl-interp-obj, we first check that we're in an unequiv
equivalence context. Then we interpret the first argument. If its result is a
concrete object whose value satisfies pseudo-term-p, then we interpret
that term and return its result.
Interpreting Function Calls -- Generic Case
Generic function calls are run by fgl-interp-fncall after reducing the
arguments to a list of symbolic objects. This looks up the function mode of the function and, depending on the
restrictions encoded in that mode, may do some or all of the following:
- If all the arguments are explicit, i.e. symbolic objects of the g-concrete kind, then try to execute the function using magic-ev-fncall. If it runs successfuly, return the result as a
g-concrete object.
- Otherwise try applying each of the rules enabled for that function in the
fgl-rewrite-rules table using fgl-rewrite-try-rule. These may be
rewrite, meta, or primitive rules. If any of those rules succeeds, return the
symbolic object produced by recursively interpreting the RHS of the rule under
the unifying substitution.
- Otherwise try executing a primitive associated with that function; see
fgl-primitive-and-meta-rules. If successful, return the value from that primitive.
- Otherwise, if there exists a rule with rune (:definition fnname), or
if there are rewrite/definition rules for that function listed in the
fgl-definition-rules table, then try rewriting the call using those
rules.
- Finally, if none of the above were successful, produce the object
(g-apply fnname args).
This completes the overview of how a term is interpreted and turned into
either a symbolic object (fgl-object) or Boolean formula. Next we
describe three subroutines that we skipped describing above:
fgl-rewrite-try-rule, which attempts to apply a rewrite rule;
fgl-interp-simplify-if-test, which coerces a symbolic object into a Boolean
formula; and fgl-interp-merge-branches, which merges two branches of an IF
test. This will also lead us to discuss fgl-interp-add-constraints, which
adds Boolean constraints according to a set of rules activated when introducing
a new Boolean variable representing some term.
Applying Rewrite Rules
fgl-rewrite-try-rule takes a rule, a function name, and a list of
arguments which are FGL symbolic objects. The rule may be a rewrite, meta, or
primitive rule. For a primitive rule, the primitive function indicated by the
rule is run on the function and arguments, and we return the object it produces
if successful. For a meta rule, the metafunction indicated is run, and if
successful we recursively interpret the term returned under the bindings
returned. For a rewrite rule, we first try to unify the LHS of the rule with
the input function/arguments. If successful, we then try to relieve the hyps
of the rule by calling fgl-interp-test on each one and checking that the
result is (syntactically) constant-true. We also check syntaxp and
bind-free hyps, the latter of which might extend the unifying substitution
with some free variable bindings.
If the hypotheses are all relieved, then we recurs on the conclusion using
fgl-interp-term and return the result unless there were errors recorded in
the interpreter state. Errors are passed up to the caller except for the
special error produced by abort-rewrite, which only causes the current
rule application to fail.
During this process, various helper functions are also called to support
tracing (see fgl-rewrite-tracing) and accumulated-persistence-style
profiling, but none of these are logically relevant.
Application of binder rules is similar, but with slightly different logical
contracts for the rules and very slightly different behavior. For binder
rules, we unify the argument list with the argument list of the LHS of the
rule, omitting the initial argument of the LHS which must be the variable to be
bound. Binder rules also specify the equivalence context under which the RHS
must be rewritten.
Simplifying IF Tests
fgl-interp-simplify-if-test takes a symbolic object and attempts to
reduce it to an IFF-equivalent Boolean formula. For some varieties of symbolic
object, this is trivial: :g-concrete objects' truth value is just the
truth value of the quoted value, g-integer and g-cons objects are
always nonnil, g-map objects are nonnil unless their alist field is
exactly NIL, and g-boolean objects' truth values are given by their
Boolean formulas. This leaves g-var, g-ite, and g-apply objects.
For g-ite objects, we coerce the test sub-object into a Boolean
formula using fgl-interp-simplify-if-test recursively. Then, similar to
symbolic interpretation of IF, we recur on the then object unless the test
formula is syntactically falsified by the path condition, we recur on the
else branch unless the test formula is syntactically true under the path
condition, and we then merge the branches by creating the if-then-else of the
Boolean formulas if both branches were run.
For g-var objects, we assign a fresh Boolean variable to the object,
storing the association in the Boolean variable database (bvar-db) of the
interpreter state, unless the object already has an associated Boolean
variable, in which case we return it.
For g-apply objects, we first rewrite the function call under an IFF
context. In many cases this is redundant, but in some cases it may produce
reductions. If rewriting is successful, we recursively apply
fgl-simplify-if-test to the result. Otherwise, we look up the function
call object in the bvar-db and return the associated Boolean variable, if
any, or else introduce a fresh one and record that association. Finally, if a
new Boolean variable was introduced, we process the object with
fgl-interp-add-constraints (see below) to record any constraints on the new
Boolean variable.
Merging IF Branches
fgl-interp-merge-branches takes a Boolean formula for an IF test and
symbolic objects for the then and else branch values, and returns a new
symbolic object encapsulating the if-then-else.
It first checks for trivial cases -- test constant true, test constant false,
or branches equal -- and returns the obvious results in those cases.
Otherwise, if either branch is a function call (g-apply) object, then it
tries applying branch merge rules for those functions using
fgl-rewrite-try-rule applied to the IF. If any of these are successful, it
returns the result.
Otherwise, if both branches are calls of the same function, it recursively
merges the argument lists and returns the function applied to the merged
arguments. Otherwise, it calls helper function
interp-st-fgl-object-basic-merge, which merges basic symbolic objects
together when their types match, otherwise either producing an
if-then-else (g-ite) object or an error, depending on the configuration
setting of make-ites. (See also fgl-handling-if-then-elses.)
Introducing Boolean Constraints
To do.