Bdd-algorithm
Summary of the BDD algorithm in ACL2
The BDD algorithm in ACL2 uses a combination of manipulation of
IF terms and unconditional rewriting. In this discussion we begin with
some relevant mathematical theory. This is followed by a description of how
ACL2 does BDDs, including concluding discussions of soundness, completeness,
and efficiency.
We recommend that you read the other documentation about BDDs in ACL2
before reading the rather technical material that follows. See bdd.
Here is an outline of our presentation. Readers who want a user
perspective, without undue mathematical theory, may wish to skip to Part (B),
referring to Part (A) only on occasion if necessary.
(A) Mathematical Considerations
(A1) BDD term order
(A2) BDD-constructors and BDD terms, and their connection with aborting the
BDD algorithm
(A3) Canonical BDD terms
(A4) A theorem stating the equivalence of provable and syntactic equality
for canonical BDD terms
(B) Algorithmic Considerations
(B1) BDD rules (rules used by the rewriting portion of the ACL2 BDD
algorithm)
(B2) Terms ``known to be Boolean''
(B3) An ``IF-lifting'' operation used by the algorithm, as well as an
iterative version of that operation
(B4) The ACL2 BDD algorithm
(B5) Soundness and Completeness of the ACL2 BDD algorithm
(B6) Efficiency considerations
(A) Mathematical Considerations
(A1) BDD term order
Our BDD algorithm creates a total ``BDD term order'' on ACL2 terms, on the
fly. We use this order in our discussions below of IF-lifting and of
canonical BDD terms, and in the algorithm's use of commutativity. The
particular order is unimportant, except that we guarantee (for purposes of
commutative functions) that constants are smaller in this order than
non-constants.
(A2) BDD-constructors (assumed to be '(cons)) and BDD
terms
We take as given a list of function symbols that we call the
``BDD-constructors.'' By default, the only BDD-constructor is cons,
although it is legal to specify any list of function symbols as the
BDD-constructors, either by using the ACL2-defaults-table (see ACL2-defaults-table) or by supplying a :BDD-CONSTRUCTORS hint (see hints). Warning: this capability is largely untested and may produce
undesirable results. Henceforth, except when explicitly stated to the
contrary, we assume that BDD-constructors is '(cons).
Roughly speaking, a bdd term is the sort of term produced by
our BDD algorithm, namely a tree with all cons nodes lying above all
non-CONS nodes. More formally, a term is said to be a bdd
term if it contains no subterm of either of the following forms, where
f is not CONS.
(f ... (CONS ...) ...)
(f ... 'x ...) ; where (consp x) = t
We will see that whenever the BDD algorithm attempts to create a term that is not a bdd term, it aborts instead. Thus, whenever the
algorithm completes without aborting, it creates a bdd term.
(A3) Canonical BDD terms
We can strengthen the notion of ``BDD term'' to a notion of ``canonical BDD
term'' by imposing the following additional requirements, for every subterm of
the form (IF x y z):
(a) x is a variable, and it precedes (in the BDD term
order) every variable occurring in y or z;
(b) y and z are syntactically distinct; and,
(c) it is not the case that y is t and z is
nil.
We claim that it follows easily from our description of the BDD algorithm
that every term it creates is a canonical BDD term, assuming that the
variables occurring in all such terms are treated by the algorithm as being
Boolean (see (B2) below) and that the terms contain no function symbols other
than IF and CONS. Thus, under those assumptions the following
theorem shows that the BDD algorithm never creates distinct terms that are
provably equal, a property that is useful for completeness and efficiency (as
we explain in (B5) and (B6) below).
(A4) Provably equal canonical BDD terms are identical
We believe that the following theorem and proof are routine extensions of a
standard result and proof to terms that allow calls of CONS.
Theorem. Suppose that t1 and t2 are canonical BDD terms
that contain no function symbols other than IF and CONS. Also
suppose that (EQUAL t1 t2) is a theorem. Then t1 and t2 are
syntactically identical.
Proof of theorem: By induction on the total number of symbols occurring in
these two terms. First suppose that at least one term is a variable; without
loss of generality let it be t1. We must prove that t2 is
syntactically the same as t1. Now it is clearly consistent that
(EQUAL t1 t2) is false if t2 is a call of CONS (to see this,
simply let t1 be an value that is not a CONSP). Similarly, t2
cannot be a constant or a variable other than t1. The remaining
possibility to rule out is that t2 is of the form (IF t3 t4 t5),
since by assumption its function symbol must be IF or CONS and we
have already handled the latter case. Since t2 is canonical, we know
that t3 is a variable. Since (EQUAL t1 t2) is provable, i.e.,
(EQUAL t1 (if t3 t4 t5))
is provable, it follows that we may substitute either t or nil
for t3 into this equality to obtain two new provable equalities. First,
suppose that t1 and t3 are distinct variables. Then these
substitutions show that t1 is provably equal to both t4 and t5
(since t3 does not occur in t4 or t5 by property (a) above, as
t2 is canonical), and hence t4 and t5 are provably equal to
each other, which implies by the inductive hypothesis that they are the same
term — and this contradicts the assumption that t2 is canonical
(property (b)). Therefore t1 and t3 are the same variable, i.e.,
the equality displayed above is actually (EQUAL t1 (if t1 t4 t5)).
Substituting t and then nil for t1 into this provable equality
lets us prove (EQUAL t t4) and (EQUAL nil t5), which by the
inductive hypothesis implies that t4 is (syntactically) the term t
and t5 is nil. That is, t2 is (IF t1 t nil), which
contradicts the assumption that t2 is canonical (property (c)).
Next, suppose that at least one term is a call of IF. Our first
observation is that the other term is also a call of IF. For if the
other is a call of CONS, then they cannot be provably equal, because the
former has no function symbols other than IF and hence is Boolean when
all its variables are assigned Boolean values. Also, if the other is a
constant, then both branches of the IF term are provably equal to that
constant and hence these branches are syntactically identical by the inductive
hypothesis, contradicting property (b). Hence, we may assume for this case
that both terms are calls of IF; let us write them as follows.
t0: (IF t1 t2 t3)
u0: (IF u1 u2 u3)
Note that t1 and u1 are variables, by property (a) of canonical
BDD terms. First we claim that t1 does not strictly precede u1 in
the BDD term order. For suppose t1 does strictly precede u1. Then
property (a) of canonical BDD terms guarantees that t1 does not occur in
u0. Hence, an argument much like one used above shows that u0 is
provably equal to both t2 (substituting t for t1) and t3
(substituting nil for t1), and hence t2 and t3 are
provably equal. That implies that they are identical terms, by the inductive
hypothesis, which then contradicts property (b) for t0. Similarly,
u1 does not strictly precede t1 in the BDD term order. Therefore,
t1 and u1 are the same variable. By substituting t for this
variable we see that t2 and u2 are provably equal, and hence they
are equal by the inductive hypothesis. Similarly, by substituting nil
for t1 (and u1) we see that t3 and u3 are provably, hence
syntactically, equal.
We have covered all cases in which at least one term is a variable or at
least one term is a call of IF. If both terms are constants, then
provable and syntactic equality are clearly equivalent. Finally, then, we may
assume that one term is a call of CONS and the other is a constant or a
call of CONS. The constant case is similar to the CONS case if the
constant is a CONSP, so we omit it; while if the constant is not a
CONSP then it is not provably equal to a call of CONS; in fact it is
provably not equal!
So, we are left with a final case, in which canonical BDD terms (CONS t1
t2) and (CONS u1 u2) are provably equal, and we want to show that
t1 and u1 are syntactically equal as are t2 and u2. These
conclusions are easy consequences of the inductive hypothesis, since the ACL2
axiom CONS-EQUAL (which you can inspect using :pe) shows
that equality of the given terms is equivalent to the conjunction of (EQUAL
t1 t2) and (EQUAL u1 u2). Q.E.D.
(B) Algorithmic Considerations
(B1) BDD rules
A rule of class :rewrite (see rule-classes) is said to
be a ``bdd rewrite rule'' if and only if it satisfies the following
criteria. (1) The rule is enabled. (2) Its equivalence
relation is equal. (3) It has no hypotheses. (4) Its :loop-stopper field is nil, i.e., it is not a permutative rule. (5) All
variables occurring in the rule occur in its left-hand side (i.e., there are
no ``free variables''; see rewrite). A rule of class :definition (see rule-classes) is said to be a ``bdd definition
rule'' if it satisfies all the criteria above (except (4), which does not
apply), and moreover the top function symbol of the left-hand side was not
recursively (or mutually recursively) defined. Technical point: Note that
this additional criterion is independent of whether or not the indicated
function symbol actually occurs in the right-hand side of the rule.
Both BDD rewrite rules and BDD definition rules are said to be ``BDD
rules.''
(B2) Terms ''known to be Boolean''
We apply the BDD algorithm in the context of a top-level goal to prove,
namely, the goal at which the :BDD hint is attached. As we run the BDD
algorithm, we allow ourselves to say that a set of terms is ``known to
be Boolean'' if we can verify that the goal is provable from the assumption
that at least one of the terms is not Boolean. Equivalently, we allow
ourselves to say that a set of terms is ``known to be Boolean'' if we can
verify that the original goal is provably equivalent to the assertion that if
all terms in the set are Boolean, then the goal holds. The notion ``known to
be Boolean'' is conservative in the sense that there are generally sets of
terms for which the above equivalent criteria hold and yet the sets of terms
are not noted as as being ``known to be Boolean.'' However, ACL2 uses a
number of tricks, including type-reasoning and analysis of the
structure of the top-level goal, to attempt to establish that a sufficiently
inclusive set of terms is known to be Boolean.
From a practical standpoint, the algorithm determines a set of terms known
to be Boolean; we allow ourselves to say that each term in this set is ``known
to be Boolean.'' The algorithm assumes that these terms are indeed Boolean,
and can make use of that assumption. For example, if t1 is known to be
Boolean then the algorithm simplifies (IF t1 t nil) to t1; see (iv)
in the discussion immediately below.
(B3) IF-lifting and the IF-lifting-for-IF loop
Suppose that one has a term of the form (f ... (IF test x y)
...), where f is a function symbol other than CONS. Then we say
that ``IF-lifting'' test ``from'' this term produces the following term,
which is provably equal to the given term.
(if test
(f ... x ...) ; resulting true branch
(f ... y ...)) ; resulting false branch
Here, we replace each argument of f of the form (IF test .. ..),
for the same test, in the same way. In this case we say that
``IF-lifting applies to'' the given term, ``yielding the test'' test and
with the ``resulting two branches'' displayed above. Whenever we apply
IF-lifting, we do so for the available test that is least in the BDD term
order (see (A1) above).
We consider arguments v of f that are ``known to be Boolean''
(see above) to be replaced by (IF v t nil) for the purposes of
IF-lifting, i.e., before IF-lifting is applied.
There is one special case, however, for IF-lifting. Suppose that the given
term is of the form (IF v y z) where v is a variable and is the test
to be lifted out (i.e., it is least in the BDD term order among the potential
tests). Moreover, suppose that neither y nor z is of the form
(IF v W1 W2) for that same v. Then IF-lifting does not apply to the
given term.
We may now describe the IF-lifting-for-IF loop, which applies to terms of
the form (IF test tbr fbr) where the algorithm has already produced
test, tbr, and fbr. First, if test is nil then we
return fbr, while if test is a non-nil constant or a call of
CONS then we return tbr. Otherwise, we see if IF-lifting applies.
If IF-lifting does not apply, then we return (IF test tbr fbr).
Otherwise, we apply IF-lifting to obtain a term of the form (IF x y z),
by lifting out the appropriate test. Now we recursively apply the
IF-lifting-for-IF loop to the term (IF x y z), unless any of the
following special cases apply.
(i) If y and z are the same term, then return
y.
(ii) Otherwise, if x and z are the same term, then replace z
by nil before recursively applying IF-lifting-for-IF.
(iii) Otherwise, if x and y are the same term and y is known
to be Boolean, then replace y by t before recursively applying
IF-lifting-for-IF.
(iv) If z is nil and either x and y are the same term
or x is ``known to be Boolean'' and y is t, then return
x.
NOTE: When a variable x is known to be Boolean, it is easy to see that
the form (IF x t nil) is always reduced to x by this algorithm.
(B4) The ACL2 BDD algorithm
We are now ready to present the BDD algorithm for ACL2. It is given an
ACL2 term, x, as well as an association list va that maps
variables to terms, including all variables occurring in x. We maintain
the invariant that whenever a variable is mapped by va to a term, that
term has already been constructed by the algorithm, except: initially va
maps every variable occurring in the top-level term to itself. The algorithm
proceeds as follows. We implicitly ordain that whenever the BDD algorithm
attempts to create a term that is not a bdd term (as defined
above in (A2)), it aborts instead. Thus, whenever the algorithm completes
without aborting, it creates a bdd term.
If x is a variable, return the result of looking it up in va.
If x is a constant, return x.
If x is of the form (IF test tbr fbr), then first run the
algorithm on test with the given va to obtain test'. If
test' is nil, then return the result fbr' of running the
algorithm on fbr with the given va. If test' is a constant
other than nil, or is a call of CONS, then return the result
tbr' of running the algorithm on tbr with the given va. If
tbr is identical to fbr, return tbr. Otherwise, return the
result of applying the IF-lifting-for-IF loop (described above) to the term
(IF test' tbr' fbr').
If x is of the form (IF* test tbr fbr), then compute the result
exactly as though if were used rather than if*, except that if
test' is not a constant or a call of CONS (see paragraph above),
then abort the BDD computation. Informally, the tests of if* terms
are expected to ``resolve.'' NOTE: This description shows how if* can
be used to implement conditional rewriting in the BDD algorithm.
If x is a LAMBDA expression ((LAMBDA vars body) . args)
(which often corresponds to a let term; see let), then first
form an alist va' by binding each v in vars to the result of
running the algorithm on the corresponding member of args, with the
current alist va. Then, return the result of the algorithm on body
in the alist va'.
Otherwise, x is of the form (f x1 x2 ... xn), where f is a
function symbol other than if or if*. In that case, let
xi' be the result of running the algorithm on xi, for i from 1
to n, using the given alist va. First there are a few special
cases. If f is equal then we return t if x1' is
syntactically identical to x2' (where this test is very fast; see (B6)
below); we return x1' if it is known to be Boolean and x2' is
t; and similarly, we return x2' if it is known to be Boolean and
x1' is t. Next, if each xi' is a constant and the :executable-counterpart of f is enabled, then the result is obtained by
computation. Next, if f is booleanp and x1' is known to be
Boolean, t is returned. Otherwise, we proceed as follows, first possibly
swapping the arguments if they are out of (the BDD term) order and if f
is known to be commutative (see below). If a BDD rewrite rule (as defined
above) matches the term (f x1'... xn'), then the most recently stored
such rule is applied. If there is no such match and f is a
BDD-constructor, then we return (f x1'... xn'). Otherwise, if a BDD
definition rule matches this term, then the most recently stored such rule
(which will usually be the original definition for most users) is applied. If
none of the above applies and neither does IF-lifting, then we return (f
x1'... xn'). Otherwise we apply IF-lifting to (f x1'... xn') to
obtain a term (IF test tbr fbr); but we aren't done yet. Rather, we run
the BDD algorithm (using the same alist) on tbr and fbr to obtain
terms tbr' and fbr', and we return (IF test tbr' fbr')
unless tbr' is syntactically identical to fbr', in which case we
return tbr'.
When is it the case that, as said above, ``f is known to be
commutative''? This happens when an enabled rewrite rule is of the form
(EQUAL (f X Y) (f Y X)). Regarding swapping the arguments in that case:
recall that we may assume very little about the BDD term order, essentially
only that we swap the two arguments when the second is a constant and the
first is not, for example, in (+ x 1). Other than that situation, one
cannot expect to predict accurately when the arguments of commutative
operators will be swapped.
(B5) Soundness and Completeness of the ACL2 BDD algorithm
Roughly speaking, ``soundness'' means that the BDD algorithm should give
correct answers, and ``completeness'' means that it should be powerful enough
to prove all true facts. Let us make the soundness claim a little more
precise, and then we'll address completeness under suitable hypotheses.
Claim (Soundness). If the ACL2 BDD algorithm runs to completion on
an input term t0, then it produces a result that is provably equal to
t0.
We leave the proof of this claim to the reader. The basic idea is simply
to check that each step of the algorithm preserves the meaning of the term
under the bindings in the given alist.
Let us start our discussion of completeness by recalling the theorem proved
above in (A4).
Theorem. Suppose that t1 and t2 are canonical BDD terms
that contain no function symbols other than IF and CONS. Also
suppose that (EQUAL t1 t2) is a theorem. Then t1 and t2 are
syntactically identical.
Below we show how this theorem implies the following completeness property
of the ACL2 BDD algorithm. We continue to assume that CONS is the only
BDD-constructor.
Claim (Completeness). Suppose that t1 and t2 are provably
equal terms, under the assumption that all their variables are known to be
Boolean. Assume further that under this same assumption, top-level runs of
the ACL2 BDD algorithm on these terms return terms that contain only the
function symbols IF and CONS. Then the algorithm returns the same
term for both t1 and t2, and the algorithm reduces (EQUAL t1
t2) to t.
Why is this claim true? First, notice that the second part of the
conclusion follows immediately from the first, by definition of the algorithm.
Next, notice that the terms u1 and u2 obtained by running the
algorithm on t1 and t2, respectively, are provably equal to t1
and t2, respectively, by the Soundness Claim. It follows that u1
and u2 are provably equal to each other. Since these terms contain no
function symbols other than IF or CONS, by hypothesis, the Claim now
follows from the Theorem above together with the following lemma.
Lemma. Suppose that the result of running the ACL2 BDD algorithm on
a top-level term t0 is a term u0 that contains only the function
symbols IF and CONS, where all variables of t0 are known to be
Boolean. Then u0 is a canonical BDD term.
Proof: left to the reader. Simply follow the definition of the algorithm,
with a separate argument for the IF-lifting-for-IF loop.
Finally, let us remark on the assumptions of the Completeness Claim above.
The assumption that all variables are known to be Boolean is often true; in
fact, the system uses the forward-chaining rule boolean-listp-forward
(you can see it using :pe) to try to establish this assumption,
if your theorem has a form such as the following.
(let ((x (list x0 x1 ...))
(y (list y0 y1 ...)))
(implies (and (boolean-listp x)
(boolean-listp y))
...))
Moreover, the :BDD hint can be used to force the prover to abort if it
cannot check that the indicated variables are known to be Boolean; see hints.
Finally, consider the effect in practice of the assumption that the terms
resulting from application of the algorithm contain calls of IF and
CONS only. Typical use of BDDs in ACL2 takes place in a theory (see
theories) in which all relevant non-recursive function symbols are
enabled and all recursive function symbols possess enabled BDD rewrite rules
that tell them how open up. For example, such a rule may say how to expand on
a given function call's argument that has the form (CONS a x), while
another may say how to expand when that argument is nil). (See for
example the rules append-cons and append-nil in the documentation
for if*.) We leave it to future work to formulate a theorem that
guarantees that the BDD algorithm produces terms containing calls only of
IF and CONS assuming a suitably ``complete'' collection of rewrite
rules.
(B6) Efficiency considerations
Following Bryant's algorithm, we use a graph representation of terms
created by the BDD algorithm's computation. This representation enjoys some
important properties.
(Time efficiency) The test for syntactic equality of BDD terms is very
fast.
(Space efficiency) Equal BDD data structures are stored identically in
memory.
Implementation note. The representation actually uses a sort of hash
table for BDD terms that is implemented as an ACL2 1-dimensional array. See
arrays. In addition, we use a second such hash table to avoid
recomputing the result of applying a function symbol to the result of running
the algorithm on its arguments. We believe that these uses of hash tables are
standard. They are also discussed in Moore's paper on BDDs; see bdd
for the reference.