Major Section: BDD
The BDD algorithm in ACL2 uses a combination of manipulation of
IF
terms and unconditional rewriting. In this discussion we
begin with some relevant mathematical theory. This is followed by a
description of how ACL2 does BDDs, including concluding discussions
of soundness, completeness, and efficiency.
We recommend that you read the other documentation about BDDs in ACL2 before reading the rather technical material that follows. See BDD.
Here is an outline of our presentation. Readers who want a user perspective, without undue mathematical theory, may wish to skip to Part (B), referring to Part (A) only on occasion if necessary.
(A) Mathematical Considerations
(A1) BDD term order
(A2) BDD-constructors and BDD terms, and their connection with aborting the BDD algorithm
(A3) Canonical BDD terms
(A4) A theorem stating the equivalence of provable and syntactic equality for canonical BDD terms
(B) Algorithmic Considerations
(B1) BDD rules (rules used by the rewriting portion of the ACL2 BDD algorithm)
(B2) Terms ``known to be Boolean''
(B3) An ``IF-lifting'' operation used by the algorithm, as well as an iterative version of that operation
(B4) The ACL2 BDD algorithm
(B5) Soundness and Completeness of the ACL2 BDD algorithm
(B6) Efficiency considerations
(A) Mathematical Considerations
(A1) BDD term order
Our BDD algorithm creates a total ``BDD term order'' on ACL2 terms, on the fly. We use this order in our discussions below of IF-lifting and of canonical BDD terms, and in the algorithm's use of commutativity. The particular order is unimportant, except that we guarantee (for purposes of commutative functions) that constants are smaller in this order than non-constants.
(A2) BDD-constructors (assumed to be '(cons)
) and BDD terms
We take as given a list of function symbols that we call the
``BDD-constructors.'' By default, the only BDD-constructor is
cons
, although it is legal to specify any list of function
symbols as the BDD-constructors, either by using the
acl2-defaults-table (see acl2-defaults-table) or by
supplying a :BDD-CONSTRUCTORS
hint (see hints). Warning:
this capability is largely untested and may produce undesirable
results. Henceforth, except when explicitly stated to the contrary,
we assume that BDD-constructors is '(cons)
.
Roughly speaking, a BDD term is the sort of term produced
by our BDD algorithm, namely a tree with all cons
nodes lying
above all non-CONS
nodes. More formally, a term is said to
be a BDD term if it contains no subterm of either of the
following forms, where f
is not CONS
.
(f ... (CONS ...) ...) (f ... 'x ...) ; where (consp x) = t
We will see that whenever the BDD algorithm attempts to create a term that is not a BDD term, it aborts instead. Thus, whenever the algorithm completes without aborting, it creates a BDD term.
(A3) Canonical BDD terms
We can strengthen the notion of ``BDD term'' to a notion of
``canonical BDD term'' by imposing the following additional
requirements, for every subterm of the form (IF x y z)
:
(a)
x
is a variable, and it precedes (in the BDD term order) every variable occurring iny
orz
;(b)
y
andz
are syntactically distinct; and,(c) it is not the case that
y
ist
andz
isnil
.
We claim that it follows easily from our description of the BDD
algorithm that every term it creates is a canonical BDD term,
assuming that the variables occurring in all such terms are treated
by the algorithm as being Boolean (see (B2) below) and that the
terms contain no function symbols other than IF
and CONS
.
Thus, under those assumptions the following theorem shows that the
BDD algorithm never creates distinct terms that are provably equal,
a property that is useful for completeness and efficiency (as we
explain in (B5) and (B6) below).
(A4) Provably equal canonical BDD terms are identical
We believe that the following theorem and proof are routine
extensions of a standard result and proof to terms that allow calls
of CONS
.
Theorem. Suppose that t1
and t2
are canonical BDD terms
that contain no function symbols other than IF
and CONS
. Also
suppose that (EQUAL t1 t2)
is a theorem. Then t1
and t2
are syntactically identical.
Proof of theorem: By induction on the total number of symbols
occurring in these two terms. First suppose that at least one term
is a variable; without loss of generality let it be t1
. We must
prove that t2
is syntactically the same as t1
. Now it is
clearly consistent that (EQUAL t1 t2)
is false if t2
is a call
of CONS
(to see this, simply let t1
be an value that is not a
CONSP
). Similarly, t2
cannot be a constant or a variable
other than t1
. The remaining possibility to rule out is that
t2
is of the form (IF t3 t4 t5)
, since by assumption its
function symbol must be IF
or CONS
and we have already handled
the latter case. Since t2
is canonical, we know that t3
is a
variable. Since (EQUAL t1 t2)
is provable, i.e.,
(EQUAL t1 (if t3 t4 t5))is provable, it follows that we may substitute either
t
or nil
for t3
into this equality to obtain two new provable equalities.
First, suppose that t1
and t3
are distinct variables. Then
these substitutions show that t1
is provably equal to both t4
and t5
(since t3
does not occur in t4
or t5
by property
(a) above, as t2
is canonical), and hence t4
and t5
are
provably equal to each other, which implies by the inductive
hypothesis that they are the same term -- and this contradicts the
assumption that t2
is canonical (property (b)). Therefore t1
and t3
are the same variable, i.e., the equality displayed above
is actually (EQUAL t1 (if t1 t4 t5))
. Substituting t
and then
nil
for t1
into this provable equality lets us prove (EQUAL t t4)
and (EQUAL nil t5)
, which by the inductive hypothesis implies
that t4
is (syntactically) the term t
and t5
is nil
.
That is, t2
is (IF t1 t nil)
, which contradicts the assumption
that t2
is canonical (property (c)).Next, suppose that at least one term is a call of IF
. Our first
observation is that the other term is also a call of IF
. For if
the other is a call of CONS
, then they cannot be provably equal,
because the former has no function symbols other than IF
and
hence is Boolean when all its variables are assigned Boolean values.
Also, if the other is a constant, then both branches of the IF
term are provably equal to that constant and hence these branches
are syntactically identical by the inductive hypothesis,
contradicting property (b). Hence, we may assume for this case that
both terms are calls of IF
; let us write them as follows.
t0: (IF t1 t2 t3) u0: (IF u1 u2 u3)Note that
t1
and u1
are variables, by property (a) of
canonical BDD terms. First we claim that t1
does not strictly
precede u1
in the BDD term order. For suppose t1
does
strictly precede u1
. Then property (a) of canonical BDD terms
guarantees that t1
does not occur in u0
. Hence, an argument
much like one used above shows that u0
is provably equal to both
t2
(substituting t
for t1
) and t3
(substituting nil
for t1
), and hence t2
and t3
are provably equal. That
implies that they are identical terms, by the inductive hypothesis,
which then contradicts property (b) for t0
. Similarly, u1
does not strictly precede t1
in the BDD term order. Therefore,
t1
and u1
are the same variable. By substituting t
for
this variable we see that t2
and u2
are provably equal, and
hence they are equal by the inductive hypothesis. Similarly, by
substituting nil
for t1
(and u1
) we see that t3
and
u3
are provably, hence syntactically, equal.We have covered all cases in which at least one term is a variable
or at least one term is a call of IF
. If both terms are
constants, then provable and syntactic equality are clearly
equivalent. Finally, then, we may assume that one term is a call of
CONS
and the other is a constant or a call of CONS
. The
constant case is similar to the CONS
case if the constant is a
CONSP
, so we omit it; while if the constant is not a CONSP
then it is not provably equal to a call of CONS
; in fact it is
provably not equal!
So, we are left with a final case, in which canonical BDD terms
(CONS t1 t2)
and (CONS u1 u2)
are provably equal, and we want
to show that t1
and u1
are syntactically equal as are t2
and u2
. These conclusions are easy consequences of the inductive
hypothesis, since the ACL2 axiom CONS-EQUAL
(which you can
inspect using :
PE
) shows that equality of the given terms is
equivalent to the conjunction of (EQUAL t1 t2)
and (EQUAL u1 u2)
.
Q.E.D.
(B) Algorithmic Considerations
(B1) BDD rules
A rule of class :
rewrite
(see rule-classes) is said to be
a ``BDD rewrite rule'' if and only if it satisfies the
following criteria. (1) The rule is enabled. (2) Its
equivalence relation is equal
. (3) It has no hypotheses.
(4) Its :
loop-stopper
field is nil
, i.e., it is not a
permutative rule. (5) All variables occurring in the rule occur in
its left-hand side (i.e., there are no ``free variables'';
see rewrite). A rule of class :
definition
(see rule-classes) is said to be a ``BDD definition rule''
if it satisfies all the criteria above (except (4), which does not
apply), and moreover the top function symbol of the left-hand side
was not recursively (or mutually recursively) defined. Technical
point: Note that this additional criterion is independent of
whether or not the indicated function symbol actually occurs in the
right-hand side of the rule.
Both BDD rewrite rules and BDD definition rules are said to be ``BDD rules.''
(B2) Terms ''known to be Boolean''
We apply the BDD algorithm in the context of a top-level goal to
prove, namely, the goal at which the :BDD
hint is attached. As
we run the BDD algorithm, we allow ourselves to say that a set of
terms is ``known to be Boolean'' if we can verify that the goal
is provable from the assumption that at least one of the terms is
not Boolean. Equivalently, we allow ourselves to say that a set of
terms is ``known to be Boolean'' if we can verify that the original
goal is provably equivalent to the assertion that if all terms in
the set are Boolean, then the goal holds. The notion ``known to be
Boolean'' is conservative in the sense that there are generally sets
of terms for which the above equivalent criteria hold and yet the
sets of terms are not noted as as being ``known to be Boolean.''
However, ACL2 uses a number of tricks, including type-set
reasoning and analysis of the structure of the top-level goal, to
attempt to establish that a sufficiently inclusive set of terms is
known to be Boolean.
From a practical standpoint, the algorithm determines a set of terms
known to be Boolean; we allow ourselves to say that each term in
this set is ``known to be Boolean.'' The algorithm assumes that
these terms are indeed Boolean, and can make use of that assumption.
For example, if t1
is known to be Boolean then the algorithm
simplifies (IF t1 t nil)
to t1
; see (iv) in the discussion
immediately below.
(B3) IF-lifting and the IF-lifting-for-IF loop
Suppose that one has a term of the form (f ... (IF test x y) ...)
,
where f
is a function symbol other than CONS
. Then we say
that ``IF-lifting'' test
``from'' this term produces the
following term, which is provably equal to the given term.
(if test (f ... x ...) ; resulting true branch (f ... y ...)) ; resulting false branchHere, we replace each argument of
f
of the form (IF test .. ..)
,
for the same test
, in the same way. In this case we say that
``IF-lifting applies to'' the given term, ``yielding the test''
test
and with the ``resulting two branches'' displayed above.
Whenever we apply IF-lifting, we do so for the available test
that is least in the BDD term order (see (A1) above).We consider arguments v
of f
that are ``known to be Boolean''
(see above) to be replaced by (IF v t nil)
for the purposes of
IF-lifting, i.e., before IF-lifting is applied.
There is one special case, however, for IF-lifting. Suppose that
the given term is of the form (IF v y z)
where v
is a variable
and is the test to be lifted out (i.e., it is least in the BDD term
order among the potential tests). Moroever, suppose that neither
y
nor z
is of the form (IF v W1 W2)
for that same v
.
Then IF-lifting does not apply to the given term.
We may now describe the IF-lifting-for-IF loop, which applies to
terms of the form (IF test tbr fbr)
where the algorithm has
already produced test
, tbr
, and fbr
. First, if test
is
nil
then we return fbr
, while if test
is a non-nil
constant or a call of CONS
then we return tbr
. Otherwise, we
see if IF-lifting applies. If IF-lifting does not apply, then we
return (IF test tbr fbr)
. Otherwise, we apply IF-lifting to
obtain a term of the form (IF x y z)
, by lifting out the
appropriate test. Now we recursively apply the IF-lifting-for-IF
loop to the term (IF x y z)
, unless any of the following special
cases apply.
(i) If
y
andz
are the same term, then returny
.(ii) Otherwise, if
x
andz
are the same term, then replacez
bynil
before recursively applying IF-lifting-for-IF.(iii) Otherwise, if
x
andy
are the same term andy
is known to be Boolean, then replacey
byt
before recursively applying IF-lifting-for-IF.(iv) If
z
isnil
and eitherx
andy
are the same term orx
is ``known to be Boolean'' andy
ist
, then returnx
.
NOTE: When a variable x
is known to be Boolean, it is easy to
see that the form (IF x t nil)
is always reduced to x
by this
algorithm.
(B4) The ACL2 BDD algorithm
We are now ready to present the BDD algorithm for ACL2. It is given
an ACL2 term, x
, as well as an association list va
that
maps variables to terms, including all variables occurring in x
.
We maintain the invariant that whenever a variable is mapped by
va
to a term, that term has already been constructed by the
algorithm, except: initially va
maps every variable occurring in
the top-level term to itself. The algorithm proceeds as follows.
We implicitly ordain that whenever the BDD algorithm attempts to
create a term that is not a BDD term (as defined above in
(A2)), it aborts instead. Thus, whenever the algorithm completes
without aborting, it creates a BDD term.
If
x
is a variable, return the result of looking it up inva
.If
x
is a constant, returnx
.If
x
is of the form(IF test tbr fbr)
, then first run the algorithm ontest
with the givenva
to obtaintest'
. Iftest'
isnil
, then return the resultfbr'
of running the algorithm onfbr
with the givenva
. Iftest'
is a constant other thannil
, or is a call ofCONS
, then return the resulttbr'
of running the algorithm ontbr
with the givenva
. Iftbr
is identical tofbr
, returntbr
. Otherwise, return the result of applying the IF-lifting-for-IF loop (described above) to the term(IF test' tbr' fbr')
.If
x
is of the form(IF* test tbr fbr)
, then compute the result exactly as thoughIF
were used rather thanIF*
, except that iftest'
is not a constant or a call ofCONS
(see paragraph above), then abort the BDD computation. Informally, the tests ofIF*
terms are expected to ``resolve.'' NOTE: This description shows howIF*
can be used to implement conditional rewriting in the BDD algorithm.If
x
is aLAMBDA
expression((LAMBDA vars body) . args)
(which often corresponds to aLET
term; see let), then first form an alistva'
by binding eachv
invars
to the result of running the algorithm on the corresponding member ofargs
, with the current alistva
. Then, return the result of the algorithm onbody
in the alistva'
.Otherwise,
x
is of the form(f x1 x2 ... xn)
, wheref
is a function symbol other thanIF
orIF*
. In that case, letxi'
be the result of running the algorithm onxi
, fori
from 1 ton
, using the given alistva
. First there are a few special cases. Iff
isEQUAL
then we returnt
ifx1'
is syntactically identical tox2'
(where this test is very fast; see (B6) below); we returnx1'
if it is known to be Boolean andx2'
ist
; and similarly, we returnx2'
if it is known to be Boolean andx1'
ist
. Next, if eachxi'
is a constant and the:
executable-counterpart
off
is enabled, then the result is obtained by computation. Next, iff
isBOOLEANP
andx1'
is known to be Boolean,t
is returned. Otherwise, we proceed as follows, first possibly swapping the arguments if they are out of (the BDD term) order and iff
is known to be commutative (see below). If a BDD rewrite rule (as defined above) matches the term(f x1'... xn')
, then the most recently stored such rule is applied. If there is no such match andf
is a BDD-constructor, then we return(f x1'... xn')
. Otherwise, if a BDD definition rule matches this term, then the most recently stored such rule (which will usually be the original definition for most users) is applied. If none of the above applies and neither does IF-lifting, then we return(f x1'... xn')
. Otherwise we apply IF-lifting to(f x1'... xn')
to obtain a term(IF test tbr fbr)
; but we aren't done yet. Rather, we run the BDD algorithm (using the same alist) ontbr
andfbr
to obtain termstbr'
andfbr'
, and we return(IF test tbr' fbr')
unlesstbr'
is syntactically identical tofbr'
, in which case we returntbr'
.
When is it the case that, as said above, ``f
is known to be
commutative''? This happens when an enabled rewrite rule is of the
form (EQUAL (f X Y) (f Y X))
. Regarding swapping the arguments
in that case: recall that we may assume very little about the BDD
term order, essentially only that we swap the two arguments when the
second is a constant and the first is not, for example, in (+ x 1)
.
Other than that situation, one cannot expect to predict accurately
when the arguments of commutative operators will be swapped.
(B5) Soundness and Completeness of the ACL2 BDD algorithm
Roughly speaking, ``soundness'' means that the BDD algorithm should give correct answers, and ``completeness'' means that it should be powerful enough to prove all true facts. Let us make the soundness claim a little more precise, and then we'll address completeness under suitable hypotheses.
Claim (Soundness). If the ACL2 BDD algorithm runs to
completion on an input term t0
, then it produces a result that is
provably equal to t0
.
We leave the proof of this claim to the reader. The basic idea is simply to check that each step of the algorithm preserves the meaning of the term under the bindings in the given alist.
Let us start our discussion of completeness by recalling the theorem proved above in (A4).
Theorem. Suppose that t1
and t2
are canonical BDD terms
that contain no function symbols other than IF
and CONS
. Also
suppose that (EQUAL t1 t2)
is a theorem. Then t1
and t2
are syntactically identical.
Below we show how this theorem implies the following completeness
property of the ACL2 BDD algorithm. We continue to assume that
CONS
is the only BDD-constructor.
Claim (Completeness). Suppose that t1
and t2
are
provably equal terms, under the assumption that all their variables
are known to be Boolean. Assume further that under this same
assumption, top-level runs of the ACL2 BDD algorithm on these terms
return terms that contain only the function symbols IF
and
CONS
. Then the algorithm returns the same term for both t1
and t2
, and the algorithm reduces (EQUAL t1 t2)
to t
.
Why is this claim true? First, notice that the second part of the
conclusion follows immediately from the first, by definition of the
algorithm. Next, notice that the terms u1
and u2
obtained by
running the algorithm on t1
and t2
, respectively, are provably
equal to t1
and t2
, respectively, by the Soundness Claim. It
follows that u1
and u2
are provably equal to each other.
Since these terms contain no function symbols other than IF
or
CONS
, by hypothesis, the Claim now follows from the Theorem above
together with the following lemma.
Lemma. Suppose that the result of running the ACL2 BDD
algorithm on a top-level term t0
is a term u0
that contains
only the function symbols IF
and CONS
, where all variables of
t0
are known to be Boolean. Then u0
is a canonical BDD term.
Proof: left to the reader. Simply follow the definition of the algorithm, with a separate argument for the IF-lifting-for-IF loop.
Finally, let us remark on the assumptions of the Completeness Claim
above. The assumption that all variables are known to be Boolean is
often true; in fact, the system uses the forward-chaining rule
boolean-listp-forward
(you can see it using :
pe
) to try to
establish this assumption, if your theorem has a form such as the
following.
(let ((x (list x0 x1 ...)) (y (list y0 y1 ...))) (implies (and (boolean-listp x) (boolean-listp y)) ...))Moreover, the
:BDD
hint can be used to force the prover to abort
if it cannot check that the indicated variables are known to be
Boolean; see hints.Finally, consider the effect in practice of the assumption that the
terms resulting from application of the algorithm contain calls of
IF
and CONS
only. Typical use of BDDs in ACL2 takes place in
a theory (see theories) in which all relevant non-recursive
function symbols are enabled and all recursive function symbols
possess enabled BDD rewrite rules that tell them how open up. For
example, such a rule may say how to expand on a given function
call's argument that has the form (CONS a x)
, while another may
say how to expand when that argument is nil
). (See for example
the rules append-cons
and append-nil
in the documentation for
IF*
.) We leave it to future work to formulate a theorem that
guarantees that the BDD algorithm produces terms containing calls
only of IF
and CONS
assuming a suitably ``complete''
collection of rewrite rules.
(B6) Efficiency considerations
Following Bryant's algorithm, we use a graph representation of terms created by the BDD algorithm's computation. This representation enjoys some important properties.
(Time efficiency) The test for syntactic equality of BDD terms is very fast.
(Space efficiency) Equal BDD data structures are stored identically in memory.
Implementation note. The representation actually uses a sort of hash table for BDD terms that is implemented as an ACL2 1-dimensional array. See arrays. In addition, we use a second such hash table to avoid recomputing the result of applying a function symbol to the result of running the algorithm on its arguments. We believe that these uses of hash tables are standard. They are also discussed in Moore's paper on BDDs; see bdd for the reference.