Major Section: RULE-CLASSES
See rule-classes for a general discussion of rule classes and
how they are used to build rules from formulas. An example
:
corollary
formula from which a :forward-chaining
rule might be
built is:
Example: (implies (and (p x) (r x)) when (p a) appears in a formula to be (q (f x))) simplified, try to establish (r a) and if successful, add (q (f a)) to the known assumptionsTo specify the triggering terms provide a non-empty list of terms as the value of the
:trigger-terms
field of the rule class object.
General Form: Any theorem, provided an acceptable triggering term exists.Forward chaining rules are generated from the corollary term as follows. If that term has the form
(implies hyp concl)
, then the
let
expressions in concl
(formally, lambda expressions) are expanded
away, and the result is treated as a conjunction, with one forward
chaining rule with hypothesis hyp created for each conjunct. In the
other case, where the corollary term is not an implies
, we process
it as we process the conclusion in the first case.
The :trigger-terms
field of a :forward-chaining
rule class object
should be a non-empty list of terms, if provided, and should have
certain properties described below. If the :trigger-terms
field is
not provided, it defaults to the singleton list containing the
``atom'' of the first hypothesis of the formula. (The atom of
(not x)
is x
; the atom of any other term is the term itself.) If
there are no hypotheses and no :trigger-terms
were provided, an
error is caused.
A triggering term is acceptable if it is not a variable, a quoted
constant, a lambda application, a let
-expression, or a
not
-expression, and every variable symbol in the conclusion of the
theorem either occurs in the hypotheses or occurs in the trigger.
:Forward-chaining
rules are used by the simplifier before it begins
to rewrite the literals of the goal. If any term in the goal is an
instance of a trigger of some forward chaining rule, we try to
establish the hypotheses of that forward chaining theorem (from the
negation of the goal). To relieve a hypotheses we only use type
reasoning, evaluation of ground terms, and presence among our known
assumptions. We do not use rewriting. If all hypotheses are
relieved, we add the instantiated conclusion to our known
assumptions.
Major Section: RULE-CLASSES
See rule-classes for a general discussion of rule classes and
how they are used to build rules from formulas. An example
:
corollary
formula from which a :generalize
rule might be built is:
Example: any theoremTo use such aGeneral Form: any theorem
:generalize
rule, the system waits until it has
decided to generalize some term, term
, by replacing it with some new
variable v
. If any :generalize
formula can be instantiated so that
some non-variable subterm becomes term
, then that instance of the
formula is added as a hypothesis.
At the moment, the best description of how ACL2 :generalize
rules
are used may be found in the discussion of ``Generalize Rules,'' pp
248 of A Computational Logic Handbook.
Major Section: RULE-CLASSES
Example: (:induction :corollary t ; the theorem proved is irrelevant! :pattern (* 1/2 i) :condition (and (integerp i) (>= i 0)) :scheme (recursion-by-sub2 i))
In ACL2, as in Nqthm, the functions in a conjecture ``suggest'' the inductions considered by the system. Because every recursive function must be admitted with a justification in terms of a measure that decreases in a well-founded way on a given set of ``controlling'' arguments, every recursive function suggests a dual induction scheme that ``unwinds'' the function from a given application.
For example, since append
(actually binary-append
, but we'll ignore
the distinction here) decomposes its first argument by successive
cdr
s as long as it is a non-nil
true list, the induction scheme
suggested by (append x y)
has a base case supposing x
to be either
not a true list or to be nil
and then has an induction step in which
the induction hypothesis is obtained by replacing x
by (cdr x)
.
This substitution decreases the same measure used to justify the
definition of append
. Observe that an induction scheme is suggested
by a recursive function application only if the controlling actuals
are distinct variables, a condition that is sufficient to insure
that the ``substitution'' used to create the induction hypothesis is
indeed a substitution and that it drives down a certain measure. In
particular, (append (foo x) y)
does not suggest an induction
unwinding append
because the induction scheme suggested by
(append x y)
requires that we substitute (cdr x)
for x
and
we cannot do that if x
is not a variable symbol.
Once ACL2 has collected together all the suggested induction schemes it massages them in various ways, combining some to simultaneously unwind certain cliques of functions and vetoing others because they ``flaw'' others. We do not further discuss the induction heuristics here; the interested reader should see Chapter XIV of A Computational Logic (Boyer and Moore, Academic Press, 1979) which represents a fairly complete description of the induction heuristics of ACL2.
However, unlike Nqthm, ACL2 provides a means by which the user can
elaborate the rules under which function applications suggest
induction schemes. Such rules are called :induction
rules. The
definitional principle automatically creates an :induction
rule,
named (:induction fn)
, for each admitted recursive function, fn
. It
is this rule that links applications of fn
to the induction scheme
it suggests. Disabling (:induction fn)
will prevent fn
from
suggesting the induction scheme derived from its recursive
definition. It is possible for the user to create additional
:induction
rules by using the :induction
rule class in defthm
.
Technically we are ``overloading'' defthm
by using it in the
creation of :induction
rules because no theorem need be proved to
set up the heuristic link represented by an :induction
rule.
However, since defthm
is generally used to create rules and
rule-class objects are generally used to specify the exact form of
each rule, we maintain that convention and introduce the notion of
an :induction
rule. An :induction
rule can be created from any
lemma whatsoever.
General Form of an :induction Lemma or Corollary: TwhereGeneral Form of an :induction rule-class: (:induction :pattern pat-term :condition cond-term :scheme scheme-term)
pat-term
, cond-term
, and scheme-term
are all terms, pat-term
is the application of a function symbol, fn
, scheme-term
is the
application of a function symbol, rec-fn
, that suggests an
induction, and, finally, every free variable of cond-term
and
scheme-term
is a free variable of pat-term
. We actually check that
rec-fn
is either recursively defined -- so that it suggests the
induction that is intrinsic to its recursion -- or else that another
:induction
rule has been proved linking a call of rec-fn
as the
:pattern
to some scheme.
The induction rule created is used as follows. When an instance of
the :pattern
term occurs in a conjecture to be proved by induction
and the corresponding instance of the :condition
term is known to be
non-nil
(by type reasoning alone), the corresponding instance of the
:scheme
term is created and the rule ``suggests'' the induction, if
any, suggested by that term. If rec-fn
is recursive, then the
suggestion is the one that unwinds that recursion.
Consider, for example, the example given above,
(:induction :pattern (* 1/2 i) :condition (and (integerp i) (>= i 0)) :scheme (recursion-by-sub2 i)).In this example, we imagine that
recursion-by-sub2
is the
function:
(defun recursion-by-sub2 (i) (if (and (integerp i) (< 1 i)) (recursion-by-sub2 (- i 2)) t))Observe that this function recursively decomposes its integer argument by subtracting
2
from it repeatedly and stops when the
argument is 1
or less. The value of the function is irrelevant; it
is its induction scheme that concerns us. The induction scheme
suggested by (recursion-by-sub2 i)
is
(and (implies (not (and (integerp i) (< 1 i))) ; base case (:p i)) (implies (and (and (integerp i) (< 1 i)) ; induction step (:p (- i 2))) (:p i)))We can think of the base case as covering two situations. The first is when
i
is not an integer. The second is when the integer i
is 0
or 1
. In the base case we must prove (:p i)
without further
help. The induction step deals with those integer i
greater than 1
,
and inductively assumes the conjecture for i-2
while proving it for
i
. Let us call this scheme ``induction on i
by twos.''
Suppose the above :induction
rule has been added. Then an
occurrence of, say, (* 1/2 k)
in a conjecture to be proved by
induction would suggest, via this rule, an induction on k
by twos,
provided k
was known to be a nonnegative integer. This is because
the induction rule's :pattern
is matched in the conjecture, its
:condition
is satisfied, and the :scheme
suggested by the rule is
that derived from (recursion-by-sub2 k)
, which is induction on k
by
twos. Similarly, the term (* 1/2 (length l))
would suggest no
induction via this rule, even though the rule ``fires'' because it
creates the :scheme
(recursion-by-sub2 (length l))
which suggests no
inductions unwinding recursion-by-sub2
(since the controlling
argument of recursion-by-sub2
in this :scheme
is not a variable
symbol).
Continuing this example one step further illustrates the utility of
:induction
rules. We could define the function recursion-by-cddr
that suggests the induction scheme decomposing its consp
argument
two cdr
s at a time. We could then add the :induction
rule linking
(* 1/2 (length x))
to (recursion-by-cddr x)
and arrange for
(* 1/2 (length l))
to suggest induction on l
by cddr
.
Observe that :induction
rules require no proofs to be done. Such a
rule is merely a heuristic link between the :pattern
term, which may
occur in conjectures to be proved by induction, and the :scheme
term, from which an induction scheme may be derived. Hence, when an
:induction
rule-class is specified in a defthm
event, the theorem
proved is irrelevant. The easiest theorem to prove is, of course,
t
. Thus, we suggest that when an :induction
rule is to be created,
the following form be used:
(defthm name T :rule-classes ((:induction :pattern pat-term :condition cond-term :scheme scheme-term)))The name of the rule created is
(:induction name)
. When that rune
is disabled the heuristic link between pat-term
and scheme-term
is
broken.
Major Section: RULE-CLASSES
See rule-classes for a general discussion of rule classes and
how they are used to build rules from formulas. An example
:
corollary
formula from which a :linear
rule might be built is:
Example: (implies (and (eqlablep e) if inequality reasoning begins to (true-listp x)) consider how (length (member a b)) (>= (length x) compares to any other term, add to (length (member e x)))) set of known inequalities the fact that it is no larger than (length b), provided (eqlablep a) and (true-listp b) rewrite to tNote: OneGeneral Form: (and ... (implies (and ...hi...) (implies (and ...hk...) (and ... (rel lhs rhs) ...))) ...)
:linear
rule class object might create many linear
arithmetic rules from the :
corollary
formula. To create the rules,
we first flatten the and
and implies
structure of the formula,
transforming it into a conjunction of formulas, each of the form
(implies (and h1 ... hn) (rel lhs rhs))where no hypothesis is a conjunction and
rel
is one of the
inequality relations <
, <=
, =
, /=
, >
, or >=
. If necessary, the
hypothesis of such a conjunct may be vacuous.
We create a :linear
rule for each such conjunct, if possible, and
otherwise cause an error. Each rule has one or more ``trigger
terms'' which may be specified by the user using the :trigger-terms
field of the rule class or which may be defaulted to values chosen
by the system. We discuss the determination of trigger terms after
discussing how linear rules are used.
:linear
rules are used by a linear arithmetic decision procedure
during rewriting. We describe the procedure very roughly here.
Fundamental to the procedure is the notion of a linear polynomial
inequality. A ``linear polynomial'' is a sum of terms, each of
which is the product of a rational constant and an ``unknown.'' The
``unknown'' is permitted to be 1
simply to allow a term in the sum
to be constant. Thus, an example linear polynomial is
3*x + 7*a + 2
; here x
and a
are the (interesting) unknowns.
In our case, the unknowns need not be variable symbols. For
example, (length x)
might be used as an unknown in a linear
polynomial. Thus, another linear polynomial is 3*(length x) + 7*a
.
A ``linear polynomial inequality'' is an inequality (either <
or <=
)
relation between two linear polynomials. Certain linear polynomial
inequalities can be combined by cross-multiplication and addition to
permit the deduction of a third linear inequality polynomial with
fewer unknowns. If this deduced inequality is manifestly false, a
contradiction has been deduced from the assumed inequalities.
For example, suppose we have three assumptions:
p1: 3*x + 7*a < 4 p2: 3 < 2*x p3: 0 <= a.By cross-multiplying and adding the first two (that is, multiplying
p1
by 2
, p2
by 3
and adding the respective sides), we
deduce the intermediate result
p4: 6*x + 14*a + 9 < 8 + 6*xwhich, after cancellation, is
p4: 14*a + 1 < 0.If we then cross-multiply and add
p3
to p4
, we get
p5: 1 <= 0,a contradiction. Thus, we have proved that
p1
and p2
imply the
negation of p3
.
All of the unknowns of a polynomial must be eliminated by
cancellation in order to produce a constant inequality. We can
choose to eliminate the unknowns in any order. We eliminate them in
term-order, largest unknowns first. (See term-order.) Now
suppose that this procedure does not produce a contradiction but
instead yields a set of nontrivial inequalities. A contradiction
might still be deduced if we could add to the set some additional
inequalities allowing additional cancellation. That is where
:linear
lemmas come in. When the set of inequalities has stabilized
under cross-multiplication and addition and no contradiction is
produced, we search the data base of :linear
rules for rules about
the unknowns that are candidates for cancellation (i.e., are the
largest unknowns in their respective inequalities).
If the trigger term of some :linear
rule can be instantiated to
yield a candidate for cancellation, we so instantiate that rule,
attempt to relieve the hypotheses with general-purpose rewriting,
and, if successful, convert the concluding inequality into a
polynomial inequality and add it to the set. This may let
cancellation continue. See the discussion of ``Linear Arithmetic
Rules'' pp 242 of A Computational Logic Handbook for more details.
We now describe how the trigger terms are determined. Most of the
time, the trigger terms are not specified by the user and are
instead selected by the system. However, the user may specify the
terms by including an explicit :trigger-terms
field in the rule
class, e.g.,
General Form of a Linear Rule Class: (:LINEAR :COROLLARY formula :TRIGGER-TERMS (term1 ... termk))Each
termi
must be a term and must not be a variable, quoted
constant, lambda application, let-expression
or if-expression
. In
addition, each termi
must be such that if all the variables in the
term are instantiated and then the hypotheses of the corollary
formula are relieved (possibly instantiating additional free
variables), then all the variables in the concluding inequality are
instantiated. We generate a linear rule for each conjuctive branch
through the corollary and store each rule under each of the
specified triggers. Thus, if the corollary formula contains several
conjuncts, the variable restrictions on the termi
must hold for each
conjunct.
Note: Problems may arise if you explicitly store a linear lemma
under a trigger term that, when instantiated, is not the largest
unknown in the instantiated concluding inequality. Suppose for
example you store the linear rule (<= (fn i j) (/ i (* j j)))
under
the trigger term (fn i j)
. Then when the system ``needs'' an
inequality about (fn a b)
, i.e., because (fn a b)
is the largest
unknown in some inequality in the set of assumed inequalities, it
will appeal to the rule and deduce (<= (fn a b) (/ a (* b b)))
.
However, the largest unknown in this inequality is (/ a (* b b))
and
hence this inequality will not be used until that term is eliminated
by cancellation with some other inequality. It is generally best to
specify as a trigger term one of the ``maximal'' terms of the
polynomial, as described below.
If :trigger-terms
is omitted the system computes a set of trigger
terms. Each conjunct of the corollary formula may be given a unique
set of triggers depending on the variables that occur in the
conjunct and the addends that occur in the concluding inequality.
In particular, the trigger terms for a conjunct is the list of all
``maximal addends'' in the concluding inequality.
The ``addends'' of (+ x y)
and (- x y)
are the union of the
addends of x
and y
. The addends of (- x)
and (* n x)
,
where n
is a numeric constant, is in all cases just {x}
. The
addends of an inequality are the union of the addends of the left-
and right-hand sides. The addends of any other term, x
, is
{x}
.
A term is maximal for a conjunct (implies hyps concl)
of the
corollary if (a) the term is a non-variable, non-quote, non-lambda
application, non-let
and non-if
expression, (b) the term contains
enough variables so that when they are instantiated and the
hypotheses are relieved then all the variables in concl
are
instantiated, and (c) any instantiation of the term is always at
least as ``large'' (see term-order for a description of the
order used) as the corresponding instantiations of any other addend
in concl
.
Major Section: RULE-CLASSES
See rule-classes for a general discussion of rule classes and
how they are used to build rules from formulas. An example
:
corollary
formula from which a :linear-alias
rule might be built
is as follows. (Of course, the functions fix
, plus
, diff
, and lessp
would have to be defined first.)
Example: (and (equal (fix (plus x y)) (plus x y)) (equal (fix (diff x y)) (diff x y)) (equal (fix (fix x)) (fix x)) (equal (lessp x y) (< (fix x) (fix y))) (equal (plus x y) (+ (fix x) (fix y))) (implies (not (lessp x y)) (equal (diff x y) (- (fix x) (fix y)))))
:linear-alias
rules are essentially a hack to allow the creation
of the "NQTHM"
package. We will be surprised if the casual
user of ACL2 ever finds a use for them. They are not documented
very thoroughly.
General Forms: (equal lhs rhs) (implies hyps (equal lhs rhs))where every variable that occurs in the rule must occur in the lhs. As for
:
rewrite
rules, we strip the conjuncts out of the
:
corollary
formula and generate one :linear-alias
rule for each.
Thus, it is possible to package together a complete set of
:linear-alias
rules under a single name.
When linear arithmetic is applied to a term that is not
``linearizable,'' i.e., a term that does not start with <
, +
, or -
,
it attempts to transduce the term to a linearizable one by applying
all known :linear-alias
rules as rewrite rules. This transduction
is performed by the simple rewrite function
rewrite-with-linear-aliases
. Like the standard ACL2 rewriter, the
linear alias rewriter applies linear alias rules exhaustively in the
sense that if a rule applies then the rhs
of the rule is recursively
rewritten. Unlike :
rewrite
rules, :linear-alias
rules are applied
from outside in (!) and hypotheses are not relieved but merely
collected.
For example, rewriting (lessp (diff (plus a b) c) (diff u v))
with
the rules generated from the conjunction above produces
(< (+ (+ (fix a) (fix b)) (- (fix c))) (+ (fix u) (- (fix v))))under the hypothesis that
(not (lessp (plus a b) c))
and
(not (lessp u v))
. (Actually, the +
and -
operators shown
above are really calls of binary-+
and unary--
.)
Think of :linear-alias
rules as transforming a term from one theory
to another, generating hypotheses in the first theory. Call the
initial theory ``nqthm'' and the final one ``acl2.'' One should
strive to create a set of rules that transform from the nqthm to the
acl2 theory while only generating hypotheses in the first theory.
The restriction is motivated by the desire not to surprise the nqthm
user by suddently introducing conditions that are not phrased in the
nqthm theory. The purpose of the nqthm package is to emulate nqthm,
so it simply represents a failure if acl2 suddenly produces a case
split, say, in which some non-nqthm term appears, such as
(rationalp (plus a b))
. To this end, we offer some advice below.
However, it is especially helpful to study how the outside-in
rewrite strategy uses the rules above to achieve this goal.
For example,
(lessp (plus a b) (diff u v))first steps to
(< (fix (plus a b)) (fix (diff u v))),unconditionally. But
plus
and diff
don't need fix
ing:
(< (plus a b) (diff u v)).Rewriting further finishes with
(< (+ (fix a) (fix b)) (+ (fix u) (- (fix v)))).Observe that had we rewritten inside-out the sequence would have been:
(lessp (+ (fix a) (fix b)) (+ (fix u) (- (fix v))))and then
(< (fix (+ (fix a) (fix b))) (fix (+ (fix u) (- (fix v))))).Note that we now need to know that
+
doesn't need fix
ing if its
arguments have been. But that is a more complicated rule than the
simple one that plus
doesn't need fix
ing. Perhaps a different set
of rules would work well for inside-out rewriting, but I just used
the ones that first leapt to mind. It is interesting to note that
linearization proceeds outside in too.
Here then is the advice. Rules should either transform nqthm terms
to nqthm terms, e.g., (fix (plus x y))
to (plus x y)
, or should
transform nqthm terms to acl2 terms, e.g., (lessp x y)
to
(< (fix x) (fix y))
. When the rhs
introduces an acl2 function
symbol, the symbol should be outside of all nqthm terms in the
rhs
. Hypotheses should be expressed in the nqthm theory. No
rule should rewrite an acl2 function symbol. If the transformation
involves ``fixers'' then each operator should have two rules about
it. The first, like (plus x y) = (+ (fix x) (fix y))
, explains
the nqthm function in terms of an acl2 function with fixed
arguments. The second, like (fix (plus x y)) = (plus x y)
eliminates the interior fixes generated by the first rule. These
conventions mean that the rewriter, by going from outside in, always
sees nqthm terms and instantiates the hypotheses of rules with nqthm
terms. To violate these conventions might produce a set of rules
that would cause a non-nqthm hypothesis to be generated.
Finally, observe that the rules probably will introduce a ``fixer'',
which is the function fix
in the NQTHM package. The fixer tends to
settle around the nonlinear leaves of the resultant linearizable
term. It is important to have a :
type-prescription
rule
establishing that the fixer produces a rational.
When a :
linear
arithmetic rule is proved its conclusion is
linearized to obtain the relevant polynomials. :linear-alias
rules
may kick in there; thus, a conclusion of the form
(not (lessp (length str) (strpos pat str)))
may be a legal
:
linear
rule. Note however that its linearization is
(not (< (fix (length str)) (fix (strpos pat str))))
. The maximal
term of this rule is (fix (strpos pat str))
. However, we do not
want to store such a linear rule under fix
because that would
result in all of the Nqthm proveall's linear rules being stored
under the same symbol. We therefore note when the maximal term does
not actually occur in the conclusion, is of the form (fix term)
, for
some unary function fix
, and term does occur in the conclusion. In
that case, we call fix
a ``fixer'' and we store a t
under the
property 'linear-alias-fixerp
. We then store the rule with the
maximal term term
instead of (fix term)
. Later, when we are
adding lemmas to the linear pot, we note that if we seek linear
lemmas about (fix term)
, where fix
is a fixer, we look for
lemmas both stored under the function symbol of term and stored
under the function symbol fix
. In all cases, the maximal term of
the rule contains as its top-most function symbol the symbol under
which the rule is stored. Thus, if we seek lemmas about
(fix (strpos pat str))
we will go to the property list of
strpos
and there find rules about (strpos pat str)
. We will
instantiate their conclusions and linearize them, producing
polynomials about (fix (strpos pat str))
. This raises a subtle
point. It could be that there is more than one fixer used in
connection with strpos
. Thus, some lemmas could linearize to
polynomials about (rat (strpos pat str))
. Therefore, when we
create the linear rule with (strpos pat str)
as its (fixed)
maximal term, we store in the :fixer
field of the rule the symbol
fix
to indicate that the linearization of this conclusion will
(probably) produce (fix (strpos pat str))
terms. When looking
for lemmas about (fix (strpos pat str))
we actually look for
lemmas about strpos
that have :fixer
fix
.
Here is an extended example of the use of linear aliasing. First, we define some nqthm-level functions:
(defun fix (x) (if (and (integerp x) (>= x 0)) x 0)) (defun lessp (x y) (< (fix x) (fix y))) (defun plus (x y) (+ (fix x) (fix y))) (defun diff (x y) (fix (- (fix x) (fix y))))Note that
(fix x)
gets a :
type-prescription
rule that it is a
nonnegative integer.
Now we prove a :linear-alias
rule, actually a package of the six
rules above.
(defthm nqthm-linear-arithmetic (and (equal (fix (plus x y)) (plus x y)) (equal (fix (diff x y)) (diff x y)) (equal (fix (fix x)) (fix x)) (equal (lessp x y) (< (fix x) (fix y))) (equal (plus x y) (+ (fix x) (fix y))) (implies (not (lessp x y)) (equal (diff x y) (- (fix x) (fix y))))) :rule-classes :linear-alias)To mimic Nqthm behavior, we need a
:
rewrite
rule to dispatch the
cases generated by the case split caused by linearizing
diff
-expressions. This is not, strictly speaking, part of linear
aliasing.
(defthm diff-0 (implies (lessp x y) (equal (diff x y) 0)))Now we disable the nqthm-level functions. Henceforth, we do not expect to see ACL2-level functions in proofs about these concepts.
(in-theory (disable fix lessp plus diff))Note that we can now prove such facts as
(thm (implies (and (lessp a b) (lessp b c)) (lessp a c))) (thm (implies (lessp a (diff b c)) (lessp a b)))and that the proofs appeal to linear arithmetic!
We can add new aliases (although they get new names). Note that we
have to (a) define the function, (b) prove the :linear-alias
rule --
probably enabling nqthm-level functions to do it, and (c) disable
the new function so it survives rewriting and is around for
linearization to de-alias.
(defun add1 (x) (+ (fix x) 1)) ; (a) (defthm add1-linear ; (b) - see below (and (equal (add1 x) (+ (fix x) 1)) (equal (fix (add1 x)) (add1 x))) :rule-classes :linear-alias :hints (("Goal" :in-theory (enable fix)))) (in-theory (disable add1)) ; (c) (thm (lessp x (add1 x)))Observe that in (b) we prove two rules about
add1
: the one that
relates it to +
and the one that removes the fix
es inserted by the
first rule.
Next we illustrate the use of :
linear
lemmas in the context of
linear aliasing. Consider the function:
(defun strpos (pat str) ; Find the position of pat in str. (cond ((atom str) 0) ((equal pat (car str)) 0) (t (add1 (strpos pat (cdr str))))))We can prove and store the following(defun strlen (str) (cond ((atom str) 0) (t (add1 (strlen (cdr str))))))
:
linear
lemma about strpos
.
(defthm strpos-length (not (lessp (strlen str) (strpos pat str))) :rule-classes :linear)This event is interesting for two reasons. The proof illustrates linear aliasing. But the fact that this event is stored as a
:
linear
lemma -- even though the conclusion is not about <
-- also
illustrates linear aliasing. In fact, this lemma is stored under
(strpos pat str)
with :fixer
fix
.We can illustrate the storage and retrieval of the lemma directly by proving:
(thm (not (< (fix (strlen str)) (fix (strpos pat str)))))But more commonly we will be interested in proofs that reach out for facts about
strpos
(implicitly fix
ing them for us) as in:
(thm (implies (lessp (strlen str) mx) (lessp (strpos pat str) mx))).This ends the extended example of linear aliasing lemmas.
:meta
rule (a hand-written simplifier)
Major Section: RULE-CLASSES
See rule-classes for a general discussion of rule classes and
how they are used to build rules from formulas. Meta rules extend
the ACL2 simplifier with hand-written code to transform certain
terms to equivalent ones. To add a meta rule, the :
corollary
formula must establish that the hand-written ``metafunction''
preserves the meaning of the transformed term.
Example :
corollary
formulas from which :meta
rules might be
built are:
Examples: (equal (ev x a) ; Modify the rewriter to use fn to (ev (fn x) a)) ; transform terms. The :trigger-fns ; of the :meta rule-class specify ; the top-most function symbols of ; those x that are candidates for ; this transformation.A non-(implies (and (pseudo-termp x) ; As above, but this illustrates (alistp a)) ; that without loss of generality we (equal (ev x a) ; may restrict x to be shaped like a (ev (fn x) a))) ; term and a to be an alist.
(implies (and (pseudo-termp x) ; As above (with or without the (alistp a) ; hypotheses on x and a) with the (ev (hyp-fn x) a)); additional restriction that the (equal (ev x a) ; meaning of (hyp-fn x) is true in (ev (fn x) a))) ; the current context. That is, the ; applicability of the transforma- ; tion may be dependent upon some ; computed hypotheses.
nil
list of function symbols must be supplied as the value
of the :trigger-fns
field in a :meta
rule class object.
General Forms: (implies (and (pseudo-termp x) ; this hyp is optional (alistp a) ; this hyp is optional (ev (hyp-fn x) a)) ; this hyp is optional (equiv (ev x a) (ev (fn x) a)))where
equiv
is a known equivalence relation, x
and a
are distinct variable names, and ev
is an evaluator function (see
below), and fn
is a function symbol, as is hyp-fn
when
provided. Both fn
and hyp-fn
should take a single argument.
If fn
and/or hyp-fn
are guarded, their guards should be
(implied by) pseudo-termp
. See pseudo-termp. We say the
theorem above is a ``metatheorem'' or ``metalemma'' and fn
is a
``metafunction'', and hyp-fn
is a ``hypothesis metafunction''.We defer discussion of the case in which there is a hypothesis metafunction and for now address the case in which the other two hypotheses are present.
In the discussion below, we refer to the argument, x
, of fn
and hyp-fn
as a ``term.'' When these metafunctions are executed
by the simplifier, they will be applied to (the quotations of)
terms. But during the proof of the metatheorem itself, x
may not
be the quotation of a term. If the pseudo-termp
hypothesis is
omitted, x
may be any object. Even with the pseudo-termp
hypothesis, x
may merely ``look like a term'' but use
non-function symbols or function symbols of incorrect arity. In any
case, the metatheorem is stronger than necessary to allow us to
apply the metafunctions to terms, as we do in the discussion below.
We return later to the question of proving the metatheorem.
Suppose the general form of the metatheorem above is proved with the
pseudo-termp
and alistp
hypotheses. Then when the simplifier
encounters a term, (h t1 ... tn)
, that begins with a function
symbol, h
, listed in :trigger-fns
, it applies the
metafunction, fn
, to the quotation of the term, i.e., it
evaluates (fn '(h t1 ... tn))
to obtain some result, which can be
written as 'val
. If 'val
is different from '(h t1 ... tn)
and val
is a term, then (h t1 ... tn)
is replaced by val
,
which is then passed along for further rewriting. Because the
metatheorem establishes the correctness of fn
for all terms (even
non-terms!), there is no restriction on which function symbols are
listed in the :trigger-fns
. Generally, of course, they should be
the symbols that head up the terms simplified by the metafunction
fn
. See term-table for how one obtains some assistance
towards guaranteeing that val
is indeed a term.
The ``evaluator'' function, ev
, is a function that can evaluate a
certain class of expressions, namely, all of those composed of
variables, constants, and applications of a fixed, finite set of
function symbols, g1
, ..., gk
. Generally speaking, the set of
function symbols handled by ev
is chosen to be exactly the
function symbols recognized and manipulated by the metafunctions
being introduced. For example, if fn
manipulates expressions in
which '
equal
and '
binary-append
occur as function
symbols, then ev
is generally specified to handle equal
and
binary-append
. The actual requirements on ev
become clear
when the metatheorem is proved. The standard way to introduce an
evaluator is to use the ACL2 macro defevaluator
, though this is
not strictly necessary. See defevaluator for details.
Why are we justified in using metafunctions this way? Suppose
(fn 'term1)
is 'term2
. What justifies replacing term1
by
term2
? The first step is to assert that term1
is
(ev 'term1 a)
, where a
is an alist that maps 'var
to
var
, for each variable var
in term1
. This step is
incorrect, because 'term1
may contain function symbols other than
the ones, g1
, ..., gk
, that ev
knows how to handle. But we
can grow ev
to a ``larger'' evaluator, ev*
, an evaluator for
all of the symbols that occur in term1
or term2
. We can prove
that ev*
satisfies the constraints on ev
. Hence, the
metatheorem holds for ev*
in place of ev
, by functional
instantiation. We can then carry out the proof of the
equivalence of term1
and term2
as follows: Fix a
to be
an alist that maps the quotations of the variables of term1
and
term2
to themselves. Then,
term1 = (ev* 'term1 a) ; (1) by construction of ev* and a = (ev* (fn 'term1) a) ; (2) by the metatheorem for ev* = (ev* 'term2 a) ; (3) by evaluation of fn = term2 ; (4) by construction of ev* and aNote that in line (2) above, where we appeal to the (functional instantiation of the) metatheorem, we can relieve its (optional)
pseudo-termp
and alistp
hypotheses by appealing to the facts
that term1
is a term and a
is an alist by construction.
Finally, we turn to the second case, in which there is a hypothesis
metafunction. In that case, consider as before what happens when
the simplifier encounters a term, (h t1 ... tn)
, where h
is
listed in :trigger-fns
. This time, after it applies fn
to
'(h t1 ... tn)
to obtain the quotation of some new term, 'val
,
it then applies the hypothesis metafunction, hyp-fn
. That is, it
evaluates (hyp-fn '(h t1 ... tn))
to obtain some result, which
can be written as 'hyp-val
. If hyp-val
is not in fact a term,
the metafunction is not used. Provided hyp-val
is a term, the
simplifier attempts to establish (by conventional backchaining) that
this term is non-nil
in the current context. If this attempt
fails, then the meta rule is not applied. Otherwise, (h t1...tn)
is replaced by val
as in the previous case (where there was no
hypothesis metafunction).
Why is it justified to make this extension to the case of hypothesis metafunctions? First, note that the rule
(implies (and (pseudo-termp x) (alistp a) (ev (hyp-fn x) a)) (equal (ev x a) (ev (fn x) a)))is logically equivalent to the rule
(implies (and (pseudo-termp x) (alistp a)) (equal (ev x a) (ev (new-fn x) a)))where
(new-fn x)
is defined to be
(list 'if (hyp-fn x) (fn x) x)
. (If we're careful, we realize
that this argument depends on making an extension of ev
to an
evaluator ev*
that handles if
and the functions manipulated by
hyp-fn
.) If we write 'term
for the quotation of the present
term, and if (hyp-fn 'term)
and (fn 'term)
are both terms, say
hyp1
and term1
, then by the previous argument we know it is
sound to rewrite term to (if hyp1 term1 term)
. But since we have
established in the current context that hyp1
is non-nil
, we
may simplify (if hyp1 term1 term)
to term1
, as desired.
We now discuss the role of the pseudo-termp
hypothesis.
(Pseudo-termp x)
checks that x
has the shape of a term.
Roughly speaking, it insures that x
is a symbol, a quoted
constant, or a true list consisting of a lambda
expression or
symbol followed by some pseudo-terms. Among the properties of terms
not checked by pseudo-termp
are that variable symbols never begin
with ampersand, lambda
expressions are closed, and function
symbols are applied to the correct number of arguments.
See pseudo-termp.
There are two possible roles for pseudo-termp
in the development
of a metatheorem: it may be used as the guard of the
metafunction and/or hypothesis metafunction and it may be used as a
hypothesis of the metatheorem. Generally speaking, the
pseudo-termp
hypothesis is included in a metatheorem only if it
makes it easier to prove. The choice is yours. (An extreme example
of this is when the metatheorem is invalid without the hypothesis!)
We therefore address ourselves the question: should a metafunction
have a pseudo-termp
guard? A pseudo-termp
guard for
a metafunction, in connection with other considerations described
below, improves the efficiency with which the metafunction is used
by the simplifier.
To make a metafunction maximally efficient you should (a) provide it
with a pseudo-termp
guard and exploit the guard when
possible in coding the body of the function
(see guards-and-evaluation, especially the section on
efficiency issues), (b) verify the guards of the metafunction
(see verify-guards), and (c) compile the metafunction
(see comp). When these three steps have been taken the simplifier
can evaluate (fn 'term1)
by running the compiled ``primary code''
(see guards-and-evaluation) for fn
directly in Common Lisp.
Before discussing efficiency issues further, let us review the for a
moment the general case in which we wish to evaluate (fn `obj)
for some :
logic
function. We must first ask whether the
guards of fn
have been verified. If not, we must evaluate
fn
by executing its logic definition. This effectively checks
the guards of every subroutine and so can be slow. If, on the
other hand, the guards of fn
have been verified, then we can
run the primary code for fn
, provided 'obj
satisfies the
guard of fn
. So we must next evaluate the guard of
fn
on 'obj
. If the guard is met, then we run the primary
code for fn
, otherwise we run the logic code.
Now in the case of a metafunction for which the three steps above
have been followed, we know the guard is (implied by)
pseudo-termp
and that it has been verified. Furthermore, we know
without checking that the guard is met (because term1
is a
term and hence 'term1
is a pseudo-termp
). Hence, we can use
the compiled primary code directly.
We strongly recommend that you compile your metafunctions, as well
as all their subroutines. Guard verification is also recommended.