Induction-coarse-v-fine-grained
Advice on case explosion in complex inductions
Recursive functions suggest inductions based on the case analysis
in the function definition and the positions of recursive calls within that
case analysis. That analysis is done when termination is proved. Key parts
of that analysis are the notions of governing tests and ruling tests. By
using the so-called ``ruler-extenders'' feature of termination analysis you
can sometimes choose between ``coarse-grained'' and ``fine-grained''
inductions. The former have few cases but may provide multiple, and sometimes
irrelevant, induction hypotheses. The latter provide more cases, each targeted
at a particular execution path, and tend to provide only the induction
hypotheses needed for that path. See rulers for a discussion of
ruler-extenders.
Governing versus Ruling Tests
Roughly speaking, a test governs a recursive call if the test must
be true for control to reach that call. The ruling tests are a subset
of the governing tests and are the ones that determine the case analysis of
any suggested induction.
For example, if we are analyzing a definition of (fn x) and the body
is (if (not (consp x)) (h nil) (h (fn (cdr x)))) then (consp x)
governs the recursive call and it also rules the call. But if the body were
(h (if (not (consp x)) nil (fn (cdr x)))) then (consp x) governs the
recursive call but may not rule it. It would be considered a ruler if the
symbol h were on the ``ruler-extenders'' list — the list of
symbols that allow further collection of rulers from inside terms that begin
with those symbols.
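To make this concrete, here is a minimal, hypothetical pair of definitions (h
is just a stand-in function). The first is admitted without any special
measures. The second would be rejected under ACL2's default ruler-extenders
(discussed below), because (consp x) would not be available as a hypothesis in
the measure conjecture, but it is admitted when h, together with if, is listed
among the ruler-extenders.
(defun h (x) (list 'h x))  ; an arbitrary stand-in function

(defun fn1 (x)
  ;; (consp x) both governs and rules the recursive call.
  (if (not (consp x))
      (h nil)
    (h (fn1 (cdr x)))))

(defun fn2 (x)
  ;; Under the default ruler-extenders the call of h is a tip, so
  ;; (consp x) governs but does not rule the recursive call and the
  ;; measure conjecture is unprovable.  Listing h (and if) as
  ;; ruler-extenders lets the analysis dive into the call of h,
  ;; making (consp x) a ruler again.
  (declare (xargs :ruler-extenders (if h)))
  (h (if (not (consp x))
         nil
       (fn2 (cdr x)))))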
The default setting of ruler-extenders is (if mv-list return-last).
That means that termination (and induction) analysis stops looking for ruling
tests when a term is encountered that does not begin with one of these listed
symbols. For example, under the default ruler-extenders, when the analysis
encounters a term that starts with a lambda expression, that term is
considered a tip of the tree. (Lambda expressions are relevant because
they are introduced by the macroexpansion of let and let*
expressions, which arise frequently in interesting recursive functions
because they facilitate code sharing by allowing the computation of
intermediate results used on multiple output branches.) When the analysis
reaches such a tip, the rulers collected so far rule all the recursive calls in
that tip: the analysis collects those calls and forms measure conjectures and,
eventually, induction hypotheses about them, using the rulers as the
hypotheses of those conjectures.
So the user can influence the determination of the rulers by setting the
``ruler-extenders'' in effect when the function is admitted. The global
default ruler-extenders can be set by the event set-ruler-extenders or
specified locally in a defun by the xargs keyword
:ruler-extenders. See rulers for the details.
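For illustration, here are sketches of the two mechanisms; copy-spine is a
hypothetical function and the particular settings shown are just examples.
(set-ruler-extenders :all)  ; change the global default for subsequent definitions

(defun copy-spine (x)
  ;; or specify ruler-extenders locally, just for this definition
  (declare (xargs :ruler-extenders :lambdas))
  (if (consp x)
      (let ((rest (copy-spine (cdr x))))
        (cons (car x) rest))
    nil))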
Coarse versus Fine Induction Case Analysis
Most often, users think about ruler-extenders only in the context of
failed termination proofs: inspection of the measure conjectures
reveals that an important governing hypothesis was curiously omitted, the user
consults the documentation or appeals for help from other users, and is
reminded of ruler-extenders. One could argue that users forget about
ruler-extenders because the default setting for ruler-extenders is so often
the ``right'' one. But sometimes even a function that is successfully
admitted under the default setting would benefit from selecting a non-default
setting for ruler-extenders.
Sometimes you may feel that the prover is struggling with an induction
suggested by a function, even though that function is the one you expected to
be selected and, for most simple recursive functions, the induction it suggests
is the ``perfect'' induction for that function. This problem of the prover
struggling to simplify the induction steps may arise most commonly with
functions whose let* bindings contain if-expressions, some of which contain
recursive calls. For example, the following artificial definition raises the
possibility of this problem.
(defun fn (x y)
  (if (consp x)
      (let* ((y1 (if (consp (car x))
                     (fn (car x) (cons 'car y))
                   y))
             (y2 (if (consp (cdr x))
                     (fn (cdr x) (cons 'cdr y1))
                   y1)))
        (cons y1 y2))
    y))
Recall that let* expressions are translated into nested lambda
expressions, so the formal definition of fn is
(defun fn (x y)
  (if (consp x)
      ((lambda (y1 x)
         ((lambda (y2 y1) (cons y1 y2))
          (if (consp (cdr x))
              (fn (cdr x) (cons 'cdr y1))
            y1)
          y1))
       (if (consp (car x))
           (fn (car x) (cons 'car y))
         y)
       x)
    y))
So, the entire let* above, i.e., the formal term starting with the
first lambda expression, is a tip ruled only by (consp x). The
ifs within it are not considered even though their tests govern (but do
not rule) the subsequent recursions. The recursive calls in that tip are
(fn (car x) (cons 'car y))
and
(fn (cdr x)
    (cons 'cdr
          (if (consp (car x))
              (fn (car x) (cons 'car y))
            y))).
The default measure in this definition is (acl2-count x). The
measure conjectures are easy to prove because (consp x) implies
(acl2-count (car x)) and (acl2-count (cdr x)) are both less than
(acl2-count x). Since there are no more recursive calls in the
definition, fn is admitted.
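For reference, the key measure conjectures are roughly as follows; both
recursive calls, including the nested one, are ruled only by (consp x), and the
exact form printed by ACL2 may differ slightly.
(implies (consp x)
         (o< (acl2-count (car x)) (acl2-count x)))

(implies (consp x)
         (o< (acl2-count (cdr x)) (acl2-count x)))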
Now consider how the prover would attempt to prove (p (fn x y)) by
the induction suggested by fn. The induction argument, shown below, has
one base case and one induction step. The induction step has two induction
hypotheses.
(and (implies (not (consp x)) (p (fn x y)))
     (implies (and (consp x)
                   (p (fn (car x) (cons 'car y)))
                   (p (fn (cdr x)
                          (cons 'cdr
                                (if (consp (car x))
                                    (fn (car x) (cons 'car y))
                                  y)))))
              (p (fn x y)))).
Note: Here and throughout this documentation topic we freely
rearrange the formulas we display to make it easier to compare different
induction schemes. For example, when the prover attempts to prove (p (fn
x y)) it lists the conjuncts above in the opposite order, i.e., the base
case occurs last. It can be hard to compare induction schemes from similar
functions because ACL2's method of generating induction schemes does not
preserve the order of tips.
The prover attempts to reduce each of these two conjuncts to T by
simplification. The base case is generally simple. So consider the
induction step, the second implication above, which we abbreviate here as
(implies (and (consp x)
              ind-hyp1
              ind-hyp2)
         ind-concl)
where ind-hyp1, the first induction hypothesis, is (p (fn (car x) (cons 'car
y))), etc. See introduction-to-the-theorem-prover for more details of how the
rewriter works. When rewriting the formula above, the rewriter basically
proceeds left-to-right, innermost first, keeping track of what can
be assumed given the context in which terms occur. Thus, by the time the
rewriter gets to the induction conclusion, ind-concl, the two
induction hypotheses will have been rewritten, making them easier to find and
identify as true if they occur in the rewriting of the conclusion.
But rewriting ind-hyp1 here will involve rewriting (fn (car
x) (cons 'car y)), which is completely irrelevant to half of the proof of
ind-concl. In particular, inspection of fn reveals that the
first recursive call of fn is only relevant if (consp (car x)) is
true. But we know nothing about (car x) in this induction step.
Furthermore, rewriting a call of a recursive function can be very expensive
since it requires (a) rewriting the actual expressions, then (b) rewriting
the body of the function under a substitution replacing the function's
formals by the rewritten actuals, and then (c) heuristically deciding whether
the rewritten body is preferable to the call, given the subterms occurring
elsewhere in the goal.
Another way to describe the unfortunate induction scheme above is that it is
``coarse,'' because it provides a simple case analysis, which necessitates
lumping all the possibly relevant induction hypotheses into a single case (in
this particular example). When the induction conclusion is opened, after the
induction hypotheses have all been simplified, it will cause further splitting
(e.g., on (consp (car x))), and in some of those subgoals some of the
simplified induction hypotheses will be irrelevant (but will still burden the
simplification process if the cases are not proved immediately).
Depending on how hard it is to rewrite irrelevant induction hypotheses
— which depends not just on the function, like fn, suggesting the
induction but also on the conjecture being proved — it might be more
efficient to have a finer-grained case analysis so that the only induction
hypotheses in a given case are those on a given execution path.
A finer-grained induction scheme is generated, in the case of fn, by
arranging for lambda expressions not to stop the collection of
rulers. This can be done by adding the keyword :lambdas to the
ruler-extenders. The presence of that keyword means ``keep collecting rulers
as you dive into any term beginning with a lambda expression.''
(By the way, normally :ruler-extenders expects a list but if you specify
the single keyword :lambdas it denotes the list (if :lambdas mv-list
return-last).) Again, see rulers for details.
If we had defined fn as shown below we would get the finer induction
analysis shown subsequently.
(defun fn (x y)
  (declare (xargs :ruler-extenders :lambdas))
  (if (consp x)
      (let* ((y1 (if (consp (car x))
                     (fn (car x) (cons 'car y))
                   y))
             (y2 (if (consp (cdr x))
                     (fn (cdr x) (cons 'cdr y1))
                   y1)))
        (cons y1 y2))
    y))
Note that fn can be admitted without extending the rulers — we
have already demonstrated that. We are including :lambdas here
precisely to get a finer case analysis for induction.
The induction generated is shown below. We have slightly simplified the
induction by replacing (if a b c) by b when a
is among the hypotheses in the case analysis, and by replacing (if a
b c) by c when (not a) is among the
hypotheses. This simplification is actually not done by the induction analysis
but by the first round of simplification.
(and (implies (not (consp x)) (p (fn x y)))
     (implies (and (consp x)
                   (not (consp (car x)))
                   (not (consp (cdr x))))
              (p (fn x y)))
     (implies (and (consp x)
                   (not (consp (car x)))
                   (consp (cdr x))
                   (p (fn (cdr x) (cons 'cdr y))))
              (p (fn x y)))
     (implies (and (consp x)
                   (consp (car x))
                   (not (consp (cdr x)))
                   (p (fn (car x) (cons 'car y))))
              (p (fn x y)))
     (implies (and (consp x)
                   (consp (car x))
                   (consp (cdr x))
                   (p (fn (car x) (cons 'car y)))
                   (p (fn (cdr x)
                          (cons 'cdr (fn (car x) (cons 'car y))))))
              (p (fn x y))))
Note that the case analysis here steers the (fn x y) in the
conclusion down exactly one path and the only hypotheses provided in each
induction step concern recursive calls on that path.
The finer case analysis gives us two base cases and three induction steps,
one of which has two induction hypotheses. We'll encounter terms of the form
(p (fn ...)) nine times in using this scheme, instead of just four such
terms in the first scheme we showed. It may seem surprising that this scheme
ever leads to faster proofs than the earlier one. But it can. Whether it
does depends, as mentioned above, on the complexity of rewriting the
recursive calls involved and the conjecture being proved.
Note: The ruler-extenders may include arbitrary function symbols
and the keyword :lambdas as above. But the ruler-extenders can also be set to
:all, which makes all the governing tests rulers, i.e., the collection of
rulers dives through all function symbols. We focus here on diving through lambda
expressions because they are by far the most common ``function
symbol'' (other than if) under which one finds additional tests and
recursive calls.
In the next section we'll discuss a theorem that makes use of these
insights and demonstrates that the finer scheme can lead to faster
proofs.
An Actual Example of Coarse versus Fine Induction Schemes
The ACL2 prettyprinter is implemented with the function ppr. It is
basically the composition of two functions, ppr1, which takes the object
to be printed and computes a ``ppr tuple'' describing how much to indent each
subexpression and where to break the lines, and ppr2, which actually
prints. To admit ppr as a logic mode function we have to prove
termination and verify the guards of all the functions involved. To verify
the guards, we have to prove that ppr1 returns a ppr-tuple-p. You
can see the definitions of ppr1 and ppr-tuple-p and their mutually
recursive peers by using the :pe command, e.g., :pe
ppr1.
We'll focus here on ppr1, which is mutually recursive with
ppr1-lst. To prove anything interesting about a function in a mutually
recursive clique you generally have to simultaneously prove analogous
theorems about every function in the clique. That is most often done by
defining a ``flagged'' function which can recur as any function in the clique
according to a flag. The community book books/tools/flag provides a
convenient way to do that. Because that book introduces the flagged function
for you, it also allows the generated definition to be admitted with a
user-supplied ruler-extenders setting.
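For example, a flagged version of the ppr1/ppr1-lst clique that yields the
fine-grained case analysis might be introduced roughly as follows; the
flag-function name is illustrative and the full set of make-flag options is
documented with books/tools/flag.
(include-book "tools/flag" :dir :system)

(flag::make-flag flag-ppr1   ; name of the generated flagged function
                 ppr1        ; any member of the mutually recursive clique
                 :ruler-extenders :lambdas)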
A good demonstration of the ``coarse'' and ``fine'' induction schemes for
ppr1 can be found in the community book
books/demos/ppr1-experiments. We summarize the contents of that book
here.
The book proves that ppr1 returns a ppr-tuple-p and
ppr1-lst returns a ppr-tuple-lst-p. In fact, it does that proof
seven times, under coarse and fine inductions and variations on some hints.
The coarse induction is obtained by admitting a flagged version of
ppr1/ppr1-lst under the default ruler-extenders. The fine
induction is obtained by admitting a differently named flagged version of
ppr1/ppr1-lst with :ruler-extenders :lambdas.
The coarse induction, generated under the default ruler-extenders, has 76
cases. Six are base cases. The other 70 are induction steps with varying
numbers of induction hypotheses as given in the sketch below.
(76          ;; 76 cases in the induction scheme
 (0 . 6)     ;; no induction hyps in 6 cases, i.e., Base Cases
 (1 . 8)     ;; 1 induction hyp in 8 cases
 (2 . 2)     ;; 2 induction hyps in 2 cases
 (3 . 16)    ;; 3 induction hyps in 16 cases
 (4 . 32)    ;; 4 induction hyps in 32 cases
 (8 . 4)     ;; 8 induction hyps in 4 cases
 (9 . 8))    ;; 9 induction hyps in 8 cases
The fine induction, generated with the :lambdas ruler-extenders, can be
similarly sketched.
(256         ;; 256 cases in the induction scheme
 (0 . 6)     ;; no induction hyps in 6 cases, i.e., Base Cases
 (1 . 8)     ;; 1 induction hyp in 8 cases
 (2 . 82)    ;; 2 induction hyps in 82 cases
 (3 . 80)    ;; 3 induction hyps in 80 cases
 (4 . 80))   ;; 4 induction hyps in 80 cases
The induction generated under the :all ruler-extenders is identical to the
fine induction for ppr1 and ppr1-lst. (In those two functions, no governing
ifs are hidden under function calls other than calls of if and lambda
expressions.)
Regardless of which induction scheme we use, the proof that ppr1
returns a ppr-tuple-p and that ppr1-lst returns a
ppr-tuple-lst-p fails without a certain hint: we have to tell the
prover to expand the calls of ppr1 and ppr1-lst in the conclusions
of the induction steps. If you try to prove the theorem without the hint the
first checkpoint makes clear that you need the hint. (This is not an
uncommon problem in inductive proofs about mutually recursive functions.)
But in the most basic like-for-like comparison of the successful proof of the
theorem by each induction scheme augmented by an :expand hint, the
coarse induction takes 2,165 seconds and the fine induction takes 65 seconds.
See scenarios 1 and 3 in the next section.
Comparing Several Optimizations
After noticing the performance differences between the coarse and fine
inductions we spent some time trying to further optimize the proof. We
compared seven different combinations of approaches. We discuss them here.
See the community book books/demos/ppr1-experiments for the details of each hint, etc.
The times reported below were originally recovered from the .cert.out
file after certification of the book in September, 2023, using the
development copy of ACL2 slated to become Version 8.6, running in CCL on a
MacBook Pro. Inspect the .cert.out file for more recent results.
As mentioned previously, a common situation with inductive proofs about
complicated mutually recursive functions is that the calls in the conclusion
aren't always automatically opened by ACL2's heuristics. So it is not
unusual in such proofs to provide a hint that explicitly expands the
ppr1 and ppr1-lst calls in the conclusion. In these experiments we
provide that hint in two ways, either as part of a fairly sophisticated computed
hint or with an :expand hint on "Goal". If you try to prove this
theorem with no hint it fails.
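For readers unfamiliar with the latter, an :expand hint on "Goal" has roughly
the following shape; it is shown here for the artificial fn above purely for
illustration, whereas the hint in the book lists the relevant calls of ppr1
and ppr1-lst instead.
:hints (("Goal" :expand ((fn x y))))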
We will also be experimenting with the enabled status of ppr1 and
ppr1-lst.
The seven scenarios are specified by saying which induction scheme is
used, whether ppr1 and ppr1-lst are enabled or disabled by default,
and what hints are provided. There are three basic hints to choose from.
In addition, we sometimes avail ourselves of special-purpose lemmas.
- NIL lem — rewrites (ppr1-lst nil ...) to nil.
- MAX hack — lemmas about MAX allowing us to disable MAX.
We try seven different combinations, numbered 1-7, but listed below
in descending order of proof times.
 n  induction  status    hints                           proof time (seconds)
 1  coarse     enabled   :expand                                      2165.19
 2  coarse     disabled  computed                                      207.49
 6  fine       disabled  computed                                       82.29
 3  fine       enabled   :expand                                        65.00
 4  fine       disabled  computed + :expand                             62.86
 5  fine       disabled  computed' + :expand + NIL lem                  61.58
 7  fine       disabled  computed + :expand + MAX hack                  55.14
Note that scenarios 1 and 3 are a direct comparison of coarse and fine
induction with exactly the same hint. The coarse induction took 2165.19
seconds and the fine induction took 65.00, despite the fact that the coarse
induction had 76 cases to deal with and the fine induction had 256. The most
likely reason the coarse induction performed poorly is that there were 8
induction steps that had 9 induction hypotheses each, even though no case
ever actually required more than 4 induction hypotheses.
Scenario 2 shows that much time is saved by disabling ppr1 and ppr1-lst and
expanding them (when subgoals are stable under simplification) with the
computed hint. Note
that since the computed hint first expands them in the conclusion and lets
the resulting subgoals stabilize before enabling ppr1 and ppr1-lst,
the calls of ppr1 and ppr1-lst in the induction hypotheses do not
expand until the relevant case analysis is exposed by expanding the
conclusion.
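The book's computed hint is more elaborate than this, but the basic idiom of
waiting for a subgoal to become stable under simplification before acting
looks roughly as follows; the particular :in-theory action shown is merely
illustrative.
:hints ((and stable-under-simplificationp
             '(:in-theory (enable ppr1 ppr1-lst))))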
Scenarios 4, 5, 6, and 7 attempt to improve upon the time seen in scenario 3.
We see that the best performance on this particular problem comes from using
the fine induction case analysis, keeping the relevant recursive functions
disabled by default, using an :expand hint to open them in the conclusion,
also having a computed hint that expands them in the conclusions of stable
subgoals, and enabling those functions only if they are still present in the
subsequently stable subgoals. The MAX hack saves another 10% by
avoiding case splits caused by the many occurrences of MAX in
ppr1.
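We do not reproduce the book's MAX lemmas here, but lemmas of roughly the
following shape, followed by disabling max, are one plausible way to achieve
that effect; the rule names are illustrative and the book's actual lemmas may
differ.
(defthm max-is-at-least-first
  ;; (max a b) is at least its first argument
  (<= a (max a b))
  :rule-classes :linear)

(defthm max-is-at-least-second
  ;; (max a b) is at least its second argument
  (<= b (max a b))
  :rule-classes :linear)

(defthm natp-of-max
  ;; the maximum of two naturals is a natural
  (implies (and (natp a) (natp b))
           (natp (max a b)))
  :rule-classes :type-prescription)

(in-theory (disable max))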
However, it should be noted that the major source of improvement is the
use of the fine induction scheme. The fact that there is only a 10% further
improvement achieved by the various other hints suggests it may not be worth
the effort! Coding the computed hint took time and thought, compared to just
using an :expand hint. And it was a lot easier to leave ppr1 and
ppr1-lst enabled and let ACL2 decide when to expand them than it was to
disable them and control their expansion by hints and lemmas. One take-home
lesson for us was that ACL2's heuristics for opening recursive functions are
pretty good!
Ignoring scenario 7 — where the additional improvement came from a
completely different source, namely, avoiding the expansion of MAX
expressions — the difference between the easiest thing to do (scenario
3) and the fastest method that focused on manually controlling ppr1 and
ppr1-lst (scenario 5) was less than 4 seconds. So, as usual with all
kinds of performance optimization, don't get sucked down the rabbit hole!
Take the easy wins and get on with the rest of your project!
Conclusion
Of course, whether fine induction schemes will improve other proofs just
depends on how many irrelevant induction hypotheses are present and how
complicated it is to simplify the terms in the theorem. If you are proving
something about a recursive function that contains let* expressions in
which some of the bindings conditionally make recursive calls, be alert to
the possibility that adjusting ruler-extenders to include :lambdas may
give better inductive performance. If you witness the prover engaged in
fairly deep case splits, even as it seems always to prove the resulting
cases, you might look for ways to get a finer induction case analysis.