Operational Semantics 1: A Simple Example
M1: definition, rules, clocks, proofs
In this topic we explain, by example, the most common way to
formalize a computing machine in ACL2 and then reason about it. The machine
we have in mind will be called “M1” and is a “toy”
version of the Java Virtual Machine or “JVM.” More precisely, it
is a simple stack machine having a fixed number of registers, hereafter
called “local variables,” and an execute-only program memory.
There will only be eight instructions. We will then write and verify a
factorial program for it and mention many more M1 programs that have been
verified — and which we urge you to solve as practice problems.
Despite its simplicity, M1 is computationally equivalent to a Turing machine;
indeed, that equivalence is among the theorems proved about M1. Full references are given
when we survey the M1 results available.
(Historical Aside: What we're calling M1 was called ``Small-Machine'' in
Nqthm. See bib::bm96. The methodology described here was
essentially fully developed before ACL2, Java, or the JVM came along.)
M1 does not support bytecode verification, method invocation (procedure
call) and return, data objects other than ACL2's unbounded numbers, threads,
exceptions, or many other features of modern machines and languages.
However, M1 is an excellent place to start when learning how to formalize a
machine and to prove theorems about it. Furthermore, it is the starting
place of a series of machine models in the JVM family that we explore more
fully at the end of this documentation topic.
You can find the definition of M1 and all of the work done with it on the
ACL2 directory books/models/jvm/m1. It might be easiest to fire up
your ACL2 system and do this.
(include-book "models/jvm/m1/m1" :dir :system)
(in-package "M1")
Then, to see the definition of any symbol mentioned below you could just
issue the :pe command. For example, to see the definition of
the function execute-ILOAD, aka execute-iload, you could type
:pe execute-iload to the interactive prompt in your ACL2 session. This
doc topic will not exhibit all the functions but will give examples of each
“kind” of function involved in M1.
Organization of This Topic
- Setting up a Symbol Package
- The Definition of M1
- Programming M1
- “Teaching” the Prover How to Control M1
- Arithmetic
- Resolving “Reads” and “Writes”
- Keeping “Abstractions” Abstract
- Controlling Case Explosion due to Step
- Clock Expressions
- Expanding m1
- Playing with Program *pi*: Execution, Symbolic Execution, and Proof of
Correctness
- The Clock for Factorial
- Computing with M1
- Symbolic Execution of M1 Code
- Proving Theorems about M1 Programs
- More M1 Programs and Proofs
- More Elaborate Models of the JVM: From M1 to M6
Setting up a Symbol Package
All of the functions involved in the definition of M1 are in a new symbol
package named "M1"
(defpkg "M1"
(set-difference-eq
(union-eq *acl2-exports*
*common-lisp-symbols-from-main-lisp-package*)
'(push pop pc program step nth update-nth nth-update-nth)))
Note that we put all the usual ACL2 symbols and Common Lisp symbols in the
new package, except for a few whose names clash with names we want to define
for use in our model.
See the ACL2 file books/models/jvm/m1/m1.acl2.
The Definition of M1
See the ACL2 file books/models/jvm/m1/m1.lisp for all of the functions mentioned
below.
The state, s, of M1 will be given by a list of four elements, accessed with
the functions named below.
- pc — a natural number giving the position in (program s) of the
next instruction
- locals — a list containing the values, by position, of the
local variables of the program
- stack — a list representing a stack of intermediate results,
the car being the topmost value
- program — a list of instructions (see below)
States will be constructed with
(defun make-state (pc locals stack program)
(cons pc
(cons locals
(cons stack
(cons program
nil)))))
Thus, (make-state pc locals stack program) is just (list pc locals
stack program). The state accessors just select the appropriate elements
of a state, e.g., (stack (make-state pc locals stack program)) is just
stack.
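This topic does not exhibit the state accessors, but a sketch consistent with the description above is the following (the official definitions are in m1.lisp and may differ in form):

(defun pc (s) (car s))                        ; position of the next instruction
(defun locals (s) (car (cdr s)))              ; values of the local variables
(defun stack (s) (car (cdr (cdr s))))         ; stack of intermediate results
(defun program (s) (car (cdr (cdr (cdr s))))) ; the execute-only program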
To “abstractly” treat a list as a stack we define
(defun push (x y) (cons x y)) ; return a new stack with x on top of stack y
(defun top (stack) (car stack)) ; return topmost element of stack
(defun pop (stack) (cdr stack)) ; remove topmost element and return that stack
To access and update the elements of a list by 0-based positions we define
(defun nth (n list)
(if (zp n)
(car list)
(nth (- n 1) (cdr list))))
(defun update-nth (n v list)
(if (zp n)
(cons v (cdr list))
(cons (car list)
(update-nth (- n 1) v (cdr list)))))
Thus, (nth 3 '(10 20 30 40 50)) is 40 and (update-nth 3 45
'(10 20 30 40 50)) is (10 20 30 45 50).
The eight M1 instructions are
- (ICONST i) — push i onto the stack and advance the pc by
1
- (ILOAD i) — push the value of the ith local onto the
stack and advance the pc by 1
- (ISTORE i) — pop the topmost value from the stack, store it as
the value of the ith local, and advance the pc by 1
- (IADD) — pop the topmost value, u, and value just under
it, v, from the stack, push (+ v u) onto the stack, and
advance the pc by 1
- (ISUB) — pop the topmost value, u, and value just under
it, v, from the stack, push (- v u) onto the stack, and
advance the pc by 1
- (IMUL) — pop the topmost value, u, and value just under
it, v, from the stack, push (* v u) onto the stack, and
advance the pc by 1
- (GOTO i) — advance the pc by i
- (IFEQ i) — pop the topmost value from the stack and if it is
0, then advance the pc by i and otherwise advance the pc by 1
In the case of both GOTO and IFEQ, the pc may be
“advanced” by a negative amount.
To access the components of an instruction we define
(defun op-code (inst) (nth 0 inst)) ; return the op-code of instruction inst
(defun arg1 (inst) (nth 1 inst)) ; return the operand of instruction inst
For each instruction we define the function that takes the instruction and a state and
returns the next state. For example,
(defun execute-ICONST (inst s)
(make-state (+ 1 (pc s))
(locals s)
(push (arg1 inst) (stack s))
(program s)))
(defun execute-ILOAD (inst s)
(make-state (+ 1 (pc s))
(locals s)
(push (nth (arg1 inst)
(locals s))
(stack s))
(program s)))
(defun execute-ISTORE (inst s)
(make-state (+ 1 (pc s))
(update-nth (arg1 inst) (top (stack s)) (locals s))
(pop (stack s))
(program s)))
(defun execute-IADD (inst s)
(declare (ignore inst))
(make-state (+ 1 (pc s))
(locals s)
(push (+ (top (pop (stack s)))
(top (stack s)))
(pop (pop (stack s))))
(program s)))
(defun execute-IFEQ (inst s)
(make-state (if (equal (top (stack s)) 0)
(+ (arg1 inst) (pc s))
(+ 1 (pc s)))
(locals s)
(pop (stack s))
(program s)))
The other instructions are analogous. Many users define an ACL2 macro,
typically named modify that takes a machine state and some keyword
argument and builds a new state from the given one, with new values for the
keys specified with keywords. So for example, in that idiom one might
replace the make-state call in execute-IADD by
(modify s :pc (+ 1 (pc s))
:stack (push (+ (top (pop (stack s)))
(top (stack s)))
(pop (pop (stack s)))))
omitting any mention of locals and program because they are not
changed. But for beginners we prefer the make-state idiom because it is
more explicit.
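For illustration, here are sketches of two of the remaining “execute” functions, written in the same make-state idiom as those above (see m1.lisp for the official definitions):

(defun execute-GOTO (inst s)
  (make-state (+ (arg1 inst) (pc s)) ; advance the pc by arg1, possibly negative
              (locals s)
              (stack s)
              (program s)))

(defun execute-ISUB (inst s)
  (declare (ignore inst))
  (make-state (+ 1 (pc s))
              (locals s)
              (push (- (top (pop (stack s))) ; v - u, where u is topmost
                       (top (stack s)))
                    (pop (pop (stack s))))
              (program s)))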
We wrap all those “execute” functions up into a
big-switch.
(defun do-inst (inst s) ; ``do'' instruction inst to state s
(if (equal (op-code inst) 'ILOAD)
(execute-ILOAD inst s)
(if (equal (op-code inst) 'ICONST)
(execute-ICONST inst s)
(if (equal (op-code inst) 'IADD)
(execute-IADD inst s)
(if (equal (op-code inst) 'ISUB)
(execute-ISUB inst s)
(if (equal (op-code inst) 'IMUL)
(execute-IMUL inst s)
(if (equal (op-code inst) 'ISTORE)
(execute-ISTORE inst s)
(if (equal (op-code inst) 'GOTO)
(execute-GOTO inst s)
(if (equal (op-code inst) 'IFEQ)
(execute-IFEQ inst s)
s)))))))))
Observe that if do-inst is called on an unknown instruction it returns
the state unchanged. Thus (do-inst '(HALT) s) is just s.
To access the next instruction in a state we define
(defun next-inst (s)
(nth (pc s) (program s)))
In some of our theorems about m1 it is convenient to have a way to
say “the machine has reached the (HALT) instruction, i.e., the
pc points to (HALT),” so we define
(defun haltedp (s)
(equal (next-inst s) '(HALT)))
To “step” the machine just once we define
(defun step (s)
(do-inst (next-inst s) s))
Finally we define m1 to step n times.
(defun m1 (s n)
(if (zp n)
s
(m1 (step s) (- n 1))))
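Unwinding this definition on a small clock shows that m1 just iterates step:

; (m1 s 2)
; = (m1 (step s) 1)        ; 2 is not zp, so step once and recur
; = (m1 (step (step s)) 0) ; 1 is not zp, so step again
; = (step (step s))        ; clock exhausted; return the state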
In a model like this one, where we control the length of the run by a
natural-number step count, we often call the second argument of m1 the
“clock.” One might think of it as counting “cycles”
but not “run time.” In some models the “clock” might
more reasonably be called a “schedule” (specifying which process is
to step next), or “inputs” (specifying what signals appear on
certain pins in the next cycle), or “oracle” (specifying
“random” choices).
Programming M1
Below we exhibit an M1 program for computing factorial. For convenience
we define a constant with this program as its value so we can refer to it.
We name the constant “pi” for “program”. In the
comments to the right of the instructions we show the position (i.e., the
corresponding pc) of the instruction and pseudocode for the nearby snippet of
code. In this program we will have just two local variables, called
“n” and “ans” in the pseudocode, in
positions 0 and 1, respectively, of the locals. We'll compute (fact
n) and leave the result in ans and on top of the stack.
(defconst *pi*
; pc pseudo-code
'((iconst 1) ; 0 [Block 1]
(istore 1) ; 1 ans := 1;
(iload 0) ; 2 loop: [Block 2]
(ifeq 10) ; 3 if n=0 then goto exit (i.e., pc+10);
(iload 1) ; 4 [Block 3]
(iload 0) ; 5
(imul) ; 6
(istore 1) ; 7 ans := ans * n;
(iload 0) ; 8 [Block 4]
(iconst 1) ; 9
(isub) ; 10
(istore 0) ; 11 n := n - 1;
(goto -10) ; 12 goto loop; [Block 5]
(iload 1) ; 13 exit: [Block 6]
(halt))) ; 14 return ans;
We've put blank lines in the display to break the program into blocks,
which we've numbered. We'll describe each block's effect now, just to drive
home how m1 behaves. We will use these blocks in some examples later.
Block 1 is the “compilation” of “ans := 1;” The
(iconst 1) pushes 1 onto the stack and the (istore 1) pops it
off into local 1, which we're calling “ans”.
We have labeled the top of Block 2 “loop.” Although it is
not syntactically obvious in this assembly-like code, subsequent code will
jump back to pc 2.
Block 2 is the compilation of “if n=0, then goto exit,”
where we labeled pc 13 “exit.” In particular, the
(iload 0) at pc 2 pushes the value of “n” onto
the stack. The instruction at pc 3, (ifeq 10), pops it off and
tests it against 0. If n=0, the ifeq increments the pc
by 10, transferring control to pc 13.
Block 3 is the compilation of “ans := ans * n;”. Walk through that
code segment to see how the stack is used.
Block 4 is the compilation of “n := n - 1;”.
Block 5 jumps back to pc 2 (i.e., loop). In particular, the
(goto -10) at pc 12 adds -10 to the pc.
Block 6 is the compilation of “return ans.” It just
pushes local 1 onto the stack and halts, leaving the pc at 14.
If all we wanted to do with m1 were to run programs on it, we'd be done!
M1 as defined above can be run on concrete input to produce concrete
results.
For example, to compute factorial of 6 we would start with this state:
(make-state 0 ; pc - first instruction
(list 6 0) ; locals - n = 6, ans = 0
nil ; stack - empty
*pi*) ; program
and run it with m1. Of course, we need to specify the
“clock,” the number of steps we want to take. In the absence of
any better idea, we could just try 1000 and see if the final state is at
pc 14, which is the (HALT).
M1 !>(m1 (make-state 0 ; pc
(list 6 0) ; locals
nil ; stack
*pi*) ; program
1000)
(14 (0 720) ; pc and locals
(720) ; stack
((ICONST 1) ; program
(ISTORE 1)
(ILOAD 0)
(IFEQ 10)
(ILOAD 1)
(ILOAD 0)
(IMUL)
(ISTORE 1)
(ILOAD 0)
(ICONST 1)
(ISUB)
(ISTORE 0)
(GOTO -10)
(ILOAD 1)
(HALT)))
Notice that the machine reached the (HALT) at pc 14 and 720
is on top of the stack. We also may note that local 0 has been zeroed out
and local 1 contains 720.
We will deal with the clock situation more carefully later. It should be
noted that our definition of m1 actually executes 1000 steps if that's
what the clock says. In particular, the HALT instruction is a no-op,
making no state change, and m1 just continues to bang away while sitting
on that instruction until the clock runs out. We could, of course, define
m1 differently, so as to return s whenever (step s) =
s. But we'll keep m1 simple.
“Teaching” the Prover How to Control M1
We'll now turn our attention to configuring ACL2 to prove things about M1
programs. This basically consists of proving lemmas to be used as
:rewrite rules and, sometimes, disabling functions so that their
calls don't expand automatically. We'll explain the motivations of our
configuration as we go, but true understanding of the motivation won't come
until we prove an M1 program correct. Be patient.
This section only exhibits some of the rules we introduce. See the ACL2
file books/models/jvm/m1/m1.lisp for all of the events. By the way, the
sequence in which these definitions and lemmas appear below is not identical
to the sequence in the m1.lisp file, but they're all there. In telling
the story we just found the sequence below a little more natural.
Arithmetic
To be able to reason about simple arithmetic, we include one of the
standard arithmetic books.
(include-book "arithmetic-5/top" :dir :system)
Resolving “Reads” and “Writes”
The local variables are read and written positionally using nth and
update-nth. It is helpful to have the following lemmas to resolve reads
after writes and eliminate shadowed writes. (These lemmas are not necessary
in most of our m1 proofs because the locals are generally expressed as a
“semi-concrete” list (i.e., a cons tree with symbolic
expressions in the car positions) of a fixed length, like (list a b
c), and reads and writes always address an explicitly specified position,
like 2. So even after a sequence of m1 instructions storing values,
the symbolic expression of the locals remains a fixed-length semi-concrete
list. For example, (update-nth 2 (+ u v) (list a b c d)) is
automatically rewritten to (list a b (+ u v) d) just by the definition
of update-nth. But in more general situations, where the position of
the local variable is not an explicit constant or the symbolic expression of
the current values of the locals in the state is not an explicit cons
tree, the lemmas below are crucial.)
(defthm nth-add1!
(implies (natp n)
(equal (nth (+ 1 n) list)
(nth n (cdr list)))))
(defthm nth-update-nth
(implies (and (natp i) (natp j))
(equal (nth i (update-nth j v list))
(if (equal i j)
v
(nth i list)))))
(defthm update-nth-update-nth-1
(implies (and (natp i) (natp j) (not (equal i j)))
(equal (update-nth i v (update-nth j w list))
(update-nth j w (update-nth i v list))))
:rule-classes ((:rewrite :loop-stopper ((i j update-nth)))))
(defthm update-nth-update-nth-2
(equal (update-nth i v (update-nth i w list))
(update-nth i v list)))
Keeping “Abstractions” Abstract
To allow us to reason about “abstract” structures represented
by conses, without expanding their definitions, we prove such lemmas
as
(defthm stacks
(and (equal (top (push x s)) x)
(equal (pop (push x s)) s)
; These next two are needed because some push expressions evaluate to
; list constants, e.g., (push 1 (push 2 nil)) becomes '(1 2) and '(1
; 2) pattern-matches with (cons x s) but not with (push x s).
(equal (top (cons x s)) x)
(equal (pop (cons x s)) s)))
We prove the analogous rules about state accessors and make-state,
e.g., (equal (pc (make-state pc locals stack program)) pc).
Then we disable all the defined functions just mentioned so they never
expand.
Controlling Case Explosion due to Step
Next we introduce a crucial rule for controlling the expansion of
step. The term (step s) can always be expanded into that
big-switch that considers all the possible op-codes. That is generally
a disaster. Think of what would happen if (step (step s)) were
automatically expanded when we have no information about s: the inner
step would introduce a 9-way case split and the outer one would turn
that into an 81-way case split considering all of the possibilities for the
first two instructions. We want to expand step only when we know
something definite about the instruction that is to be executed. So we prove
this logically trivial theorem.
(defthm step-opener
(implies (consp (next-inst s))
(equal (step s)
(do-inst (next-inst s) s))))
The conclusion is just the definition of step! So the hypothesis is
completely unnecessary from a logical perspective. But operationally, if
this rule has been proved and then step is disabled, the rewriter will
expand (step s) only if (next-inst s) can be proved to satisfy
consp. The most common way for (next-inst s) to be a consp is
when both the pc and the program in state s are quoted
constants and program is a well-formed program. But we don't want to
require exactly that because it is too restrictive. For example, in machine
models supporting subroutine calls, the typical correctness theorem for a
subroutine says very little about the entire ``program space'' but deals with
an invoke- or jsr-type instruction to a place where the code for
the subroutine is found. (If you inspect
books/models/jvm/m2/examples.lisp and look at example4 you will see
such a correctness theorem.) As models get more complex you may have to
adjust step-opener accordingly, though this simple version is
surprisingly effective.
Having established step-opener, we disable step so that the only
way it ever expands is when the next-inst is provably a consp.
(in-theory (disable step))
Symbolic Execution Example 1
To illustrate the rules just described, consider the symbolic expression
(step
(step
(make-state 0
(list n ans)
stack
*pi* ; '((ICONST 1) ; 0 [Block 1]
; (ISTORE 1) ; 1 ans := 1;
; ... ; 2
; ...) ; ...
)))
Note that the outer step cannot be expanded because step is disabled
and we can't determine the next-inst of the (unsimplified) inner step.
But step-opener fires on the inner step because next-inst
of the above make-state just rewrites to '(ICONST 1) using the
definitions of pc, program, and nth. So the term above is
rewritten to
(step
(make-state 1
(list n ans)
(push 1 stack)
*pi* ; '((ICONST 1) ; 0 [Block 1]
; (ISTORE 1) ; 1 ans := 1;
; ... ; 2
; ...) ; ...
))
using step-opener, next-inst, do-inst, execute-iconst,
nth, and update-nth. Execute-iconst just advanced the pc
from 0 to 1 and symbolically pushed a 1 onto the stack.
But now step-opener can fire on that outer step because we see
that the next-inst of this state is (ISTORE 1). So the term rewrites
to
(make-state 2
(list n 1)
stack
*pi* ; '((ICONST 1) ; 0 [Block 1]
; (ISTORE 1) ; 1 ans := 1;
; ... ; 2
; ...) ; ...
)
We've just symbolically executed the first two instructions of *pi*
on a symbolic state containing two unspecified locals and an unspecified
stack. This simplification is completely automatic.
Symbolic Execution Example 2
We'll show one more symbolic execution of a snippet from program
*pi*. This time, let's start at the top of the loop in *pi*, pc
2, and run down through the (GOTO -10) where the program jumps back to
the loop. That takes 11 steps. Furthermore, assume that n (local
0) is a non-0 natural number and ans (local 1) is a natural number.
(m1 (make-state 2 (list n ans) stack *pi*) 11)
=
(step
(step
(step
(step
(step
(step
(step
(step
(step
(step
(step
(make-state 2
(list n ans)
stack
*pi* ; '(... ; ...
; (ILOAD 0) ; 2 loop: [Block 2]
; (IFEQ 10) ; 3
; (ILOAD 1) ; 4 [Block 3]
; (ILOAD 0) ; 5
; (IMUL) ; 6
; (ISTORE 1) ; 7
; (ILOAD 0) ; 8 [Block 4]
; (ICONST 1) ; 9
; (ISUB) ; 10
; (ISTORE 0) ; 11
; (GOTO -10) ; 12 [Block 5]
; ...)
))))))))))))
(Exceedingly deep nests of steps can cause stack overflow in the ACL2
rewriter. We discuss this problem in the presentation of the m1-opener
lemma below.)
We won't show the step-by-step results of this symbolic execution (thank
goodness!), but the first two instructions just move the pc from 2 to 4
because n is not 0. The next four replace ans by (* ans n),
the next four replace n by (- n 1), and the last step sets the
pc to the top of the loop at pc 2 again. Despite the pushing and
popping on the stack, by the time we get back to the top of the loop, the
stack is unchanged. The result is
(make-state 2
(list (- n 1) (* ans n))
stack
*pi* ; '(... ; ...
; (ILOAD 0) ; 2 loop: [Block 2]
; (IFEQ 10) ; 3
; (ILOAD 1) ; 4 [Block 3]
; (ILOAD 0) ; 5
; (IMUL) ; 6
; (ISTORE 1) ; 7
; (ILOAD 0) ; 8 [Block 4]
; (ICONST 1) ; 9
; (ISUB) ; 10
; (ISTORE 0) ; 11
; (GOTO -10) ; 12 [Block 5]
; ...)
)
We'll see this expression again later. What's important is that with the
step rules we have in place, ACL2 will automatically do symbolic
evaluation of concrete code sequences.
Clock Expressions
Many (most?) of our theorems about specific M1 programs will involve a
clock expression which specifies exactly how many steps we want the machine
to take. It is not as hard as you might think to create functions that
compute this — provided you can prove that the clock function
terminates, which is of course generally undecidable but frequently trivial.
Moreover, proofs via clock functions are equivalent to proofs by
inductive assertions; that equivalence has been proved with ACL2, and tools are
provided to convert back and forth between the two proof styles (see bib::rm04).
Clock functions for m1 are recursively defined functions that use
arithmetic expressions to compute the lengths of straightline code segments.
However, a key part of our strategy for controlling proofs is to use the
structure of the clock function and its arithmetic expressions to decompose
“long” runs of m1 into compositions of shorter runs. In
order to do that, we must prevent the prover from rearranging our clocks!
That is, (m1 s (+ i j)) will decompose differently than (m1 s (+ j
i)), but the arithmetic library might rearrange the clock, e.g., by using
the commutativity of addition. So instead of using + to express the
addition of two clocks we define clk+ to be +, but we'll disable
its definition to protect clocks from arithmetic reasoning. We will arrange
for clk+ to take more than just two arguments and will reveal that it is
associative.
(defun binary-clk+ (i j)
(+ (nfix i) (nfix j)))
(defthm clk+-associative
(equal (binary-clk+ (binary-clk+ i j) k)
(binary-clk+ i (binary-clk+ j k))))
(defmacro clk+ (&rest args)
(if (endp args)
0
(if (endp (cdr args))
(car args)
`(binary-clk+ ,(car args)
(clk+ ,@(cdr args))))))
Thus (clk+ a b c) is just an abbreviation for (binary-clk+
a (binary-clk+ b c)), and the prover will rewrite (clk+ (clk+ a b) (clk+
c d)) to the binary-clk+ version of (clk+ a b c d).
The key fact about clocks and m1 is captured in the following rewrite
rule, which is sometimes called the “sequential execution” rule
or the “semi-colon rule.”
(defthm m1-clk+
(equal (m1 s (clk+ i j))
(m1 (m1 s i) j)))
Having proved this, we disable binary-clk+ so that we have complete
control over how long runs decompose.
Expanding m1
Of course, m1 must eventually expand. The following rule effectively
makes the prover open m1 only when the clock is a natural number
constant. (We say “effectively” because we assume that all clock
expressions are phrased in terms of clk+.) The rule below exploits the
fact that the prover sees 7, for example, as (+ 1 6).
(defthm m1-opener
(and (equal (m1 s 0) s)
(implies (natp i)
(equal (m1 s (+ 1 i))
(m1 (step s) i)))))
We then disable m1.
Obscure Remark: Experiments with the codewalker tool (see the
ACL2 book books/projects/codewalker/codewalker.lisp, specifically the
discussion of “snorkling”) suggest that nests of steps more
than several hundred deep (depending on the complexity of step) can
cause stack overflow. One way to handle that is to modify m1-opener,
above, using a syntaxp hypothesis so that it doesn't fire if i
is, say, larger than 200. Then prove a variant for “large”
natural numbers, i, that transforms (m1 s i) into (m1 (m1 s
200) (- i 200)), using syntaxp to limit the application to s
terms that are not calls of either m1 or step, and i
terms that are quoted natural numbers. This variant can be derived from
m1-clk+. (Note: a hint is necessary to keep our other rules from
interfering with the obvious proof.) The effect of this more sophisticated
variant of m1-opener is that an expression like (m1 s 900) is
rewritten first to (m1 (m1 s 200) 700); the sophisticated rule won't
fire on the outer m1 because it is applied to an m1 term, but the
conventional m1-opener will fire on the inner (m1 s 200) because of
the small clock. That will reduce the inner m1 to a semi-explicit
state, and then another 200 steps will be taken from the 700, etc. We do not
recommend doing this until you start to get stack overflows, but we point out
the possibility to highlight the tricks you can use to control
expansions.
But for the rest of this discussion we assume we just have the
conventional m1-opener. The rules shown above have the effect of
rewriting the following m1 call successively into the last term
below.
(m1 (make-state 0 (list n ans) stack *pi*)
(clk+ 2 (loop-clk n)))
= {by m1-clk+}
(m1 (m1 (make-state 0 (list n ans) stack *pi*)
2)
(loop-clk n))
= {by m1-opener}
(m1 (step (step (make-state 0 (list n ans) stack *pi*)))
(loop-clk n))
= {by step-opener}
(m1 (make-state 2
(list n 1)
stack
*pi*)
(loop-clk n))
The last simplification above is just the symbolic evaluation of the first
two instructions of *pi*, as we showed in Symbolic Execution Example 1.
We'll see the final m1 expression again later, when we prove that
*pi* computes the factorial function.
Playing with Program *pi*: Execution, Symbolic Execution, and Proof of
Correctness
In this section we use program *pi* to compute factorial on a concrete
state, essentially using our formal model as a prototype of the envisioned
machine. We show two examples, n = 6 and n = 1000. The latter
requires several thousand m1 steps and allows us to time the execution
speed of m1. We then show the theorem prover doing symbolic execution
of a snippet of *pi* to help us “reverse engineer” the loop.
Finally, we prove that when program *pi* is run on a natural number
n in local 0, it halts and leaves (fact n) on top of the stack,
where fact is defined in the classic recursive way. In fact, we prove
something stronger and explain why.
The Clock for Factorial
If we start at pc 0, how many steps must we take to reach the
halt? We take 2 steps to reach the instruction labeled loop at
pc 2. We'll then travel around the loop some “unknown”
number of times as a function of n before exiting and reaching the
HALT. (Technically the iteration might also depend on ans but in
this case it does not.) Let (loop-clk n) be the number of steps we take
from pc 2 to the HALT. After defining loop-clk we can define
the clock for *pi* to be
(defun clk (n)
(clk+ 2
(loop-clk n)))
To derive a definition for loop-clk just walk the code starting at
pc 2. The code tests whether n is zero and if so jumps to 13, where
it puts ans on the stack and halts. Thus, if n is zero, we take 3
steps. If n is not zero (which takes 2 steps to determine), we multiply
n into ans (at pcs 4-7, which takes 4 steps), decrement
n (at pcs 8-11, which takes 4 steps), and then at pc 12 the
code jumps back to loop (which takes 1 step). So after a total of 2+4+4+1 =
11 steps we're back at pc 2 having decremented n by 1. So here is
how many steps we take from pc 2 to the halt:
(defun loop-clk (n)
(if (zp n)
3
(clk+ 11
(loop-clk (- n 1)))))
Question: how many steps does it take to compute 6!? Answer: (clk
6) = 71.
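To check that arithmetic, note that each trip around the loop costs 11 steps and exiting costs 3, so (loop-clk n) is just 11n + 3 and (clk n) is 11n + 5. For n = 6 that is 66 + 5 = 71. One could make this official with a lemma like the following (the lemma name is ours; it is not part of m1.lisp):

(defthm loop-clk-closed-form ; hypothetical lemma, not in m1.lisp
  (implies (natp n)
           (equal (loop-clk n)
                  (+ (* 11 n) 3))))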
Computing with M1
If we fire up ACL2, include the m1 book, select the "M1"
symbol package, and define *pi*, loop-clk, and clk as shown
above, we can use m1 to compute various factorials.
M1 !>(m1 (make-state 0 ; pc
(list 6 0) ; locals n = 6 and ans = 0
nil ; stack (empty)
*pi*) ; program
(clk 6))
(14 (0 720)
(720)
((ICONST 1)
(ISTORE 1)
(ILOAD 0)
(IFEQ 10)
(ILOAD 1)
(ILOAD 0)
(IMUL)
(ISTORE 1)
(ILOAD 0)
(ICONST 1)
(ISUB)
(ISTORE 0)
(GOTO -10)
(ILOAD 1)
(HALT)))
Observe the final state: the pc is 14, which points to the
(HALT). Local 0, i.e., n, has been zeroed out but local 1, i.e.,
ans, contains 720. Furthermore, 720 has been pushed onto the
stack. The program, of course, is unchanged. And by the way, 720 =
6*5*4*3*2*1.
Of course, we can define factorial recursively in ACL2 and check the
m1 computation.
M1 !>(defun fact (n)
(if (zp n)
1
(* n (fact (- n 1)))))
The admission of FACT is trivial, ...
...
M1 !>(fact 6)
720
We can compute larger factorials with m1, of course. Here's how we
compute 1000! with m1.
M1 !>(m1 (make-state 0 (list 1000 0) nil *pi*) (clk 1000))
It takes SBCL about 0.02 seconds to compute the answer. The answer is a
natural number with 2,568 decimal digits, so we won't show it here, but you
can try it on your own. (Clk 1000) is 11,005, so during this particular
computation, ACL2 was executing 550,250 m1 instructions per second. Our
model would run faster if we used a single-threaded object (see stobj) to
hold the state and faster still if we verified the guards of m1. We
point to discussions and examples of these ideas at the end of this
topic.
Symbolic Execution of M1 Code
We can use the theorem prover to help us reverse engineer code. For
example, what does the loop in *pi* do? We actually posed the question
in Symbolic Execution Example 2 above, i.e., what is (m1 (make-state
2 (list n ans) stack *pi*) 11) when n and ans are natural numbers
and n is not 0. We can pose a (non-)theorem to the prover to see.
The lefthand side of the equality just simplifies to the nest of 11
steps from Symbolic Execution Example 2. The righthand side, below, is
the variable ???, so this formula is obviously not a theorem, but the
prover will simplify the formula and print the checkpoint.
(thm (implies (and (natp n)
(< 0 n)
(natp ans))
(equal (m1 (make-state 2 (list n ans) stack *pi*) 11)
???)))
Of course, the proof attempt fails. But if you look at the checkpoint you
should recognize the simplified expression as being from Example 2.
Goal''
(IMPLIES (AND (INTEGERP N)
(<= 0 N)
(< 0 N)
(INTEGERP ANS)
(<= 0 ANS))
(EQUAL (MAKE-STATE 2 (LIST (+ -1 N) (* ANS N))
STACK
'((ICONST 1)
(ISTORE 1)
(ILOAD 0)
(IFEQ 10)
(ILOAD 1)
(ILOAD 0)
(IMUL)
(ISTORE 1)
(ILOAD 0)
(ICONST 1)
(ISUB)
(ISTORE 0)
(GOTO -10)
(ILOAD 1)
(HALT)))
???))
Proving Theorems about M1 Programs
In this section we prove that *pi* computes factorial.
Step 0: Define the specification. We've already done that:
*pi* allegedly computes (fact n) when n is a natural number.
This is just shorthand for the more precise understanding that if we start at
pc 0 with a natural, n, in local 0 and run program *pi*
(clk n) steps, the final state has pc 14 (meaning execution reached
the (HALT)), local 0 has been zeroed, local 1 contains (fact n),
and (fact n) is on top of the otherwise unchanged initial
stack.
Step 1: We start by defining an ACL2 function that “does what
the loop does.” We sometimes call this the semantic function
corresponding to the program.
(defun helper (n ans)
(if (zp n)
ans
(helper (- n 1) (* ans n))))
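As a quick sanity check, helper started with ans = 1 computes factorial:

M1 !>(helper 6 1)
720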
Step 2: Prove that the loop does what we said it would do.
(defthm loop-correct
(implies (and (natp n)
(natp ans))
(equal (m1 (make-state 2
(list n ans)
stack
*pi*)
(loop-clk n))
(make-state 14
(list 0 (helper n ans))
(push (helper n ans) stack)
*pi*))))
This proof is completely “automatic” — but only because
we stated the theorem exactly right, in terms of helper to
“explain” what the loop is doing with the second local, ans.
Once you get used to this style of proof, Step 2 is not really very creative:
it just says what the program does operationally, using tail recursion in
place of iteration. Of course, recursion is ACL2's bread and butter.
The proof of loop-correct is by the induction suggested by
(helper n ans). The induction scheme is
(AND (IMPLIES (AND (NOT (ZP N))
(:P (* ANS N) (+ -1 N) STACK))
(:P ANS N STACK))
(IMPLIES (ZP N) (:P ANS N STACK)))
If you work it out you'll see that in the induction conclusion
(loop-clk n) expands to (clk+ 11 (loop-clk (- n 1))), and then the
machinery we've been talking about reduces the induction conclusion to the
induction hypothesis. Do it for yourself. You will see the simplifications
done earlier in Symbolic Execution Example 2. (There is another basic case
we haven't explicitly discussed: what happens when we're at the top of the
loop but n is 0? That case is handled by symbolic execution:
(loop-clk n) is 3: the IFEQ jumps to the exit, where ans is pushed onto
the stack and the pc advances to the (HALT) at 14.)
Step 3: Relate the helper to the specification.
(defthm helper-is-fact
(implies (and (natp n)
(natp ans))
(equal (helper n ans)
(* ans (fact n)))))
This is proved “automatically.”
Generally speaking, Step 3 is the most creative step because it requires
explaining how the “iterative” (tail-recursive) accumulation of
the answer relates to the recursive computation of the answer.
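The shape of this lemma is easy to spot-check numerically. The sketch below uses math.factorial in place of the ACL2 fact function and an iterative rendering of helper; both stand-ins are our own, chosen only to illustrate the identity (helper n ans) = (* ans (fact n)).

```python
import math

def helper(n, ans):
    # Iterative rendering of the tail-recursive helper: at each step
    # the accumulator is multiplied by n and then n is decremented.
    while n > 0:
        ans, n = ans * n, n - 1
    return ans

# Spot-check the identity (helper n ans) = (* ans (fact n))
# over a small range of naturals.
for n in range(8):
    for ans in range(5):
        assert helper(n, ans) == ans * math.factorial(n)
```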
Step 4: Put it all together in a statement of the total correctness
of *pi*.
(defthm correctness-of-*pi*
(implies (natp n)
(equal (m1 (make-state 0
(list n ans)
stack
*pi*)
(clk n))
(make-state 14
(list 0 (fact n))
(push (fact n) stack)
*pi*))))
This is proved “automatically” because of our setup.
(Clk n) expands to (clk+ 2 (loop-clk n)) and the crucial
m1-clk+ (“sequential execution”) rule shows it is equivalent
to
(m1 (step (step (make-state 0
(list n ans)
stack
*pi*)))
(loop-clk n))
But then the step rules can reduce that to:
(m1 (make-state 2
(list n 1)
stack
*pi*)
(loop-clk n))
at which point the loop-correct lemma can fire and produce
(make-state 14
(list 0 (helper n 1))
(push (helper n 1) stack)
*pi*)
and then (helper n 1) is rewritten to (fact n) by helper-is-fact and
arithmetic, reducing the lefthand side to the righthand side. (The above description
is inaccurate only in the sequencing of the various rewrites.) Q.E.D.
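To make the clock arithmetic above concrete, here is a hypothetical Python counterpart of the two clock functions, built from the expansions quoted in the text: (loop-clk n) is 3 when n is 0 and otherwise (clk+ 11 (loop-clk (- n 1))), and (clk n) is (clk+ 2 (loop-clk n)).

```python
def loop_clk(n):
    # Base case: 3 steps to exit the loop when n is 0
    # (the ILOAD 0, the IFEQ that jumps to the exit, and the ILOAD 1 there).
    if n <= 0:
        return 3
    # Otherwise: 11 steps for one trip around the loop,
    # plus the clock for the remaining iterations.
    return 11 + loop_clk(n - 1)

def clk(n):
    # 2 initialization steps (ICONST 1, ISTORE 1) before the loop.
    return 2 + loop_clk(n)
```

For example, clk(0) is 5 and clk(5) is 60, so running the factorial program on n = 5 takes exactly 60 machine steps.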
Of course, from correctness-of-*pi* we can easily derive weaker but
simpler results to “advertise.”
(defthm corollary-of-correctness-of-*pi*
(implies (and (natp n)
(equal s_init (make-state 0 (list n ans) stack *pi*))
(equal s_fin (m1 s_init (clk n))))
(and (equal (top (stack s_fin)) (fact n))
(equal (next-inst s_fin) '(HALT)))))
Viewing these results from a higher level, one might argue that the
corollary above is a “better” theorem than
correctness-of-*pi*. It just says, simply, if you have an initial state
poised to execute *pi* on n and you run it (clk n) steps to
some “final” state, that final state has (fact n) on top of
the stack and is halted. The full correctness result we proved says some
“irrelevant” things: in the final state, n is 0, (fact
n) is also in local 1, and the original stack is intact under the
answer.
But those facts are not irrelevant! They (or something like them) are
mathematically necessary for the inductive proof. Remember, inductive proofs
only work on sufficiently strong theorems. Furthermore, we really must care
about these other issues even from the high level perspective. If, for
example, our program is just a subroutine of a larger system (on a machine
that supports procedure call and return) we really need to know that
the stack is not disturbed since the results of prior computations may be
sitting in it for use for future computations. In a setting where we're
concerned about privacy and security (in a machine with addressable memory)
we really need to know that information hasn't been moved from private
space to public space. In a setting where programs are in writable memory,
we really need to know that our program hasn't been overwritten by
some subroutine.
More M1 Programs and Proofs
The proofs described above can be seen in the ACL2 book
books/models/jvm/m1/fact.lisp. If the proof script here is taken as a
template for all such proofs you'll find it a bit “fragile” in
the sense that lemmas proved in one step of the template may get in the way
of proofs in subsequent steps. That is generally dealt with by disabling
certain lemmas when their work has been done. But, except for this issue of
controlling rewriting a little more restrictively via disabling, this proof
of the correctness of *pi* serves as a good template for clock-based
total correctness proofs of single-loop m1 programs.
There are many other examples of proofs of m1 programs on the
directory books/models/jvm/m1. Here is a brief guide.
More Elaborate Models of the JVM: From M1 to M6
M1 is only the start of a series of models of the JVM. M1 lacks
a bytecode verifier, method invocation, threads, and other features
which are explored in the more elaborate models, all of which are
found on the ACL2 directory books/models/jvm/.
- The ACL2 directory books/models/jvm/guard-verified-m1/ is another
version of m1 that differs in two ways. First, instead of using the
constructor style of creating new states (e.g., with (list pc locals stack
program)) it uses a single-threaded object (see stobj). This
slightly changes the forms of correctness theorems and lemmas. Second, the
definition of m1 on the books/models/jvm/guard-verified-m1/
directory has been guard verified. This involves attaching a guard to every
“execute” function and lifting that up to the level of the
m1 function. The guards include the notion of a “good
state,” which in turn includes the notion of a well-formed program. That
necessarily involves the requirement that, say, whenever an IADD
instruction is encountered the stack has at least two numbers on it. The
good-statep predicate is essentially a simple bytecode verifier
for m1: it tells us that if a state is good (meaning, the program in it
is good), then executing the program guarantees all the properties necessary
for error-free execution. Verifying the guards of this version of m1
demonstrates “verification of the bytecode verifier.” The
directory also contains the examples found in the unguarded version of
m1 on books/models/jvm/m1. The reason we provide two functionally
equivalent versions of m1 is that when teaching how to use ACL2 to
specify a machine operationally, it is easiest to start with the unguarded
m1.
- M2, on the ACL2 directory books/models/jvm/m2/, is essentially M1
with method invocation (both static and virtual), return, and class instances
including a demonstration of inheritance. See bib::moore99a. In that
paper the machine is named “tjvm” for “Toy JVM” but
its proper name is M2. The paper and ACL2 scripts illustrate how to reason
about procedure call and return.
- M5, on the ACL2 directory books/models/jvm/m5/, is a closer
approximation to the JVM. M5 was written by George Porter as an
undergraduate project. M5 supports 195 JVM instructions, including those for
several kinds of arithmetic, arrays, objects, threads, and monitors. The
handling of integer arithmetic is accurate, e.g., all integers are bounded
and addition, say, wraps around. Floating point is not accurate and we
regard it as a mere placeholder. The directory contains proofs of properties
for several M5 programs, including iterative and recursive ones (dealt with
via clock-based proofs), some partial correctness proofs about programs that
do not always terminate (dealt with via the inductive assertion method), and
the correctness of an applicative insertion sort method (on Objects
representing cons trees). See bib::mp02 and bib::moore03a. A
proof of a mutual-exclusion property, called “the Apprentice
Challenge,” is described in the papers and the corresponding books
containing proofs are available on the m5/ directory.
- M6, on the ACL2 directory books/models/jvm/m6/, is the best
approximation to the JVM. M6 was written as part of Hanbing Liu's PhD
dissertation, supported by Sun Microsystems. The M6 state includes an
“external class table” where classes reside until they are
loaded. M6 can execute most J2ME Java programs (except those with
significant I/O or floating-point). It was primarily designed to model
dynamic class loading, class initialization, and bytecode verification. The
entire Sun CLDC API library (672 methods in 87 classes) was translated into
the external representation and is available for loading, constituting about
500 pages of data. The model included 21 out of 41 native APIs that appeared
in Sun's CLDC API library. The M6 description itself is about 160 pages of
ACL2. See bib::liu06. Warning: M6 is not yet in the
regression!
Quick Index to Related Topics