M1 package.
ILOAD, leads us to write

(defun execute-ILOAD (inst s)
  (make-state (+ 1 (pc s))
              (locals s)
              (push (nth (arg1 inst) (locals s))
                    (stack s))
              (program s)))

Define the other seven instructions and come prepared on Monday to present them at the board.
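One way to check a definition like execute-ILOAD is to evaluate it on a small concrete state. The following is only a sketch: it assumes the M1 book has been loaded as described elsewhere on this page (so that make-state, pc, locals, and stack are available), that execute-ILOAD is defined as above, and that push adds its first argument to the front of the stack. The constant *iload-example* is a made-up name used for illustration.

```lisp
(include-book "m1/m1")   ; assumes m1.lisp is installed in the directory m1/
(in-package "M1")

;; A hypothetical test state: pc 0, locals (4 5), empty stack,
;; and a one-instruction program.
(defconst *iload-example*
  (make-state 0 (list 4 5) nil '((ILOAD 1))))

;; Execute (ILOAD 1) and collect the interesting components of the result.
;; Local 1 holds 5, so 5 should be pushed and the pc should advance to 1.
(let ((s (execute-ILOAD '(ILOAD 1) *iload-example*)))
  (list (pc s) (locals s) (stack s)))
```

Under those assumptions the last form should evaluate to (1 (4 5) (5)): the pc is incremented, the locals are unchanged, and the value of local 1 is on top of the stack.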
ILOAD, ICONST, IADD, ISUB, IMUL, ISTORE, and GOTO. Define the last instruction, IFEQ, and then answer the questions on pages 18 and 19 of the lecture. That will complete the definition of M1! I strongly urge you to build your own M1 and run it on some example programs.
(implies (and (natp x) (natp y))
         (equal (g x y 0) (* x y)))

where

(defun g (x y a)
  (if (zp x)
      a
      (g (- x 1) y (+ y a))))

The proof required two key steps: (a) generalization from (g x y 0) to (g x y a), and (b) induction. Here is an ACL2 script and the session log. You should be able to use the ACL2 system well enough to replay the script!
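The generalized statement is not quoted above. A natural candidate, offered here as an assumption rather than as the exact lemma in the linked script, is that g adds x*y to whatever the accumulator already holds. The induction then goes through because the induction hypothesis may be used with the accumulator y+a in place of a:

```latex
\begin{align*}
\text{Claim:}\quad & g(x, y, a) = a + x \cdot y
      \quad \text{for natural numbers } x, y, a. \\[4pt]
\text{Base } (x = 0):\quad & g(0, y, a) = a = a + 0 \cdot y. \\[4pt]
\text{Step } (x > 0):\quad & g(x, y, a) = g(x - 1,\; y,\; y + a)
      && \text{(definition of } g\text{)} \\
 & \phantom{g(x, y, a)} = (y + a) + (x - 1) \cdot y
      && \text{(induction hypothesis, accumulator } y + a\text{)} \\
 & \phantom{g(x, y, a)} = a + x \cdot y. \\[4pt]
\text{Instance } (a = 0):\quad & g(x, y, 0) = x \cdot y.
\end{align*}
```

The original statement with accumulator 0 cannot be proved by induction directly, since the recursive call changes the accumulator to y+0 and the hypothesis about accumulator 0 no longer applies.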
g above. Specifically, we defined

(defconst *g-program*
  '((ICONST 0)   ; 0
    (ISTORE 2)   ; 1  a = 0;
                 ; loop:
    (ILOAD 0)    ; 2
    (IFEQ 10)    ; 3  if x=0 then go to end;
    (ILOAD 0)    ; 4
    (ICONST 1)   ; 5
    (ISUB)       ; 6
    (ISTORE 0)   ; 7  x = x-1;
    (ILOAD 1)    ; 8
    (ILOAD 2)    ; 9
    (IADD)       ;10
    (ISTORE 2)   ;11  a = y+a;
    (GOTO -10)   ;12  go to loop
                 ; end:
    (ILOAD 2)    ;13
    (HALT)       ;14  ``return'' a (actually, halt with a on stack).
    ))

and tested it on several inputs. This program may be described as taking natural number inputs
x and y (in locals 0 and 1) and computing x*y by adding y, x times, to an accumulator a that is initially 0. The program ``returns'' the final value of a by leaving it on the stack and halting. Compare *g-program* to the JVM bytecode produced for the method g in the Java file Demo.java below:
class Demo {
  public static int g(int x, int y){
    int a = 0;
    while (x!=0) {x = x-1; a = y+a;}
    return a;
  }

  public static void main(String[] args){
    int x = Integer.parseInt(args[0], 10);
    int y = Integer.parseInt(args[1], 10);
    System.out.println(g(x,y));
    return;
  }
}

by compiling Demo.java and inspecting the bytecode with

javap -c Demo

To test *g-program* (1) follow the instructions here to get the full definition of M1 available to your ACL2, (2) start your ACL2 and do:
(include-book "m1/m1")  ; The first `m1/' indicates the directory where you installed m1.lisp.
(in-package "M1")       ; To get into the M1 package.

and finally (3), try something like:

(run (repeat 'tick 50)
     (make-state 0 (list 4 5) nil *g-program*))

to run the program on inputs 4 and 5. Inspect the final state and see if it meets your expectations. Try other inputs. I also recommend that you define and test other simple M1 programs, like factorial, sum the numbers below n, and perhaps one that indicates whether a natural number is even or odd by pushing 1 or 0 on the stack.
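To make "Try other inputs" less tedious, the run can be wrapped in a small helper. This is a sketch under stated assumptions: the M1 book is loaded, *g-program* is defined as shown on this page, and IFEQ and GOTO offsets are relative to the current pc (as the pc comments in *g-program* indicate). The name g-test is hypothetical. The tick count comes from counting instructions: 2 to initialize a, 11 per loop iteration, and 3 to leave the loop and push a, which brings the pc to the HALT at 14.

```lisp
(include-book "m1/m1")   ; assumes m1.lisp is installed in the directory m1/
(in-package "M1")

;; *g-program* is assumed to be defined already, as shown on this page.

;; Hypothetical helper: run *g-program* on x and y for 5 + 11*x ticks
;; and return the final pc together with the top of the stack.
(defun g-test (x y)
  (let ((s (run (repeat 'tick (+ 5 (* 11 x)))
                (make-state 0 (list x y) nil *g-program*))))
    (list (pc s) (top (stack s)))))

(g-test 4 5)
```

With inputs 4 and 5 this should return (14 20): the machine is sitting at the HALT instruction with 4*5 on top of the stack. The 50 ticks used above are one more than the 49 this count gives, which is harmless if HALT leaves the state unchanged.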
*g-program*. It is written as a 7-step template. I recommend that you (a) replay this template for *g-program* in your ACL2 and then modify it repeatedly to verify some other simple M1 programs
s'' and ``the indices of the variables (possibly) written by the program in s.'' The latter example is answered perfectly by:

(all-written-locals (program s)),

where

(defun all-written-locals (program)
  (cond ((endp program) nil)
        ((equal (op-code (car program)) 'ISTORE)
         (cons (arg1 (car program))
               (all-written-locals (cdr program))))
        (t (all-written-locals (cdr program)))))
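As a concrete illustration of all-written-locals, the sketch below applies it to the instruction list of *g-program*. It assumes the M1 book is loaded so that op-code and arg1 are available; the definition is repeated only so the example can be pasted into a fresh session.

```lisp
(include-book "m1/m1")   ; assumes m1.lisp is installed in the directory m1/
(in-package "M1")

;; Collect the index of every local that some ISTORE in the program writes.
(defun all-written-locals (program)
  (cond ((endp program) nil)
        ((equal (op-code (car program)) 'ISTORE)
         (cons (arg1 (car program))
               (all-written-locals (cdr program))))
        (t (all-written-locals (cdr program)))))

;; The instructions of *g-program*, written out literally.
(all-written-locals
 '((ICONST 0) (ISTORE 2) (ILOAD 0) (IFEQ 10) (ILOAD 0) (ICONST 1) (ISUB)
   (ISTORE 0) (ILOAD 1) (ILOAD 2) (IADD) (ISTORE 2) (GOTO -10) (ILOAD 2)
   (HALT)))
```

The result should be (2 0 2), one entry per ISTORE in program order. Duplicates are kept, and the answer is ``possibly written'' because the function looks only at the program text, not at which instructions a particular run actually reaches.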
helper-is-theta and fn-is-theta theorems) from the template.
I also noted that the midterm handout will include m1.lisp, m1-support.lisp, and template.lisp. The test will be open notes, but I urge you not to cut down a lot of trees! The handouts included above are almost certain to contain all the examples you will need.
We then looked at some instructions in The Java Virtual Machine Specification and worked out how their analogues could be formalized in the context of M1, specifically IF<cond>, IF_ICMP<cond>, DUP, DUP_X1, SWAP, JSR, and RET.
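For instance, DUP and SWAP only rearrange the operand stack, so their M1 analogues can follow the same pattern as execute-ILOAD. The definitions below are a sketch of one possible formalization, not necessarily the one worked out in class. They assume the M1 book is loaded and that push, top, and pop behave as the usual stack operations; they are not wired into M1's instruction dispatch.

```lisp
(include-book "m1/m1")   ; assumes m1.lisp is installed in the directory m1/
(in-package "M1")

;; DUP: push a second copy of the top item of the stack.
(defun execute-DUP (inst s)
  (declare (ignore inst))
  (make-state (+ 1 (pc s))
              (locals s)
              (push (top (stack s)) (stack s))
              (program s)))

;; SWAP: exchange the top two items of the stack.
(defun execute-SWAP (inst s)
  (declare (ignore inst))
  (make-state (+ 1 (pc s))
              (locals s)
              (push (top (pop (stack s)))
                    (push (top (stack s))
                          (pop (pop (stack s)))))
              (program s)))

;; Try both on a stack with 1 on top of 2.
(list (stack (execute-DUP  '(DUP)  (make-state 0 nil (push 1 (push 2 nil)) nil)))
      (stack (execute-SWAP '(SWAP) (make-state 0 nil (push 1 (push 2 nil)) nil))))
```

If push conses onto the front of the stack, the last form should return ((1 1 2) (2 1)). To make the instructions usable in programs, the function that dispatches on the op-code would also need a case for each.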
Finally, I briefly talked about stack maps but we'll cover them in more detail in the future. They will not be on the exam, but the algorithm for computing them requires the kind of formalization skills previously mentioned.
IFEQ in it (so the schedule is more complicated) and there are two distinct exit points (so the specification of the final pc is more complicated). But the moral is simple: use IF to say what is true. Remember the midterm is this coming Wednesday, Mar 7.
See the description of the midterm in the bullet above.
By way of comparison, M1, which we have been studying until now, is less than 4 pages of ACL2 code. M5 is about 28 pages of ACL2 code. It will allow us to study the formalization of the features mentioned above while still being something you can master in the time remaining. Our most accurate model of the JVM, M6, is 160 pages of ACL2 code plus about 500 pages of built-in classes from the Java API.
I have posted M5 at m5/README.html. I urge you to certify those books on your local version of ACL2, then:
(include-book "m5/m5")
(in-package "M5")
(defconst *s*
  '(STATE
    (:TT ((THREAD (:ID 0)
                  (:CS ((FRAME (:PC 0)
                               (:LOCS NIL)
                               (:STK (5))
                               (:MLOC ("Math" "main" 2)))))
                  (:STAT ACTIVE)
                  (:REF NIL))))
    (:HP NIL)
    (:CT ((CLASS (:NAME "Object")
                 (:SUPERS NIL)
                 (:FIELDS ("monitor" "mcount" "wait-set"))
                 (:METHODS NIL))
          (CLASS (:NAME "Thread")
                 (:SUPERS ("Object"))
                 (:FIELDS NIL)
                 (:METHODS ((METHOD (:NAME "run") (:FORMALS NIL) (:SYNC NIL)
                                    (:CODE ((RETURN)))
                                    (:XTBL NIL))
                            (METHOD (:NAME "start") (:FORMALS NIL) (:SYNC NIL)
                                    (:CODE NIL)
                                    (:XTBL NIL))
                            (METHOD (:NAME "stop") (:FORMALS NIL) (:SYNC NIL)
                                    (:CODE NIL)
                                    (:XTBL NIL)))))
          (CLASS (:NAME "Math")
                 (:SUPERS ("Object"))
                 (:FIELDS NIL)
                 (:METHODS ((METHOD (:NAME "fact") (:FORMALS (N)) (:SYNC NIL)
                                    (:CODE ((LOAD 0)
                                            (IFEQ 8)
                                            (LOAD 0)
                                            (LOAD 0)
                                            (CONST -1)
                                            (ADD)
                                            (INVOKESTATIC ("Math" "fact" 1))
                                            (MUL)
                                            (XRETURN)
                                            (CONST 1)
                                            (XRETURN)))
                                    (:XTBL NIL))
                            (METHOD (:NAME "app") (:FORMALS (X Y)) (:SYNC NIL)
                                    (:CODE ((LOAD 0)
                                            (LOAD 1)
                                            (INVOKESTATIC ("Math" "fact" 1))
                                            (ADD)
                                            (XRETURN)))
                                    (:XTBL NIL))
                            (METHOD (:NAME "main") (:FORMALS NIL) (:SYNC NIL)
                                    (:CODE ((INVOKESTATIC ("Math" "fact" 1))
                                            (INVOKESTATIC ("Math" "app" 2))))
                                    (:XTBL NIL)))))))))

and then practice accessing pieces of this state by choosing a piece to target and then evaluating an expression that is supposed to return it. For example, the ``mloc of the top frame on the call stack of thread 0'' is computed by either

(mloc (top (cs (find :id 0 (tt *s*)))))

or

(get :mloc (top (get :cs (find :id 0 (tt *s*)))))

both of which return

("Math" "main" 2).

Make up examples of your own and test your thinking.
INVOKESTATIC and xRETURN. The example I used and the commands to ``snapshot'' certain intermediate states are here.
I recommend that you think of a simple, recursive arithmetic function (e.g., Fibonacci, sum of squares, greatest common divisor, etc.) and code it up as a recursive method in M5 and then snapshot its execution to cement your understanding of the basic call and return mechanism. Then in your local model of M5 you might change the notion of method resolution so that the method invoked is influenced by the number of arguments in the descriptor provided to INVOKESTATIC. That is, instead of finding a method of the given name in the super class chain of the given class, find the method of a given name and input arity in the super class chain. Then code an example class with two methods with the same name and demonstrate that your semantics resolves to the ``right'' method.
NEW, GETFIELD, and PUTFIELD.
The example I used and the commands to snapshot certain intermediate states are here. However, I modified the script from that used in the lecture so that (a) I popped the extraneous 1 off the stack after demonstrating the behavior of GETFIELD and (b) I eliminated certain intermediate states in the snapshots, just showing the before and after shots around the relevant NEW, PUTFIELD, and GETFIELD instructions.
INVOKEVIRTUAL and INVOKESPECIAL.
The M5 demonstration I gave is here. The comments in the
demo mention two Java files, Point.java and
ColoredPoint.java that, when compiled with
javac, illustrate actual JVM bytecode similar to (but not identical to) the
M5 demo. At the end of the lecture script is a script that answers the
challenge raised last time: how do you build a circular Object of n
linked Objects on M5?
*r* of lecture16.lisp. Then I presented a puzzle in which two threads are endlessly competing for a single resource, a counter initially 1, where each thread reads the counter twice, adds the results, and stores it back to the global counter. The question is whether you can invent a schedule that will make this system, called *s* in lecture16.lisp, produce any given natural number n by defining (schedule n) appropriately.
I urge you to think about this -- and not to search for the answer online! Furthermore, when you think about it, don't think about M5 state *s* but think about the problem in a more abstract setting. If you think you know how to solve it, formalize in ACL2 a simple model of this problem (from scratch, not involving M5) and test your solution there. It is almost always pointless to analyze actual code, even for a machine as simple as M5, before you've analyzed an appropriate abstraction. Indeed, harking back to our M1 code proofs: first you prove the algorithm correct, then you prove that the M1 code implements the algorithm. So for this example the lesson is: first you produce an abstract model of the puzzle and solve it, then you implement that solution in M5 bytecode.
MONITORENTER and MONITOREXIT. The M5 demo I gave is lecture17.lisp. I also discussed the Apprentice Challenge, which involves proving that a certain trivial Java program achieves mutual exclusion but exposes a delicacy: one must not change the reference object of a thread once the thread has been started.
THROW and the exception table associated with every method. The M5 demo I gave is lecture18.lisp.
We will study a formal specification of the Java Virtual Machine (JVM). The JVM is a stack-based, object-oriented, type-safe bytecode (assembly language) interpreter on which compiled Java programs are executed.
But the focus of the course will be on teaching you how to formalize a comparably complicated computing artifact and how to subsequently use that formalization. That is, we will be more interested in formalization techniques than in the JVM specifically.
You will learn about four different things: how to make a mathematical model of a complicated digital artifact like the JVM, how to program in a simple functional language, how to reason about such models, and how to use a powerful automatic reasoning tool.
/p/bin/acl2. Most users install it on their laptops (see below). And most users either run it in an Emacs shell buffer or via an Eclipse (``ACL2 Sedan'') interface (see below).
Extra Credit: Extra credit will be given for projects presented at the end of the semester. Possibilities for projects will be discussed from time to time in class. If you have a project proposal, discuss it with me before you invest time in it. You may work with others on projects.
Upper-division standing is required for all CS378 classes.
If the question is “What do I have to know in order to do well in this course?” as opposed to “What are the university rules?” the answer is: mathematical logic, including induction, and some experience programming in some language, preferably Java. You should be able to use Eclipse or Emacs. Experience with Lisp or ACL2 is helpful but the subset we use is relatively small and will be taught (quickly).
See How to Use ACL2s to get started.
We'll approach the JVM model incrementally, starting with a very simple (suggestive but inaccurate) model. Then we will extend and revise it repeatedly toward a more accurate description of the JVM. We'll learn the necessary functional programming and proof techniques by building the simplest model. Most of the semester will be spent extending and exploring more elaborate models.
You will be expected to do much of the formalization work here and extra-credit project ideas may come out of our discussions. For example, good projects might include the formalization or elaboration of features not discussed in class or the mechanized proofs of some of the properties discussed.
We will adhere pretty closely to the following sequence of topics. But since many classes will be presentations by students in answer to questions raised by the instructor, the pace may vary somewhat.
All dates below are speculative.
Wed, Jan 18 | Introduction |
Mon, Jan 23 | building M1 -- functional programming in ACL2 |
Wed, Jan 25 | building M1 -- functional programming in ACL2 |
Mon, Jan 30 | building M1 -- functional programming in ACL2 |
Wed, Feb 1 | reasoning about M1 “by hand” |
Mon, Feb 6 | reasoning about M1 “by hand” |
Wed, Feb 8 | quick introduction to how ACL2's prover works |
Mon, Feb 13 | mechanized proofs about M1 |
Wed, Feb 15 | mechanized proofs about M1 |
Mon, Feb 20 | mechanized proofs about M1 |
Wed, Feb 22 | mechanized proofs about M1 |
Mon, Feb 27 | the class table, the heap, and threads |
Wed, Feb 29 | macros for managing an elaborate state |
Mon, Mar 5 | M5 — a fairly realistic JVM model |
Wed, Mar 7 | Midterm Test |
Mon, Mar 12 | Spring Break |
Wed, Mar 14 | Spring Break |
Mon, Mar 19 | object creation and manipulation |
Wed, Mar 21 | method resolution and invocation |
Mon, Mar 26 | threads and monitors |
Wed, Mar 28 | M5 |
Mon, Apr 2 | mechanized proofs about M5 |
Wed, Apr 4 | mechanized proofs about M5 |
Mon, Apr 9 | mechanized proofs about M5 |
Wed, Apr 11 | mechanized proofs about M5 |
Mon, Apr 16 | extending M5 |
Wed, Apr 18 | extending M5 |
Mon, Apr 23 | extending M5 |
Wed, Apr 25 | M6—an accurate JVM model |
Mon, Apr 30 | M6—an accurate JVM model |
Wed, May 2 | Last Test |
A mathematical logic is a formal system consisting of a precisely defined syntax, some axioms, and some rules of inference. The axioms are just formulas in the syntax — formulas that are taken to be ``always true.'' The rules of inference are formula transformers that preserve truth. A theorem is a formula that can be derived from the axioms by applying the rules of inference. A theorem is thus ``always true.'' By modeling a computing system in a mathematical logic we can prove theorems about it to establish its properties.
You studied formal mathematical logic in CS313K and in CS336. There you learned propositional calculus as a formal system. You also learned first order predicate calculus. You might have also learned set theory. So which mathematical logic do we use to describe the Java Virtual Machine?
The mathematical logic we use is a functional programming language, Pure Lisp. If you know anything at all about Lisp, you probably think of it as merely a programming language. But we cast it as a logic, with a precisely given syntax, some axioms, and some rules of inference. We will prove theorems in Lisp.
Put another way, in this course you will come to understand the JVM by studying a model of the Java Virtual Machine written in a functional programming language.
We will cover representatives of most of the JVM byte codes, including IADD, ILOAD, ISTORE, IFGT, GOTO, NEW, PUTFIELD, INVOKEVIRTUAL, and MONITORENTER. We will not cover the entire JVM; for example, we will not deal with the details of arithmetic, arrays, class loading, or native methods. However, by the end of this course you will be able to write formal specifications of many of the omitted parts.
We will discuss the Java bytecode verifier; in particular, we will investigate its specification: what properties should it have?
The logic we use is supported by a mechanical theorem prover, ACL2. This theorem prover is in use in industry to verify properties of hardware, microcode, and software. In fact, its authors won the 2005 ACM Software System Award for the lasting influence their theorem provers have had on computer science.
This course is an unusual mixture of many CS courses. It is like CS307 in that we will be dealing with the Java programming language. It is like CS310 in that we will be looking at an assembly level language. It is like CS352 in that we will be considering the architectural features of the processor. It is like CS372 in that we will be considering process management, memory management, protection, thread scheduling, and concurrency. It is like CS313K in that we will be dealing with a formal logic. It is like CS336 in that we will be formally modeling and proving theorems about our programs and algorithms. It is like parts of CS343 in that we will be discussing mechanized reasoning.
Religious Holy Days: A student who is absent from an examination or cannot meet an assignment deadline due to the observance of a religious holy day may take the examination on an alternate day, submit the assignment up to 24 hours late without penalty, or be excused from the examination or assignment, if proper notice of the planned absence has been given. Notice must be given at least fourteen days prior to the classes scheduled on dates the student will be absent. For religious holy days that fall within the first two weeks of the semester, notice should be given on the first day of the semester. It must be personally delivered to the instructor and signed and dated by the instructor, or sent via certified mail, return receipt requested. Email notification will be accepted if received, but a student submitting such notification must receive email confirmation from the instructor. A student who fails to complete missed work within the time allowed will be subject to the normal academic penalties.
Disability Related Needs: Please notify me of any modification/adaptation you may require to accommodate a disability-related need. You will be requested to provide documentation to the Office of the Dean of Students in order that the most appropriate accommodations can be determined. Specialized services are available on campus through Services for Students with Disabilities, SSB 4th floor, A5800, 471-6259, TTY 471-4641
Emergencies and Illness: Documented emergencies and illnesses will be dealt with by the instructor. For best results, communicate with me before you miss a midterm or the final and be prepared to supply written, verifiable evidence of the condition.
Code of Conduct: For other important advice about expectations and conduct, see The Computer Sciences Department Rules to Live By.