Search-engine friendly clone of the ACL2 documentation.

Operational-semantics-3__annotated-bibliography

Bib::liu06

H. Liu, Formal Specification and Verification of a JVM and its Bytecode Verifier, University of Texas at Austin, Ph.D. dissertation, 2006

Relevance: most complete ACL2 model of the JVM

———

The source files for this effort have been updated in October, 2024 (because of changes to ACL2 since 2006) and now reside in the community-books, in directory books/models/jvm/m6/.

This dissertation, available at the link above, describes the “M6” model of the JVM and some important proofs about it. It was the most complete model constructed in ACL2. It executed most J2ME Java programs (except those with significant I/O or floating-point). It supported

all data types (except floats),
multi-threading,
dynamic class loading,
class initialization, and
synchronization via monitors.

The M6 state includes an “external class table” where classes reside until they are loaded. The entire Sun CLDC API library (672 methods in 87 classes) are translated into the external representation and are available for loading, constituting about 500 pages of data. The model includes 21 out of 41 native APIs that appeared in Sun's CLDC API library. The M6 description itself is about 160 pages of ACL2.

It was created with support from Sun Microsystems.

On page 322, Liu describes the size of the proof effort: “We have defined 2104 ACL2 functions. We proved over 4764 theorems. 196 of them are “skip-proofed”. To prove the rest of them, 971 inductions are necessary. The total line count of the ACL2 input is 136000. They are organized in 282 files. The dependency graph between the 282 ACL2 books has 2012 edges.”

Note from J Moore, dissertation supervisor: The skip-proofs tag on a “theorem” means the conjecture is assumed rather than proved. So having even one skip-proofs makes it impossible to fully trust the results. However, the skipped conjectures in this work are relatively low-level results very similar to other theorems already proved. Liu and I agreed that these conjectures were very probably provable and we were more interested in the higher-level conclusions drawn from those assumptions.

An informal summary of the dissertation, by Liu, states

Will a bytecode-verified Java program run safely? Why? What do we mean by “safely”?

This dissertation sets out to address these questions by first building a precise model of the JVM and its bytecode verifier.

The models are written in ACL2, for which we have computer-aided mechanical reasoning support.

We studied how static type checking during bytecode verification can provide runtime type safety. We did this by examining the JVM type hierarchy and proving that safety conditions on operations enforced by the bytecode verifier can guarantee runtime type safety of the program.

We also studied how the JVM type hierarchy is extended in a safe way by the JVM class loader that introduces new types into the environment. We proved a set of theorems stating that the class loading process preserves the consistency of the JVM type hierarchy.

We studied a third aspect of the JVM safety. We focused on the safety across method invocation and return. We studied why the bytecode verifier can examine each method individually and provide a safety guarantee about the runtime execution of a program. We note that the execution of a program is characterized by a stack of partially executed methods and a current executing method.

We used the full JVM model in ACL2 to study the first two aspects of JVM safety. We proved a set of theorems that stated the effectiveness of the bytecode verification about individual operations and safety of the JVM class loading.

We used a simplified JVM model, which we call the small machine, to study the third aspect of the JVM safety. The small machine has only one data type. It has branching instructions and method invocation and return instructions. We proved that any small program verified by the corresponding simplified JVM bytecode verifier can execute safely without overflowing or underflowing the operand stacks.