Modeling nondeterminism inherent in y86 system calls Matt Kaufmann ACL2 Seminar April 23, 2013 Outline: I. Introduction II. Syscall-write in the logic and in raw Lisp III. Oracle implementation IV. Logical story ------------------------------------------------------------ I. Introduction ------------------------------------------------------------ In this talk I present a way to model the nondeterminism inherent in system calls. Thanks to Shilpi Goel for helping me to talk this through. It might help the reader to be familiar with Soumava Ghosh's presentation made just before this one in the ACL2 seminar. I'll focus on syscall-write (but similar considerations apply to syscall-open, syscall-close, and syscall-read). Here is a naive approach: (syscall-write filedes buffer nbytes) But this isn't a function! Problem: How can we model syscall-write so that it's a function? My first approach, when Soumava wanted help modeling nondeterminism, was to introduce (syscall-write clk filedes buffer nbytes) and then redefine it in raw Lisp. I used a trusted clause processor to explain the evaluation results; I could talk about that sometime if there's interest, but since I abandoned that approach I won't say much about it now. The point was to introduce a clock, clk, with the expectation that each call is made with a different value of clk, which might be extracted from the y86 state. But we could get different values from the same clk in different sessions. That's especially problematic if we can prove in two different sessions that the same call of syscall-write yields two different values, since that could allow us to certify books with contradictory theorems, and then include them both to prove nil! My solution was to store a structure in the y86 state that is richer than a clock: a so-called oracle. The oracle is intended to hold all results of syscall-write calls (and other syscalls) starting with that y86 state. That's fine for reasoning, but we don't actually want to build an oracle when running the model on concrete data. Thus, we want a logical story that justifies whatever is actually returned by a raw-Lisp version of syscall-write. ------------------------------------------------------------ II. Syscall-write in the logic and in raw Lisp ------------------------------------------------------------ Let's start top-down. Here are the definitions for syscall-write. I'll talk about pop-x86-32-oracle later below, but the idea is simple: it uses the oracle to get the next value for a syscall operation. ; Include the y86 model and introduce a trust tag (to support ; redefinition in raw Lisp, below). (include-book "y86-basic-new/y86/y86") (defttag :syscall-raw-impl) ; Here is the simple logical definition, which gets the result by ; popping the oracle. (defun syscall-write-logic (filedes buffer nbytes x86-32) (declare (xargs :stobjs x86-32) (ignore filedes buffer nbytes)) (mv-let (val x86-32) (pop-x86-32-oracle x86-32) (mv (list (ifix val)) x86-32))) ; The following wrapper is logically defined in terms of the logical ; definition, but is defined differently in raw Lisp (see below). (defun syscall-write (filedes buffer nbytes x86-32) (declare (xargs :stobjs x86-32 :guard (and (i32p filedes) (n32p nbytes))) (type string buffer)) (syscall-write-logic filedes buffer nbytes x86-32)) ; Next we overwrite the above definition in raw Lisp, using a ; definition based on the one originally provided by Soumava. I added ; the mv returning the stobj (whose oracle has logically changed), as ; well as an early exit using the logical definition except when we ; have a true "live" stobj whose oracle is of a special, "hidden" sort ; (described in Section III). (include-book "tools/include-raw" :dir :system) (include-raw "systemcalls-raw.lsp") ; From systemcalls-raw.lsp: (defun syscall-write (filedes buffer nbytes x86-32) (when (not (and (live-stobjp x86-32) (x86-32-oracle-hidden-p x86-32))) (return-from syscall-write (syscall-write-logic filedes buffer nbytes x86-32))) (setq ptr (ccl::make-cstring buffer)) (setq ret (ccl::external-call "syscall" :signed-int 1 :signed-int filedes :address ptr :unsigned-int nbytes :signed-int)) (ccl::dispose-heap-ivector ptr) (mv (cons ret nil) x86-32)) Here is a little demo. ACL2 !>(include-book "systemcalls") [[ ... output elided ... ]] "/v/filer4b/v41q001/kaufmann/projects/x86/svn-y86-syscall/matt/systemcalls.lisp" ACL2 !>(update-x86-32-oracle-from-state x86-32 state) ; explained in Part IV ( ) ACL2 !>(syscall-write 1 ; stdout "hello" ; buffer, a string (1+ (length "hello")) ; might not need 1+ x86-32 ) hello((6) ) ACL2 !>(mv-let (result x86-32) (syscall-write 1 ; stdout "hello" ; buffer, a string (1+ (length "hello")) ; might not need 1+ x86-32) (mv (equal '(6) (prog2$ (cw "~%~%some newlines~%~%") result)) x86-32)) hello some newlines (T ) ACL2 !>; The following had better fail! (thm (mv-let (result x86-32) (syscall-write 1 ; stdout "hello" ; buffer, a string (1+ (length "hello")) ; might not need 1+ x86-32) (declare (ignore x86-32)) (equal '(6) result))) [[ ... output elided ... ]] ******** FAILED ******** ACL2 !> The reason it's important for the THM form above to fail is that different calls of syscall-write might execute to return different values. In fact, there is no way for a "live" x86-32 stobj with a "hidden" oracle to be encountered in a term processed by the prover. We'll return to this point below. ------------------------------------------------------------ III. Oracle implementation ------------------------------------------------------------ I'll start by describing modifications I made to: /projects/acl2/devel/books/models/y86/y86-basic/common/x86-state.lisp First, I added two new fields in (defstobj x86-32 ...): x86-32-oracle x86-32-oracle-hidden-p I don't want to provide direct access to the oracle when the `hidden' bit is set, I introduce new renamings in the same stobj definition: (x86-32-oracle unsafe-x86-32-oracle) (update-x86-32-oracle unsafe-update-x86-32-oracle) Then we can define x86-32-oracle as follows, to prevent reading of the oracle field when the hidden bit is set. (defun x86-32-oracle (x86-32) (declare (xargs :stobjs x86-32)) (prog2$ (and (x86-32-oracle-hidden-p x86-32) (er hard? 'x86-32-oracle "Attempt to apply ~x0 to an x86-32 stobj with a hidden ~ oracle!" 'x86-32-oracle)) (unsafe-x86-32-oracle x86-32))) And similarly we define a safe updater, because we don't want to be able to modify the oracle in a way that won't be reflected by our system calls. (defun update-x86-32-oracle (new-oracle x86-32) (declare (xargs :stobjs x86-32)) (prog2$ (and (x86-32-oracle-hidden-p x86-32) (er hard? 'update-x86-32-oracle "Attempt to apply ~x0 to an x86-32 stobj with a hidden ~ oracle!" 'update-x86-32-oracle)) (unsafe-update-x86-32-oracle new-oracle x86-32))) We then remove access to the unsafe versions of the oracle accessor and updater. (push-untouchable unsafe-x86-32-oracle t) (push-untouchable unsafe-update-x86-32-oracle t) We may now define the operation for obtaining a value from the oracle. For convenience, we treat the oracle as an association list, where each entry associates an eip (program counter) with a list of values. Shilpi and I believe that this structure will make it easier to reason about oracles during proofs than if the oracle were just a flat list of values. We use functions safe-assoc and safe-put-assoc that are variants of assoc whose guards do not require the oracle to be an association list. (defun pop-x86-32-oracle (x86-32) (declare (xargs :stobjs x86-32)) (let* ((oracle (x86-32-oracle x86-32)) (oracle (if (consp oracle) oracle nil)) (eip (eip x86-32)) (vals (safe-assoc eip oracle)) (val (car vals)) (x86-32 (update-x86-32-oracle (safe-put-assoc eip (cdr vals) oracle) x86-32))) (mv val x86-32))) Recall that the oracle is not actually used for execution when we have a "live" y86 state with a "hidden" oracle. How do we make the oracle "hidden"? We get it by reading, just once, the state's oracle! (defund update-x86-32-oracle-from-state (x86-32 state) (declare (xargs :stobjs (x86-32 state))) (mv-let (erp val state) (read-acl2-oracle state) (let* ((x86-32 (update-x86-32-oracle (and (null erp) val) x86-32)) (x86-32 (update-x86-32-oracle-hidden-p t x86-32))) (mv x86-32 state)))) Then we make sure that the user can't unhide an oracle (which isn't really there in the implementation), as follows. (push-untouchable update-x86-32-oracle-hidden-p t) At this point, the only way to make a "hidden" oracle is by calling update-x86-32-oracle-from-state on a "live" state. We might use with-local-stobj to create a live x86-32 during a proof, but there is no way (without misusing trust tags) to create a live state that occurs in a term processed by the prover. ------------------------------------------------------------ IV. Logical story ------------------------------------------------------------ To review: we have introduced an oracle field of the y86 state that logically contains all the values returned by system calls. Note that there is no trust tag needed for the y86 model. Thus, the logical story is simple and clean. Well, except: how do we explain execution results that use the raw-Lisp code for syscall-write? Normally, when we get an evaluation result in the top-level ACL2 loop, we expect that the input is provably equal to the output. Consider the following, for example. ACL2 !>(+ 3 4) 7 ACL2 !> We then expect that the input and output are provably equal, and in our example above, that is verified by ACL2's prover: ACL2 !>(thm (equal (+ 3 4) 7)) Q.E.D. Summary Form: ( THM ...) Rules: NIL Time: 0.00 seconds (prove: 0.00, print: 0.00, other: 0.00) Prover steps counted: 1 Proof succeeded. ACL2 !> But when we run the y86 in the loop and get a result that depends on system calls, then in what sense do we have a theorem? We have a theorem with respect an input oracle that happens to have all the correct system call results ready to be plucked off. That oracle is, logically, the car of the oracle of the ACL2 state. The beauty part is that we can just assert that the ACL2 state's oracle happened to have that value at the time we consulted its oracle! Soumava Ghosh has pointed out that this approach gives us an interpreter that can allow behaviors that don't occur in practice, such as reading a file that hasn't been opened. I've discussed this issue a bit with him and Shilpi. I think we agreed that it may not be necessary to restrict the model to rule out such behaviors. But if that becomes necessary, it will probably be possible to modify the oracle, the interpreter, or both, to implement such restrictions. For example, we might add a field to the y86 state, or else add a "kernel" argument to the interpreter, to associate file descriptors with their statuses (e.g., associating 0 [stdin] with '(read), and associating [stdout] and stderr with, say '(append)). Or perhaps instead, the oracle might associate clocks with system call results obtained at those clocks, and a hypotheses could check that calls don't violate the access-after-open protocol.