Translate an effective address to a canonical linear address.
This translation is illustrated in Intel manual, Mar'17, Vol. 3A, Fig. 3-5, as well in AMD mamual, Oct'2013, Vol. 1, Fig. 2-3 and 2-4. In addition to the effective address, this function takes as input (the index of) a segment register, whose corresponding segment selector, with the effective address, forms the logical address that is turned into the linear address.
This translation is used:
when fetching instructions,
in which case the effective address is in RIP, EIP, or IP;
when accessing the stack implicitly,
in which case the effective address is in RSP, ESP, or SP;
and when accessing data explicitly,
in which case the effective address is calculated by instructions
via x86-effective-addr.
In the formal model,
RIP contains a signed 48-bit integer,
RSP contains a signed 64-bit integer,
and x86-effective-addr returns a signed 64-bit integer:
thus, the guard of this function requires
This function also takes as input the number of bytes
that have to be read or written starting from the effective address;
that is, the chunk of bytes to be accessed
goes from
You also can't ``wrap around'' the end of segment like before. Earlier x86 chips with their 64KB segments allowed pointers to roll over from FFFF to 0000 automatically. Programs that accidentally (or purposely) ran off the end of a segment started over at the beginning. Programs that run off the end of a segment now are greeted with a segmentation fault.So for now we make the stricter check by calculating
In 32-bit mode,
the effective address is added to the base address of the segment;
the result is truncated to 32 bits,
because the addition should be modulo 2^32,
given that the result must be 32 bits
(cf. aforementioned figures in Intel and AMD manuals).
In 64-bit mode,
the addition of the base address of the segment is performed
only if the segments are in registers FS and GS;
the result is truncated to 64 bits,
because the addition should be modulo 2^64,
given that the result must be 32 bits
(cf. aforementioned figures in Intel and AMD manuals);
since in our model addresses are signed,
we use
If the translation is successful,
this function returns a signed 64-bit integer
that represents a canonical linear address.
In 64-bit mode, when the segment register is not FS or GS, the effective
address is returned unmodified as a linear address, because segment
translation should be a no-op in this case; otherwise, the effective address
is translated to a linear address. In both cases, this linear address is
checked to be canonical; if not, an error flag is returned, otherwise, a
canonical linear address is returned.
In 32-bit mode, the 32-bit linear address that results from the translation
is always canonical.
If the translation fails,
including the check that the linear address is canonical,
a non-
As explained in Intel manual, May'18, Volume 3, Sections 3.4.2 and 5.4.1, a null segment selector can be loaded into DS, ES, FS, and GS, but then a memory access through these segment registers causes a #GP. According to AMD manual, Dec'17, Volume 2, Section 4.5.1, a null segment selector has SI = TI = 0, but no explicit constraint is stated on RPL; Intel manual, May'18, Volume 2, POP specification says that a null segment selector has a value from 0 to 3, from which we infer that a null segment selector may have a non-zero RPL. In this function, we return an error if the visible portion of the segment register is 0-3, and the segment register is not CS or SS. Loading a null segment selector into CS and SS is not allowed, so it is a state invariant that CS and SS do not contain null segment selectors. The null segment selector check is skipped in 64-bit mode (see Intel manual, May'18, Volume 3, Section 5.4.1.1).
Function:
(defun ea-to-la$inline (proc-mode eff-addr seg-reg nbytes x86) (declare (xargs :stobjs (x86))) (declare (type (integer 0 4) proc-mode) (type (signed-byte 64) eff-addr) (type (integer 0 5) seg-reg) (type (member 1 2 4 6 8 10 16 32 64) nbytes)) (case proc-mode (0 (b* ((lin-addr (if (or (eql seg-reg 4) (eql seg-reg 5)) (b* (((mv base & &) (segment-base-and-bounds 0 seg-reg x86)) (lin-addr (i64 (+ base (n64 eff-addr))))) lin-addr) eff-addr)) ((unless (canonical-address-p lin-addr)) (mv (list :non-canonical-address lin-addr) 0))) (mv nil lin-addr))) (1 (b* (((when (and (not (= seg-reg *cs*)) (not (= seg-reg *ss*)) (< (seg-visiblei seg-reg x86) 4))) (mv (list :null-segment-selector seg-reg) 0)) ((mv (the (unsigned-byte 32) base) (the (unsigned-byte 33) lower-bound) (the (unsigned-byte 32) upper-bound)) (segment-base-and-bounds proc-mode seg-reg x86)) (first-addr eff-addr) (last-addr (+ eff-addr nbytes -1)) ((unless (and (<= lower-bound first-addr) (<= last-addr upper-bound))) (mv (list :segment-limit-fail (list seg-reg eff-addr lower-bound upper-bound)) 0)) ((the (unsigned-byte 32) lin-addr) (n32 (+ (the (unsigned-byte 32) base) (the (unsigned-byte 32) eff-addr))))) (mv nil lin-addr))) (otherwise (mv (list :unimplemented-proc-mode proc-mode) 0))))
Theorem:
(defthm i64p-of-ea-to-la.lin-addr (implies (i64p eff-addr) (b* (((mv ?flg ?lin-addr) (ea-to-la$inline proc-mode eff-addr seg-reg nbytes x86))) (i64p lin-addr))) :rule-classes :rewrite)
Theorem:
(defthm ea-to-la-is-i64p (implies (i64p eff-addr) (signed-byte-p 64 (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)))) :rule-classes (:rewrite (:type-prescription :corollary (implies (i64p eff-addr) (integerp (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)))) :hints (("Goal" :in-theory '(signed-byte-p integer-range-p)))) (:linear :corollary (implies (i64p eff-addr) (and (<= -9223372036854775808 (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86))) (< (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)) 9223372036854775808))) :hints (("Goal" :in-theory '(signed-byte-p integer-range-p (:e expt)))))))
Theorem:
(defthm ea-to-la-is-i48p-when-no-error (implies (not (mv-nth 0 (ea-to-la proc-mode eff-addr seg-reg nbytes x86))) (signed-byte-p 48 (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)))) :rule-classes (:rewrite (:type-prescription :corollary (implies (not (mv-nth 0 (ea-to-la proc-mode eff-addr seg-reg nbytes x86))) (integerp (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)))) :hints (("Goal" :in-theory '(signed-byte-p integer-range-p)))) (:linear :corollary (implies (not (mv-nth 0 (ea-to-la proc-mode eff-addr seg-reg nbytes x86))) (and (<= -140737488355328 (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86))) (< (mv-nth 1 (ea-to-la proc-mode eff-addr seg-reg nbytes x86)) 140737488355328))) :hints (("Goal" :in-theory '(signed-byte-p integer-range-p (:e expt)))))))
Theorem:
(defthm ea-to-la-of-xw (implies (and (not (equal fld :msr)) (not (equal fld :seg-hidden-base)) (not (equal fld :seg-hidden-limit)) (not (equal fld :seg-hidden-attr)) (not (equal fld :seg-visible))) (equal (ea-to-la proc-mode eff-addr seg-reg nbytes (xw fld index value x86)) (ea-to-la proc-mode eff-addr seg-reg nbytes x86))))
Theorem:
(defthm ea-to-la-when-64-bit-modep-and-not-fs/gs (implies (and (not (equal seg-reg 4)) (not (equal seg-reg 5))) (equal (ea-to-la 0 eff-addr seg-reg nbytes x86) (if (canonical-address-p eff-addr) (mv nil eff-addr) (mv (list :non-canonical-address eff-addr) 0)))))