Lex an escape sequence.
This is called when we expect an escape sequence, after reading the initial backslash.
We read the next character, and based on that we handle the different escape sequences. We return the position of the last character of the escape sequence.
If the next character is one for a simple escape, we return the simple escape.
If instead the next character is an octal digit, we read possibly another one and possibly yet another one, to see whether the octal escape sequence consists of one, two, or three octal digits.
If instead the next character starts a hexadecimal escape sequence, we proceed to read zero or more hexadecimal digits, as many as possible, and we check that there is at least one.
If instead the next character starts a universal character name, we read one or two quadruples of hexadecimal digits, based on the kind of escape sequence.
In all other cases, it is an error: although this starts like an escape sequence, it is not one.
Function:
(defun lex-escape-sequence (parstate) (declare (xargs :stobjs (parstate))) (declare (xargs :guard (parstatep parstate))) (let ((__function__ 'lex-escape-sequence)) (declare (ignorable __function__)) (b* (((reterr) (irr-escape) (irr-position) parstate) ((erp char pos parstate) (read-char parstate))) (cond ((not char) (reterr-msg :where (position-to-msg pos) :expected "a single quote ~ or a double quote ~ or a question mark ~ or a backslash ~ or a letter in {a, b, f, n, r, t, v, u, U, x} ~ or an octal digit" :found (char-to-msg char))) ((= char (char-code #\')) (retok (escape-simple (simple-escape-squote)) pos parstate)) ((= char (char-code #\")) (retok (escape-simple (simple-escape-dquote)) pos parstate)) ((= char (char-code #\?)) (retok (escape-simple (simple-escape-qmark)) pos parstate)) ((= char (char-code #\\)) (retok (escape-simple (simple-escape-bslash)) pos parstate)) ((= char (char-code #\a)) (retok (escape-simple (simple-escape-a)) pos parstate)) ((= char (char-code #\b)) (retok (escape-simple (simple-escape-b)) pos parstate)) ((= char (char-code #\f)) (retok (escape-simple (simple-escape-f)) pos parstate)) ((= char (char-code #\n)) (retok (escape-simple (simple-escape-n)) pos parstate)) ((= char (char-code #\r)) (retok (escape-simple (simple-escape-r)) pos parstate)) ((= char (char-code #\t)) (retok (escape-simple (simple-escape-t)) pos parstate)) ((= char (char-code #\v)) (retok (escape-simple (simple-escape-v)) pos parstate)) ((and (<= (char-code #\0) char) (<= char (char-code #\7))) (b* (((erp char2 pos2 parstate) (read-char parstate))) (cond ((and char2 (<= (char-code #\0) char2) (<= char2 (char-code #\7))) (b* (((erp char3 pos3 parstate) (read-char parstate))) (cond ((and char3 (<= (char-code #\0) char3) (<= char3 (char-code #\7))) (retok (escape-oct (oct-escape-three (code-char char) (code-char char2) (code-char char3))) pos3 parstate)) (t (b* ((parstate (if char3 (unread-char parstate) parstate))) (retok (escape-oct (oct-escape-two (code-char char) (code-char char2))) pos2 parstate)))))) (t (b* ((parstate (if char2 (unread-char parstate) parstate))) (retok (escape-oct (oct-escape-one (code-char char))) pos parstate)))))) ((= char (char-code #\x)) (b* (((erp hexdigs last-pos next-pos parstate) (lex-*-hexadecimal-digit pos parstate))) (if hexdigs (retok (escape-hex hexdigs) last-pos parstate) (reterr-msg :where (position-to-msg next-pos) :expected "one or more hexadecimal digits" :found "none")))) ((= char (char-code #\u)) (b* (((erp quad pos parstate) (lex-hex-quad parstate))) (retok (escape-univ (univ-char-name-locase-u quad)) pos parstate))) ((= char (char-code #\U)) (b* (((erp quad1 & parstate) (lex-hex-quad parstate)) ((erp quad2 pos parstate) (lex-hex-quad parstate))) (retok (escape-univ (make-univ-char-name-upcase-u :quad1 quad1 :quad2 quad2)) pos parstate))) (t (reterr-msg :where (position-to-msg pos) :expected "a single quote ~ or a double quote ~ or a question mark ~ or a backslash ~ or a letter in {a, b, f, n, r, t, v, u, U, x} ~ or an octal digit" :found (char-to-msg char)))))))
Theorem:
(defthm escapep-of-lex-escape-sequence.escape (b* (((mv acl2::?erp ?escape ?last-pos ?new-parstate) (lex-escape-sequence parstate))) (escapep escape)) :rule-classes :rewrite)
Theorem:
(defthm positionp-of-lex-escape-sequence.last-pos (b* (((mv acl2::?erp ?escape ?last-pos ?new-parstate) (lex-escape-sequence parstate))) (positionp last-pos)) :rule-classes :rewrite)
Theorem:
(defthm parstatep-of-lex-escape-sequence.new-parstate (implies (parstatep parstate) (b* (((mv acl2::?erp ?escape ?last-pos ?new-parstate) (lex-escape-sequence parstate))) (parstatep new-parstate))) :rule-classes :rewrite)
Theorem:
(defthm parsize-of-lex-escape-sequence-uncond (b* (((mv acl2::?erp ?escape ?last-pos ?new-parstate) (lex-escape-sequence parstate))) (<= (parsize new-parstate) (parsize parstate))) :rule-classes :linear)
Theorem:
(defthm parsize-of-lex-escape-sequence-cond (b* (((mv acl2::?erp ?escape ?last-pos ?new-parstate) (lex-escape-sequence parstate))) (implies (not erp) (<= (parsize new-parstate) (1- (parsize parstate))))) :rule-classes :linear)