Lex an escape sequence.
This is called when we expect an escape sequence, after reading the initial backslash.
We read the next character, and based on that we handle the different escape sequences. We return the position of the last character of the escape sequence.
If the next character is one for a simple escape, we return the simple escape.
If instead the next character is an octal digit, we read possibly another one and possibly yet another one, to see whether the octal escape sequence consists of one, two, or three octal digits.
If instead the next character starts a hexadecimal escape sequence, we proceed to read zero or more hexadecimal digits, as many as possible, and we check that there is at least one.
If instead the next character starts a universal character name, we read one or two quadruples of hexadecimal digits, based on the kind of escape sequence.
In all other cases, it is an error: although this starts like an escape sequence, it is not one.
Function:
(defun lex-escape-sequence (pstate) (declare (xargs :guard (parstatep pstate))) (let ((__function__ 'lex-escape-sequence)) (declare (ignorable __function__)) (b* (((reterr) (irr-escape) (irr-position) (irr-parstate)) ((erp char pos pstate) (read-char pstate))) (cond ((not char) (reterr-msg :where (position-to-msg pos) :expected "a single quote ~ or a double quote ~ or a question mark ~ or a backslash ~ or a letter in {a, b, f, n, r, t, v, u, U, x} ~ or an octal digit" :found (char-to-msg char))) ((= char (char-code #\')) (retok (escape-simple (simple-escape-squote)) pos pstate)) ((= char (char-code #\")) (retok (escape-simple (simple-escape-dquote)) pos pstate)) ((= char (char-code #\?)) (retok (escape-simple (simple-escape-qmark)) pos pstate)) ((= char (char-code #\\)) (retok (escape-simple (simple-escape-bslash)) pos pstate)) ((= char (char-code #\a)) (retok (escape-simple (simple-escape-a)) pos pstate)) ((= char (char-code #\b)) (retok (escape-simple (simple-escape-b)) pos pstate)) ((= char (char-code #\f)) (retok (escape-simple (simple-escape-f)) pos pstate)) ((= char (char-code #\n)) (retok (escape-simple (simple-escape-n)) pos pstate)) ((= char (char-code #\r)) (retok (escape-simple (simple-escape-r)) pos pstate)) ((= char (char-code #\t)) (retok (escape-simple (simple-escape-t)) pos pstate)) ((= char (char-code #\v)) (retok (escape-simple (simple-escape-v)) pos pstate)) ((and (<= (char-code #\0) char) (<= char (char-code #\7))) (b* (((erp char2 pos2 pstate) (read-char pstate))) (cond ((and char2 (<= (char-code #\0) char2) (<= char2 (char-code #\7))) (b* (((erp char3 pos3 pstate) (read-char pstate))) (cond ((and char3 (<= (char-code #\0) char3) (<= char3 (char-code #\7))) (retok (escape-oct (oct-escape-three (code-char char) (code-char char2) (code-char char3))) pos3 pstate)) (t (b* ((pstate (if char3 (unread-char pstate) pstate))) (retok (escape-oct (oct-escape-two (code-char char) (code-char char2))) pos2 pstate)))))) (t (b* ((pstate (if char2 (unread-char pstate) pstate))) (retok (escape-oct (oct-escape-one (code-char char))) pos pstate)))))) ((= char (char-code #\x)) (b* (((erp hexdigs last-pos next-pos pstate) (lex-*-hexadecimal-digit pos pstate))) (if hexdigs (retok (escape-hex hexdigs) last-pos pstate) (reterr-msg :where (position-to-msg next-pos) :expected "one or more hexadecimal digits" :found "none")))) ((= char (char-code #\u)) (b* (((erp quad pos pstate) (lex-hex-quad pstate))) (retok (escape-univ (univ-char-name-locase-u quad)) pos pstate))) ((= char (char-code #\U)) (b* (((erp quad1 & pstate) (lex-hex-quad pstate)) ((erp quad2 pos pstate) (lex-hex-quad pstate))) (retok (escape-univ (make-univ-char-name-upcase-u :quad1 quad1 :quad2 quad2)) pos pstate))) (t (reterr-msg :where (position-to-msg pos) :expected "a single quote ~ or a double quote ~ or a question mark ~ or a backslash ~ or a letter in {a, b, f, n, r, t, v, u, U, x} ~ or an octal digit" :found (char-to-msg char)))))))
Theorem:
(defthm escapep-of-lex-escape-sequence.escape (b* (((mv acl2::?erp ?escape ?last-pos ?new-pstate) (lex-escape-sequence pstate))) (escapep escape)) :rule-classes :rewrite)
Theorem:
(defthm positionp-of-lex-escape-sequence.last-pos (b* (((mv acl2::?erp ?escape ?last-pos ?new-pstate) (lex-escape-sequence pstate))) (positionp last-pos)) :rule-classes :rewrite)
Theorem:
(defthm parstatep-of-lex-escape-sequence.new-pstate (b* (((mv acl2::?erp ?escape ?last-pos ?new-pstate) (lex-escape-sequence pstate))) (parstatep new-pstate)) :rule-classes :rewrite)
Theorem:
(defthm parsize-of-lex-escape-sequence-uncond (b* (((mv acl2::?erp ?escape ?last-pos ?new-pstate) (lex-escape-sequence pstate))) (<= (parsize new-pstate) (parsize pstate))) :rule-classes :linear)
Theorem:
(defthm parsize-of-lex-escape-sequence-cond (b* (((mv acl2::?erp ?escape ?last-pos ?new-pstate) (lex-escape-sequence pstate))) (implies (not erp) (<= (parsize new-pstate) (1- (parsize pstate))))) :rule-classes :linear)