Check that the numerical constant just read is a full preprocessing number.
(check-full-ppnumber ends-in-e parstate) → (mv erp new-parstate)
In C, integer and floating constants are not lexed ``directly''
according to their grammar rules in [C].
First, preprocessing tokens must be recognized,
defined by preprocessing-token in [C:6.4] [C:A.1.1].
These include preprocessing numbers,
defined by pp-number in [C:6.4.8] [C:A.1.9],
which start with a digit, optionally preceded by a dot,
and are followed by identifier characters (including digits and letter),
as well as plus and minus signs immediately preceded by exponent letters,
as well as periods
[C:6.4.8/2].
Thus, preprocessing numbers lexically include
all integer and floating constants [C:6.4.8/3],
and much more, e.g.
As part of translation phase 7 [C:5.1.1.2], preprocessing tokens are converted to tokens. This includes converting preprocessing numbers to integer and floating constants, checking that they are in fact integer or floating constants, e.g. the example above would not pass the checks.
In our lexer, we lex integer and floating constants directly, but we need to ensure that the behavior is the same as if we had gone through preprocessing numbers. We do that as follow: after fully recognizing an integer or floating constant, we check whether there is a subsequent character of the kind that would be part of a preprocessing number: if there is, it is an error, because the preprocessing number cannot be converted to a constant, due to the extra character(s).
This function performs this check. It is called after an integer or floating constant has been recognized completely during lexing. This function attempts to read the next character, and unless there is no next character, or the next character is one that would not be part of the preprocessing number, an error is returned. In case of success, there is no additional result (it is just a check), except for the possibly updated parser state.
If the next character exists and is
a letter or a digit or an underscore or a dot,
it would be part of the preprocessing number,
so we return an error.
Otherwise, the check succeeds, except in one case.
The case is that the next character is
Note that, because of the rules on preprocessing numbers, in C this code is syntactically illegal
int x = 0xe+1; // illegal
whereas this code is syntactically legal
int x = 0xf+1; // legal
The reason is that
Function:
(defun check-full-ppnumber (ends-in-e parstate) (declare (xargs :stobjs (parstate))) (declare (xargs :guard (and (booleanp ends-in-e) (parstatep parstate)))) (let ((__function__ 'check-full-ppnumber)) (declare (ignorable __function__)) (b* (((reterr) parstate) ((erp char pos parstate) (read-char parstate)) ((when (not char)) (retok parstate)) ((when (or (and (<= (char-code #\A) char) (<= char (char-code #\Z))) (and (<= (char-code #\a) char) (<= char (char-code #\a))) (and (<= (char-code #\0) char) (<= char (char-code #\9))) (= char (char-code #\_)) (= char (char-code #\.)) (and ends-in-e (or (= char (char-code #\+)) (= char (char-code #\-)))))) (reterr-msg :where (position-to-msg pos) :expected (msg "a character other than ~ a letter ~ or a digit ~ or an underscore ~ or a dot~s0" (if ends-in-e " or a plus or a minus" "")) :found (char-to-msg char))) (parstate (unread-char parstate))) (retok parstate))))
Theorem:
(defthm parstatep-of-check-full-ppnumber.new-parstate (implies (parstatep parstate) (b* (((mv acl2::?erp ?new-parstate) (check-full-ppnumber ends-in-e parstate))) (parstatep new-parstate))) :rule-classes :rewrite)
Theorem:
(defthm parsize-of-check-full-ppnumber-uncond (b* (((mv acl2::?erp ?new-parstate) (check-full-ppnumber ends-in-e parstate))) (<= (parsize new-parstate) (parsize parstate))) :rule-classes :linear)