Lexes the UTF-8 bytes into a list of lexemes.
(lexemize-pfcs-from-bytes pfcs-bytes) → pfcs-lexemes
A lexeme is a token or whitespace.
If the input cannot be fully lexed, a reserrp is returned.
Function:
(defun lexemize-pfcs-from-bytes (pfcs-bytes) (declare (xargs :guard (nat-listp pfcs-bytes))) (let ((__function__ 'lexemize-pfcs-from-bytes)) (declare (ignorable __function__)) (b* (((unless (unsigned-byte-listp 8 pfcs-bytes)) (reserrf (cons :invalid-octets pfcs-bytes))) (codepoints (acl2::utf8=>ustring pfcs-bytes)) ((unless (nat-listp codepoints)) (reserrf (cons :invalid-utf-8 pfcs-bytes)))) (lexemize-pfcs codepoints))))
Theorem:
(defthm tree-list-resultp-of-lexemize-pfcs-from-bytes (b* ((pfcs-lexemes (lexemize-pfcs-from-bytes pfcs-bytes))) (abnf::tree-list-resultp pfcs-lexemes)) :rule-classes :rewrite)