Grammar-parser-constraints-from-tree-matching
Tree maching constraint theorems for the parser of ABNF grammars.
If a (list of) terminated tree(s) matches a syntactic entity,
the string at the leaves of the tree(s) must satisfy certain constraints.
For example, if a terminated tree matches CRLF,
the string at the leaves of the tree
must be non-empty and start with a carriage return.
The tree matching constraint theorems below
capture constraints of this kind.
While these theorems are not directly related to the parser,
they are motivated by the parser (see below).
Usage
These theorems are used, together with the parsing constraint theorems, to prove the disambiguation theorems, which in turn are used to prove the completeness theorems.
The tree matching constraint theorems have no rule classes.
They are used in the proofs of the disambiguation theorems
via :use hints.
Some tree matching constraint theorems are used
to incrementally prove other tree matching constraint theorems (see below),
also via :use hints.
Scope
There are tree matching constraint theorems
only for some syntactic entities:
just the ones used to prove the disambiguation theorems,
and to incrementally prove other tree matching constraint theorems.
Furthermore, each tree matching constraint theorem only states necessary,
but generally not sufficient,
conditions for the matching of the corresponding syntactic entity:
it states only the constraints used to prove the disambiguation theorems
(and, incrementally, other tree matching constraint theorems).
Most tree matching constraint theorems state constraints
just on the first natural number of the string at the leaves of the tree(s)
(the car),
because most of the grammar is LL(1).
A few tree matching constraint theorems state additional constraints
on the second natural number of the string at the leaves of the tree(s)
(the cadr),
as needed for the LL(2) portions of the grammar.
Since the tree matching constraint theorem
constraints-from-tree-match-ichars
would have to state constraints
either on the first natural number
or on the first and second natural numbers
depending on the instantiation of charstring,
for simplicity this theorem states constraints
on the whole string at the leaves of the tree.
Hypotheses on the Tree(s)
Most tree matching constraint theorems include
hypotheses saying that the trees are terminated.
This ensures that the strings at the leaves of the trees
consist of natural numbers and not rule names,
since the constraints are stated on natural numbers.
A few tree matching constraint theorems do not need these hypotheses
because the corresponding syntactic entities can only be matched
by trees whose (starting) leaves are natural numbers.
For instance, in constraints-from-tree-match-dot-etc,
the group ("." ...) can only be matched
by a tree whose first leaf is a natural number,
upon which the theorem states the constraint.
Hypotheses on the String at the Leaves of the Tree(s)
The tree matching constraint theorems
whose names end with -when-nonempty
include hypotheses saying that the string at the leaves is not empty.
This is because the corresponding syntactic entities may be matched
by a tree with an empty string at the leaves
or by an empty list of trees.
For example,
in constraints-from-tree-match-bin-val-rest-when-nonempty,
the matching tree may have no subtrees,
since the syntactic entity in question
is an option [ 1*("." 1*BIT) / ("-" 1*BIT) ].
As another example,
in constraints-from-tree-list-match-*digit-when-nonempty,
the matching list of trees may be empty.
Proof Methods
The proof of each tree matching constraint theorem
expands tree-match-element-p
or tree-list-match-repetition-p
(depending on whether the theorem applies to
a single tree or to a list of trees).
But most theorems need explicit :expand hints for that:
just enabling the functions does not suffice
(presumably due to ACL2's heuristics for expanding recursive functions).
Since many repetitions consist of one element,
the rewrite rule tree-list-match-repetition-p-of-1-repetition
is used in many proofs.
It is enabled just before the tree matching constraint theorems
and disabled just after.
Except for
constraints-from-tree-match-dot-etc and
constraints-from-tree-match-dash-etc,
the alternations and concatenations of the syntactic entities being matched
have an explicit list structure,
and so the proof automatically uses rewrite rules like
tree-list-list-match-alternation-p-of-cons-alternation.
In contrast,
for constraints-from-tree-match-dot-etc and
for constraints-from-tree-match-dash-etc,
we expand
tree-list-list-match-alternation-p and
tree-list-list-match-concatenation-p
via explicit :expand hints,
because just enabling these two functions does not suffice
(presumably due to ACL2's heuristics for expanding recursive functions).
The expansions described above reduce
the matching of the (list of) tree(s) with syntactic entities
in the hypothesis,
to the matching of subtrees with syntactic sub-entities.
Tree matching constraint theorems for these subtrees
are used in the proofs for the containing trees, via :use hints.
For example, the proof of constraints-from-tree-match-alpha
uses constraints-from-tree-match-ichars twice,
once for each alternative of ALPHA that the subtree may match.
As the matching of the tree(s) in the hypotheses of the theorems
is reduced, in the proofs, to the matching of subtrees as just explained,
we also need
to reduce the tree strings to subtree strings and
to reduce the terminated tree hypotheses to terminated subtree hypotheses,
so that the tree matching constraint theorems for the subtrees apply.
We accomplish the latter reductions by expanding the definitions of
tree->string,
tree-list->string,
tree-list-list->string, and
tree-terminatedp.
We enable them just before the tree matching constraint theorems
and we disable them just after.
Enabled rules for
tree-list-terminatedp and tree-list-list-terminatedp
suffice, so we do not need to enable these two functions.
Since constraints-from-tree-match-ichars
states constraints with nats-match-insensitive-chars-p,
and since that theorem is used in the proofs
of several other tree matching constraint theorems,
the proofs of the latter theorems
use the definition of nat-match-insensitive-char-p,
which we enable just before the tree matching constraint theorems
and we disable just after.
The rewrite rules
nats-match-sensitive-chars-p-when-atom-chars and
nats-match-sensitive-chars-p-when-cons-chars
reduce nats-match-insensitive-chars-p
to nat-match-insensitive-char-p
because the chars argument is constant.
Subtopics
- Constraints-from-tree-match-?repeat-when-nonempty
- Constraints induced by a terminated tree
that matches [repeat]
and that has a non-empty string at the leaves.
- Constraints-from-tree-match-repetition
- Constraints induced by a terminated tree that matches
repetition.
- Constraints-from-tree-match-hex-val-rest-when-nonempty
- Constraints induced by a tree
that matches [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ]
and that has a non-empty string at the leaves.
- Constraints-from-tree-match-elements
- Constraints induced by a terminated tree that matches elements.
- Constraints-from-tree-match-dot-etc
- Constraints induced by a tree that matches a group ("." ...).
- Constraints-from-tree-match-dec-val-rest-when-nonempty
- Constraints induced by a tree
that matches [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ]
and that has a non-empty string at the leaves.
- Constraints-from-tree-match-dash-etc
- Constraints induced by a tree that matches a group ("-" ...).
- Constraints-from-tree-match-concatenation
- Constraints induced by a terminated tree that matches
concatenation.
- Constraints-from-tree-match-bin-val-rest-when-nonempty
- Constraints induced by a tree
that matches [ 1*("." 1*BIT) / ("-" 1*BIT) ]
and that has a non-empty string at the leaves.
- Constraints-from-tree-match-alternation
- Constraints induced by a terminated tree that matches
alternation.
- Constraints-from-tree-match-?-%i-when-nonempty
- Constraints induced by a tree
that matches [ "%i" ]
and that has a non-empty string at the leaves.
- Constraints-from-tree-match-ichars
- Constraints induced by a tree that matches
a case-insensitive character value notation.
- Constraints-from-tree-match-case-insensitive-string
- Constraints induced by a terminated tree that matches
case-insensitive-string.
- Constraints-from-tree-match-alpha
- Constraints induced by a terminated tree that matches ALPHA.
- Constraints-from-tree-list-match-*digit-when-nonempty
- Constraints induced by a non-empty list of terminated trees
that matches *DIGIT.
- Constraints-from-tree-match-*digit-star-*digit
- Constraints induced by a terminated tree that matches
(*DIGIT "*" *DIGIT).
- Constraints-from-tree-match-rulename
- Constraints induced by a terminated tree that matches rulename.
- Constraints-from-tree-match-rule
- Constraints induced by a terminated tree that matches rule.
- Constraints-from-tree-match-in-range
- Constraints induced by a tree that matches
a range numeric value notation.
- Constraints-from-tree-match-element
- Constraints induced by a terminated tree that matches element.
- Constraints-from-tree-match-case-sensitive-string
- Constraints induced by a terminated tree that matches
case-sensitive-string.
- Constraints-from-tree-match-bin/dec/hex-val
- Constraints induced by a terminated tree that matches
(bin-val / dec-val / hex-val).
- Constraints-from-tree-list-match-*cwsp-when-nonempty
- Constraints induced by a non-empty list of terminated trees
that matches *c-wsp.
- Constraints-from-tree-match-vchar
- Constraints induced by a terminated tree that matches VCHAR.
- Constraints-from-tree-match-repeat
- Constraints induced by a terminated tree that matches repeat.
- Constraints-from-tree-match-quoted-string
- Constraints induced by a terminated tree that matches
quoted-string.
- Constraints-from-tree-match-num-val
- Constraints induced by a terminated tree that matches num-val.
- Constraints-from-tree-match-exact
- Constraints induced by a tree that matches
a direct numeric value notation.
- Constraints-from-tree-match-equal-/-equal-slash
- Constraints induced by a terminated tree that matches
("=" / "=/").
- Constraints-from-tree-match-digit
- Constraints induced by a terminated tree that matches DIGIT.
- Constraints-from-tree-match-defined-as
- Constraints induced by a terminated tree that matches
defined-as.
- Constraints-from-tree-match-cnl-wsp
- Constraints induced by a terminated tree that matches
(c-nl WSP).
- Constraints-from-tree-list-match-1*digit
- Constraints induced by a list of terminated trees that matches
1*DIGIT.
- Constraints-from-tree-match-wsp
- Constraints induced by a terminated tree that matches WSP.
- Constraints-from-tree-match-sp
- Constraints induced by a terminated tree that matches SP.
- Constraints-from-tree-match-prose-val
- Constraints induced by a terminated tree that matches prose-val.
- Constraints-from-tree-match-option
- Constraints induced by a terminated tree that matches option.
- Constraints-from-tree-match-htab
- Constraints induced by a terminated tree that matches HTAB.
- Constraints-from-tree-match-hex-val
- Constraints induced by a terminated tree that matches hex-val.
- Constraints-from-tree-match-group
- Constraints induced by a terminated tree that matches group.
- Constraints-from-tree-match-dquote
- Constraints induced by a terminated tree that matches DQUOTE.
- Constraints-from-tree-match-dec-val
- Constraints induced by a terminated tree that matches dec-val.
- Constraints-from-tree-match-cwsp
- Constraints induced by a terminated tree that matches c-wsp.
- Constraints-from-tree-match-crlf
- Constraints induced by a terminated tree that matches CRLF.
- Constraints-from-tree-match-cr
- Constraints induced by a terminated tree that matches CR.
- Constraints-from-tree-match-comment
- Constraints induced by a terminated tree that matches comment.
- Constraints-from-tree-match-cnl
- Constraints induced by a terminated tree that matches c-nl.
- Constraints-from-tree-match-char-val
- Constraints induced by a terminated tree that matches char-val.
- Constraints-from-tree-match-bin-val
- Constraints induced by a terminated tree that matches bin-val.