Constructor-pattern-match-macros
How to write pattern-match macros for custom constructors.
Here we discuss how constructor pattern-match macros work in
conjunction with pattern-match. In most cases the user does not need to be
concerned with the internals discussed here; see def-pattern-match-constructor for an easy way to get pattern-match to
recognize a user-defined form.
The trick behind pattern-match is that whenever a constructor cname is
seen in a pattern, a call to the macro named cname-pattern-matcher is
returned and macro expansion continues by expanding that macro. Because of
this, all the unprocessed parts of the original pattern-match call must be
passed through that macro. By the design of the framework, the constructor
macro will only operate on a few of the arguments given to it, passing the rest
through to the main function that performs pattern matching,
pattern-bindings-list.
The arguments given to the constructor's macro are
(term args tests bindings lhses rhses pmstate)
The arguments that pattern-bindings-list takes are
(lhses rhses tests bindings pmstate)
The argument list of pattern-bindings-list is a subset of that of the
constructor's macro. We will discuss how to form the arguments for
pattern-bindings-list from those given to the constructor macro.
The constructor macro is responsible for error handling in the case of a
nonsensical invocation of the constructor (primarily, one with the wrong number
of arguments), adding appropriate tests to determine whether term can
match the pattern, and ``lining up'' the arguments given to the constructor in
the pattern with the appropriate destructors applied to the term in
question.
We will go through the arguments given to the macro and outline what needs
to be done with them to fulfill the above obligations.
term is a term which should evaluate to the current part of the input
that we are trying to match. If the original term given as input to pattern
match was x, then term may be something like (car (nth 4 (cdr x))).
Therefore we need to add tests to determine whether this is of the correct form
to be matched to something created by our constructor, and we need to apply the
correct destructors to it to break it down for further matching.
args is the list of arguments given to the constructor in the pattern
that we're matching to. The whole pattern that term is supporsed to match
is our constructor cname applied to args. For error checking we need
to ensure that args is the correct length for the call to our constructor
to make sense. It is also helpful to ensure that args is a true-list and
issue a helpful error message if not. Each element of args must also be
paired with an application of a destructor to term to continue pattern
matching. If, as is usually the case, the arguments we're expecting are to be
read as subpatterns, the best approach is not to examine them individually but
to let pattern-bindings-list do the real work.
tests is an accumulated list of tests to be applied to the input to
determine whether it matches the pattern. We need to prepend to this list any
necessary tests on term so as to determine whether it could be formed by
our constructor.
bindings is an accumulated list of variables that will be bound to
applications of destructors to the input term. While the results of the
processing that our macro does will have a direct effect on this list, most of
the time it should be passed through to pattern-bindings-list and we
should instead manipulate lhses and rhses:
lhses and rhses are lists of, respectively, subpatterns of the
top-level pattern that we're processing and corresponding subterms of the input
term that will be matched to the patterns. In most cases what we'll do is
prepent args to lhses while prepending a list of each of our
destructors applied to term to rhses. pattern-bindings-list
will then handle the details of variable bindings and recursive subpattern
matching as determined by the contents of lhses. Each macro must maintain
the invariant that lhses and rhses are the same length; if this isn't
the case there are probably other things going wrong as well. The intuition
behind these names is that eventually patterns in lhses break down to
variables, which are bound to corresponding subterms broken down from elements
of rhses. We're using LHS and RHS here as in an assignment statement in
some imperative language, as opposed to the sense used when talking about a
rewrite rule.
pmstate contains the expression to be evaluated if the pattern matches,
the list of tests to be tried before confirming a match, declarations, the rest
of the clauses to match to in case this match fails, and the name of the macro
to pass the final results to. These are grouped together specifically because
they don't have to do with the actual pattern-matching but must be kept intact
through the various iterations of macro expansion. This argument should
*always* be passed through intact to pattern-bindings-list unless you're trying
to really confuse your users.
An example of a very typical constructor macro is the one for cons, which is
automatically generated by def-pattern-match-constructor:
(defmacro
cons-pattern-matcher
(term args tests bindings lhses rhses pmstate)
(cond
;; First check args for well-formedness: it should always be a true-list of
;; length 2, since any other argument list to cons is ill-formed.
((not (true-listp args))
(er hard 'top-level ``badly formed expression: ~~x0~~%''
(cons 'cons args)))
((not (= (len args) 2))
(er hard 'top-level
``Wrong number of arguments to CONS in pattern matching: ~~x0~~%''
(CONS 'CONS ARGS)))
(t (let
;; Push destructor applications (car term) and (cdr term) onto rhses
((rhses (append (list (list 'car term) (list 'cdr term)) rhses))
;; Push the args onto lhses (they must occur in the order corresponding
;; to the order of the destructor calls pushed onto rhses.)
(lhses (append args lhses))
;; Push a test that term is consp onto tests
(tests (cons (list 'consp term) tests)))
;; Finally call pattern-bindings-list again.
(pattern-bindings-list lhses rhses tests bindings pmstate)))))
If there are no errors, this simply makes three changes to the existing
arguments: it prepends the two subterms (car term) and (cdr term)
onto rhses and the list of arguments to lhses and adds the test
(consp term) to tests. It then calls pattern-bindings-list.
The macro for list works the same way, but could not have been generated by
def-pattern-match-constructor because it handles variable length argument
lists. It again simply prepends all arguments to lhses, prepends a list
of applications of destructors to the input term to rhses (try evaluating
(list-of-nths 0 5 'x) to see the resulting form), and tests whether the
input term is of a suitable form, in this case whether it is a true-list of the
same length as the argument list.
(defmacro list-pattern-matcher
(term args tests bindings lhses rhses pmstate)
;; Ensure that args is a true-list; it may be any length.
(if (not (true-listp args))
(er hard 'top-level ``Badly formed expression: ~~x0~~%''
(cons 'list args))
(let
;; list-of-nths produces a list of calls to nth, from (nth 0 term) up
;; to (nth (- (length args) 1) term).
((rhses (append (list-of-nths 0 (length args) term) rhses))
;; push args onto lhses
(lhses (append args lhses))
;; Require that term is a true-list with length corresponding to that
;; of args.
(tests (append `((true-listp ,term)
(= (len ,term) ,(length args)))
tests)))
(pattern-bindings-list lhses rhses tests bindings pmstate))))
A nonstandard, but still correct, example is the one for list*, which
instead of doing the processing itself replaces its pattern with an equivalent
cons structure so that the cons macro will do all the work: to illustrate what
is prepended to lhses, try running (list*-macro (list 'a 'b 'c 'd)).
In this case no test needs to be added because the cons macro takes care of it.
Note that we could easily cause an infinite loop in macro expansion by abusing
this type of thing and, for example, pushing a new list* pattern onto
lhses.
(defmacro list*-pattern-matcher
(term args tests bindings lhses rhses pmstate)
;; Check that args is a true-listp.
;; If this is counterintuitive, consider that this would suggest syntax such
;; as (list* a b . c).
(if (not (true-listp args))
(er hard 'top-level ``Badly formed expression: ~~x0~~%''
(cons 'list* args))
(let
;; Just push term onto rhses
((rhses (cons term rhses))
;; list*-macro is the very function that list* uses to expand an
;; invocation into a nest of conses. Since we have a cons pattern
;; matcher already, we just take advantage of this.
(lhses (cons (list*-macro args) lhses)))
;; No additional tests are necessary - we trust in cons-pattern-matcher
;; to take care of that.
(pattern-bindings-list lhses rhses tests bindings pmstate))))
Another nonstandard example is raw-pattern-matcher, which reverts the
behavior of pattern-match to that of case-match for the term inside; in fact,
it just calls the function that does the work for case-match -
match-tests-and-bindings - and uses its results. In this case, since the
argument to our constructor is not taken to be a subpattern of the form handled
by pattern-bindings-list, we manipulate bindings directly rather than
dealing with lhses and rhses. It is fortunate that the form of the
tests and bindings variables for match-tests-and-bindings is the same as
ours or we would need to do more processing of them.
(defmacro raw-pattern-matcher
(term args tests bindings lhses rhses pmstate)
;; Args should be a list of length 1 - just a pattern.
(if (or (atom args)
(cdr args))
(er hard 'top-level ``Badly formed expression: ~~x0~~%''
(cons 'raw args))
;; match-tests-and-bindings takes a term, a case-match pattern, and a list
;; of tests and bindings; it returns a new version of tests and bindings
;; including the ones necessary to match the term to the pattern.
; Matt K. mod, 4/2024: Account for new dups argument and return value.
(mv-let (tests bindings dups)
(match-tests-and-bindings term (car args) tests bindings nil)
(declare (ignore dups))
;; We then pass the new tests and bindings to
;; pattern-bindings-list.
(pattern-bindings-list lhses rhses tests bindings pmstate))))
Also try looking at the definitions for bind-pattern-matcher,
any-pattern-matcher, and both force-pattern-matcher and
force-match-remove-tests-pattern-matcher as further nonstandard
examples.