Search-engine friendly clone of the ACL2 documentation.

Expr

Fixtype of expressions [C17:6.5] [C17:A.2.1].

This is a tagged union type, introduced by fty::deftagsum.

This corresponds to expression in the grammar in [C17].

Given that the abstract syntax is tree-structured, we do not explicitly introduce the various kinds of binary expressions defined in the grammar in [C17], and instead use a single kind of binary expression consisting of two sub-expressions and a binary operator. Furthermore, we always use the general fixtype for expressions as components of other constructs where the grammar in [C17] uses more specific kinds of expressions, like assignment-expression in generic-selection. This means that our fixtypes are a bit more general, but we can use separate predicates to enforce restrictions.

Some kinds of expressions may include some additional information (e.g. identifiers), such as types calculated during validation. This is an instance of the additional information discussed in abstract-syntax.

In order to capture possibly redundant parenthesization from the concrete syntax, we include, in this fixtype, a case :paren for an explicitly parenthesized expression.

There is a syntactic overlap between identifier expressions (variables) and (enumeration) constants. This ambiguity cannot be resolved purely syntactically. During parsing, we classify those as identifier expressions, and defer a possible re-classification to enumeration constant during a post-parsing static semantic analysis.

Instead of a single string literal, we allow a list of them, which should be non-empty, although we do not capture this constraint. This mirrors the ABNF grammar; we preserve the information about adjacent string literals, as opposed to concatenating them into one.

The :sizeof case of this fixtype captures sizeof applied to a type name. The sizeof applied to an expression is instead captured in the :unary case, since unop includes sizeof for expressions. As explained in amb-expr/tyname, there is a complex syntactic overlap between expressions and type names; thus, an expression of the form sizeof(X), where X is in that syntactic overlap, is inherently ambiguous. (The simplest case is when X is an identifier, but as explained in amb-expr/tyname there are infinitely many cases.) This is captured by the :sizeof-ambig case, which contains an amb-expr/tyname.

The :alignof case of this fixtype includes an indication of the underscore variant. Note that the variant without underscores represents the standard _Alignof, not the non-existing alignof, while the other two represent __alignof and __alignof__; see the ABNF grammar. Presumably, _Alignof was added to the grammar after __alignof and __alignof__ were GCC extensions.

We use different cases, :member and :memberp for the . and -> operators.

For compound literals, we also capture the presence or absence of the final comma just after the initializer-list. We formalize initializer-list [C17:6.7.9] [C17:A.2.2] as a list (which should be non-empty, unless GCC extensions are enabled) of pairs each consisting of some designators and an initializer (see desiniter.

The comma sequentialization operator is modeled as its own case in this fixtype. An alternative is to include that into binop. Another alternative is to model it as taking a list of expressions (it associates to the left). But for now the current model is adequate.

The last five kinds of expressions capture syntactically ambiguous expressions of the forms

( X ) IncDec ( E ) Pr
( X ) IncDec * E
( X ) IncDec + E
( X ) IncDec - E
( X ) IncDec & E

where X is an ambiguous expression or type name, IncDec is a sequence of zero or more increment and decrement operators ++ and --, E is an expression, and Pr is a possibly empty rest of a postfix expression. All of this is now explained in detail.

If X is an ambiguous expression or type name (i.e. something captured by amb-expr/tyname), then (X) could either start a proper cast expression (if X is a type name) or be or start an expression that is or starts with a parenthesized primary expression (if X is an expression). Note that (X) could also start a compound literal, but in that case we would be able to disambiguate X to be a type name and not an expression, because an expression (X) cannot be followed by an open curly brace. So if (X) is not followed by an open curly brace, there are a number of other tokens that may follow: some would let us disambiguate X to be either a type name or an expression; but in other cases it is not possible to disambiguate X, and so, similarly to :sizeof-ambig, we capture the ambiguous constructs explicitly in our abstract syntax. But for these ambiguous casts, the situation is more complex.

In the five patterns shown above, it is easier to ignore the IncDec part at first, pretending it is not there. In the first pattern, the Pr is either empty or one or more of the constructs that may be cascaded in postfix expressions. For instance, Pr could have the form [3].mem(ab)++, which consists of an array access [3], a member access .mem, a function call (ab) where ab is the argument, and a postincrement operator ++. In this situation, if X is an expression then (X)(E) is a function call that precedes the array access; but if X is a type name, then (E) is the start of a postfix expression. Note that in general an expression E could be a comma-separated sequence of (assignment) expressions, but that still looks like a function call, with multiple arguments: see the isomorphism between the grammar rules for argument-expression-list and expression in [C17]. The two situations cannot be disambiguated purely syntactically. So the :cast/call-ambig case of this fixtype captures this ambiguous situation: it is either a cast to X or a call of X, where the X component (an amb-expr/tyname) is either a type (name) or a function, and where the (E)Pr component (an expression) is either the argument of the cast or the rest of the postfix expression. Currently this fixtype does not capture the requirement that the expression has in fact the form (E)Pr, but this should be an invariant after parsing, and it can be enforced externally to this fixtype.

The addition of IncDec maintains the ambiguity. If X is a type name, the sequence of operators are pre-increment/decrement ones that are part of the argument of the cast, and whose final argument is (E)Pr. If instead X is an expression, we have a postfix expression starting with X, continuing with those as post-increment/decrement operators, and ending with (E)Pr. So the :cast/call-ambig case of this fixtype includes, between the components for X and (E)Pr, a list of zero or more increment and decrement operators.

The other four of the five patterns shown earlier are more uniform with each other. Again, ignore IncDec initially. The issue here is that the operators *, +, -, & are both unary and binary. Thus, if X is a type name, the * E or + E or - E or & E is a unary expression that is the argument of the cast; the operator is unary. If instead X is an expression, the operator is binary with operands (X) and E. The cases :cast/mul-ambig, :cast/add-ambig, :cast/sub-ambig, and :cast/and-ambig of this fixtype capture these ambiguous situations. They are either casts or multiplications/additions/subtractions/conjunctions. Their first component is X, and their last component is E. Their middle component is a list of zero or more increment and decrement operators that may be in between, i.e. IncDec in the patterns shown earlier. Their presence maintain the ambiguity: if X is a type name, they are pre-increment and pre-decrement operators applied to * E or + E or - E or & E; if X is an expression, they are post-increment and post-decrement operators applied to X, forming the left operand of the binary operators.

These should capture all the possible ambiguous cases. One needs to look, in the grammar, at what can follow the (X), where X is an ambiguous type name or expression. The cases explained above lead to unresolvable syntactic ambiguities (which can be resolved semantically, of course). Other cases lead to disambiguation. For instance, if (X) is followed by !, then X must be a type name, and the ! must start a unary expression that is the argument of the cast expression: the ! cannot follow an expression, if X were an expression instead. But it should be apparent that this is all very tricky; we plan to work on a formal proof showing that indeed the last five cases of this fixtype captures all and only the ambiguous expressions that start with (X) where X is an ambiguous type name or expression. Also see how parser handles possibly ambiguous cast expressions.

As a GCC extension, we allow the omission of the `then' sub-expression of a conditional expression. See the ABNF grammar.

As a GCC extension, we include statement expressions, i.e. expressions consisting of compound statements. The :stmt case of this fixtype includes the block items that comprise the compound statement.

As a GCC extension, we include calls of the built-in function __builtin_types_compatible_p. This is not a regular function, because its arguments are types names, not expressions.

As a GCC extension, we include calls of the built-in function __builtin_offsetof. This is not a regular function, because its first argument is a type name, not an expression. The second argument is a member designator, which is a restricted form of expression.

As a GCC extension, we include calls of the built-in function __builtin_va_arg. This is not a regular function, because its second argument is a type name, not an expression. The first argument is an expression.

As a GCC extesntion, we include expressions preceded by __extension__. See our ABNF grammar.

Subtopics

Exprp: Recognizer for expr structures.
Expr-case: Case macro for the different kinds of expr structures.
Expr-binary
Expr-kind: Get the kind (tag) of a expr structure.
Expr-unary
Expr-equiv: Basic equivalence relation for expr structures.
Expr-cond
Expr-complit
Expr-cast/sub-ambig
Expr-cast/mul-ambig
Expr-cast/call-ambig
Expr-cast/and-ambig
Expr-cast/add-ambig
Expr-va-arg
Expr-tycompat
Expr-offsetof
Expr-memberp
Expr-member
Expr-ident
Expr-gensel
Expr-funcall
Expr-comma
Expr-cast
Expr-arrsub
Expr-alignof
Expr-string
Expr-stmt
Expr-sizeof-ambig
Expr-sizeof
Expr-paren
Expr-extension
Expr-const
Expr-fix: Fixing function for expr structures.
Expr-count: Measure for recurring over expr structures.

Expr

Member Tags → Types

Subtopics