Abstract-syntax
An abstract syntax of C for use by tools.
See syntax-for-tools for background.
We define fixtypes for constructs that closely correspond to
the grammar in [C], which is summarized in [C:A].
For now we cover all the constructs after preprocessing,
but we plan to add some preprocessing constructs.
We also include certain GCC extensions,
as mentioned in syntax-for-tools.
According to the rationale explained in syntax-for-tools,
here we capture much of the information from the concrete syntax,
e.g. the distinction between
the 0x and 0X hexadecimal prefixes.
We try and pick short yet clear names for these fixtypes,
so that code that manipulates these fixtypes
can be a bit more more compact.
While the grammar in [C]
uses the `hexadecimal' and `binary' qualifications
for certain hexadecimal and binary entities
but uses no qualifications for certain decimal entities,
our fixtype names are more symmetric,
using dec and hex and bin qualifiers
for certain ``parallel'' entities.
Some library fixtypes already correspond to
certain nonterminals in the grammar in [C]
and are thus not defined here, but just used.
Examples are fixtypes for digits (in different bases), and lists thereof.
The ...-list fixtypes are slightly more general than
the ...-sequence and ...-list nonterminals
in the grammar in [C],
because the former include the empty list,
while the latter only include non-empty sequences/lists.
Including empty lists in our fixtypes makes things a bit simpler,
partly because currently fty::deflist
generates conflicting theorems if used
both with :non-emptyp t and with (default) :non-emptyp nil
(although this could be remedied).
It is fine (and common) for the abstract syntax
to be more general than the concrete syntax.
Restrictions on well-formed code can be formulated
via separate predicates on the abstract syntax.
The grammar in [C] does not capture many static constraints anyhow.
The use of natural numbers to represent c-char and s-char
in character constants and string literals is motivated by
the fact that we commit to Unicode characters,
even though [C] prescribes no specific source character set [C:5.2.1].
These days, Unicode should be sufficiently general;
note that ASCII is a subset of Unicode.
Thus, the natural numbers represent Unicode code points,
which include ASCII codes as a subset.
Although natural numbers are more general that Unicode code points,
and also more general than c-char and s-char,
it is fine for abstract syntax to be more general than concrete syntax.
The syntax of C has some known ambiguities,
which cannot be disambiguated purely syntactically,
but need some (static) semantic analysis.
Some of the fixtypes of our abstract syntax
capture these ambiguous constructions,
as described when those fixtypes are introduced.
Here are some resources on the topic:
Unlike some approaches suggested in the above resources,
we prefer to defer the disambiguation of these constructs after parsing,
so that parsing is purely syntactical,
without the need for any semantic analysis during parsing.
Our abstract syntax also includes some information
that is initially absent (after parsing and disambiguation) and that is calculated by validation. This information is stored in the abstract syntax for easy access,
e.g. access by tools to transform the abstract syntax.
This additional information can be also used, in the future,
for other purposes than storing results from validation.
This information in our fixtypes is untyped,
which, in ACL2, can be regarded as analogous to
using polymorphic types for the abstract syntax,
parameterized over the type of the additional information.
Subtopics
- Expr
- Fixtype of expressions [C:6.5] [C:A.2.1].
- Exprs/decls/stmts
- Fixtypes of expressions, declarations, statements,
and related entities
[C:6.5] [C:6.6] [C:6.7] [C:6.8] [C:A.2.1] [C:A.2.2] [C:A.2.3].
- Type-spec
- Fixtype of type specifiers [C:6.7.3] [C:A.2.2].
- Binop
- Fixtype of binary operators
[C:6.5.5-14] [C:6.5.16] [C:A.2.1].
- Stmt
- Fixtype of statements [C:6.8] [C:A.2.3].
- Amb-expr/tyname
- Fixtype of ambiguous expressions or type names.
- Dirdeclor
- Fixtype of direct declarators [C:6.7.6] [C:A.2.2].
- Unop
- Fixtype of unary operators
[C:6.5.3] [C:6.5.2] [C:A.2.1].
- Asm-stmt
- Fixtype of assembler statements.
- Dec/oct/hex-const
- Fixtype of decimal, octal, and hexadecimal constants
[C:6.4.4.1] [C:A.1.5].
- Simple-escape
- Fixtype of simple escape sequences [C:6.4.4.4] [C:A.1.5].
- Attrib
- Fixtype of GCC attributes.
- Amb-decl/stmt
- Fixtype of ambiguous declarations or statements.
- Dirabsdeclor
- Fixtype of direct abstract declarators [C:6.7.7] [C:A.2.2].
- Ident
- Fixtype of identifiers [C:6.4.2] [C:A.1.3].
- Decl-spec
- Fixtype of declaration specifiers [C:6.7] [C:A.2.2].
- Structdecl
- Fixtype of structure declarations [C:6.7.2.1] [C:A.2.2].
- Fundef
- Fixtype of function definitions [C:6.9.1] [C:A.2.4].
- Asm-name-spec
- Fixtype of GCC assembler name specifiers.
- Amb-declor/absdeclor
- Fixtype of ambiguous declarators or abstract declarators.
- Transunit-ensemble
- Fixtype of ensembles of translation units.
- Paramdeclor
- Fixtype of parameter declarators [C:6.7.6] [C:A.2.2].
- Declor
- Fixtype of declarators [C:6.7.6] [C:A.2.2].
- Type-qual
- Fixtype of type qualifiers [C:6.7.3] [C:A.2.2].
- Absdeclor
- Fixtype of abstract declarators [C:6.7.7] [C:A.2.2].
- Stor-spec
- Fixtype of storage class specifiers [C:6.7.1] [C:A.2.2].
- Keyword-uscores
- Fixtype of keyword underscores.
- Strunispec
- Fixtype of structure or union specifiers [C:6.7.2.1] [C:A.2.2].
- Align-spec
- Fixtype of alignment specifiers [C:6.7.5] [C:A.2.2].
- Lsuffix
- Fixtype of length suffixes [C:6.4.4.1] [C:A.1.5].
- Label
- Fixtype of labels [C:6.8.1] [C:A.2.3].
- Expr-option
- Fixtype of optional expressions.
- Spec/qual
- Fixtype of specifiers and qualifiers
[C:6.7.2.1] [C:A.2.2].
- Hex-frac-const
- Fixtype of hexadecimal fractional constants [C:6.4.4.2] [C:A.1.5].
- Dec-frac-const
- Fixtype of decimal fractional constants [C:6.4.4.2] [C:A.1.5].
- Dec-core-fconst
- Fixtype of decimal core floating constants [C:6.4.4.2] [C:A.1.5].
- Tyname
- Fixtype of type names [C:6.7.7] [C:A.2.2].
- Dirabsdeclor-option
- Fixtype of optional direct abstract declarators.
- Desiniter
- Fixtype of initializers with optional designations
[C:6.7.9] [C:A.2.2].
- Extdecl
- Fixtype of external declarations [C:6.9] [C:A.2.4].
- Amb?-declor/absdeclor
- Fixtype of possibly ambiguous declarators or abstract declarators.
- Isuffix
- Fixtype of integer suffixes [C:6.4.4.1] [C:A.1.5].
- Attrib-spec
- Fixtype of GCC attribute specifiers.
- Initdeclor
- Fixtype of initializer declarators [C:6.7] [C:A.2.2].
- Hex-core-fconst
- Fixtype of hexadecimal core floating constants [C:6.4.4.2] [C:A.1.5].
- Escape
- Fixtype of escape sequences [C:6.4.4.4] [C:A.1.5].
- Decl
- Fixtype of declarations [C:6.7] [C:A.2.2].
- Const-expr-option
- Fixtype of optional constant expressions.
- Amb?-expr/tyname
- Fixtype of possibly ambiguous expressions or type names.
- Structdeclor
- Fixtype of structure declarators [C:6.7.2.1] [C:A.2.2].
- Bin-expo
- Fixtype of binary exponents [C:6.4.4.2] [C:A.1.5].
- Statassert
- Fixtype of static assertion declarations [C:6.7.10] [C:A.2.2].
- Ident-option
- Fixtype of optional identifiers.
- Fsuffix
- Fixtype of floating suffixes [C:6.4.4.2] [C:A.1.5].
- Enumspec
- Fixtype of enumeration specifiers [C:6.7.2.2] [C:A.2.2].
- Absdeclor-option
- Fixtype of optional abstract declarators [C:6.7.7] [C:A.2.2].
- Paramdecl
- Fixtype of parameter declarations [C:6.7.6] [C:A.2.2].
- Member-designor
- Fixtype of member designators.
- Initer-option
- Fixtype of optional initializers.
- Initer
- Fixtype of initializers [C:6.7.9] [C:A.2.2].
- Declor-option
- Fixtype of optional declarators.
- Const
- Fixtype of constants [C:6.4.4] [C:A.1.5].
- Block-item
- Fixtype of block items [C:6.8.2] [C:A.2.3].
- Oct-escape
- Fixtype of octal escape sequences [C:6.4.4.4] [C:A.1.5].
- Inc/dec-op
- Fixtype of increment and decrement operators
[C:6.5.3] [C:6.5.2] [C:A.2.1].
- Iconst
- Fixtype of integer constants [C:6.4.4.1] [C:A.1.5].
- Hex-quad
- Fixtype of quadruples of hexadecimal digits [C:6.4.3] [C:A.1.4].
- Fun-spec
- Fixtype of function specifiers [C:6.7.4] [C:A.2.2].
- Eprefix
- Fixtype of encoding prefixes [C:6.4.5] [C:A.1.6].
- Dec-expo
- Fixtype of decimal exponents [C:6.4.4.2] [C:A.1.5].
- Cprefix-option
- Fixtype of optional prefixes of character constants.
- Const-expr
- Fixtype of constant expressions [C:6.6] [C:A.2.1].
- Asm-name-spec-option
- Fixtype of optional assembler name specifiers.
- Amb?-decl/stmt
- Fixtype of possibly ambiguous declarations or statements.
- S-char
- Fixtype of characters and escape sequences
usable in string literals [C:6.4.5] [C:A.1.6].
- Isuffix-option
- Fixtype of optional integer suffixes.
- Fundef-option
- Fixtype of optional function definitions.
- Fsuffix-option
- Fixtype of optional floating suffixes.
- Fconst
- Fixtype of floating constants [C:6.4.4.2] [C:A.1.5].
- Eprefix-option
- Fixtype of optional encoding prefixes.
- Dec-expo-option
- Fixtype of optional decimal exponents.
- Cprefix
- Fixtype of prefixes of character constants [C:6.4.4.4] [C:A.1.5].
- C-char
- Fixtype of characters and escape sequences
usable in character constants [C:6.4.4.4] [C:A.1.5].
- Sign-option
- Fixtype of optional signs.
- Iconst-option
- Fixtype of optional integer constants.
- Genassoc
- Fixtype of generic associations [C:6.5.1.1] [C:A.2.1].
- Asm-output
- Fixtype of assembler output operands.
- Asm-input
- Fixtype of assembler input operands.
- Const-option
- Fixtype of optional constants.
- Usuffix
- Fixtype of unsigned suffixes [C:6.4.41] [C:A.1.5].
- Univ-char-name
- Fixtype of universal character names [C:6.4.3] [C:A.1.4].
- Designor
- Fixtype of designators [C:6.7.9] [C:A.2.2].
- Attrib-name
- Fixtype of attribute names.
- Sign
- Fixtype of signs [C:6.4.4.2] [C:A.1.5].
- Hprefix
- Fixtype of hexadecimal prefixes [C:6.4.4.1] [C:A.1.5].
- Asm-qual
- Fixtype of assembler qualifiers.
- Typequal/attribspec
- Fixtype of type qualifiers and attribute specifiers.
- Enumer
- Fixtype of enumerators [C:6.7.2.2] [C:A.2.2].
- Dec-expo-prefix
- Fixtype of decimal exponent prefixes [C:6.4.4.2] [C:A.1.5].
- Bin-expo-prefix
- Fixtype of binary exponent prefixes [C:6.4.4.2] [C:A.1.5].
- Stringlit
- Fixtype of string literals [C:6.4.5] [C:A.1.6].
- Transunit
- Fixtype of translation units [C:6.9] [C:A.2.4].
- Cconst
- Fixtype of character constants [C:6.4.4.4] [C:A.1.5].
- Expr/tyname
- Fixtype of expressions or type names.
- Decl/stmt
- Fixtype of declarations or (expression) statements.
- Declor/absdeclor
- Fixtype of declarators or abstract declarators.
- Asm-clobber
- Fixtype of assembler clobbers.
- Typequal/attribspec-list
- Fixtype of lists of type qualifiers and attribute specifiers.
- Stringlit-list
- Fixtype of lists of string literals.
- Inc/dec-op-list
- Fixtype of lists of increment and decrement operators.
- Decl-spec-list
- Fixtype of lists of declaration specifiers.
- Spec/qual-list
- Fixtype of lists of specifiers and qualifiers.
- S-char-list
- Fixtype of lists of characters and escape sequences
usable in string literals [C:6.4.5] [C:A.1.6].
- Decl-list
- Fixtype of lists of declarations.
- C-char-list
- Fixtype of lists of characters and escape sequences
usable in character constants [C:6.4.4.4] [C:A.1.5].
- Typequal/attribspec-list-list
- Fixtype of lists of lists of
type qualifiers and attribute specifiers.
- Structdeclor-list
- Fixtype of lists of structure declarators.
- Structdecl-list
- Fixtype of lists of structure declarations.
- Paramdecl-list
- Fixtype of lists of parameter declarations.
- Initdeclor-list
- Fixtype of lists of initializer declarators.
- Genassoc-list
- Fixtype of lists of generic associations.
- Filepath-transunit-map
- Fixtype of omaps from file paths to translation units.
- Extdecl-list
- Fixtype of lists of external declarations.
- Desiniter-list
- Fixtype of lists of initializers with optional designations.
- Enumer-list
- Fixtype of lists of enumerators.
- Block-item-list
- Fixtype of lists of block items.
- Attrib-spec-list
- Fixtype of lists of GCC attribute specifiers.
- Type-spec-list
- Fixtype of lists of type specifiers.
- Ident-list
- Fixtype of lists of identifiers.
- Expr-list
- Fixtype of lists of expressions.
- Asm-qual-list
- Fixtype of lists of assembler qualifiers.
- Asm-output-list
- Fixtype of lists of assembler output operands.
- Asm-input-list
- Fixtype of lists of assembler input operands.
- Asm-clobber-list
- Fixtype of lists of assembler clobbers.
- Stor-spec-list
- Fixtype of lists of storage class specifiers.
- Ident-set
- Fixtype of sets of identifiers.
- Ident-option-set
- Fixtype of sets of optional identifiers.
- Eprefix-option-list
- Fixtype of lists of optional encoding prefixes.
- Designor-list
- Fixtype of lists of designators.
- Attrib-list
- Fixtype of lists of GCC attributes.