A syntax of C for use by tools.
We provide an abstract syntax of C for use by tools that manipulate C code, e.g. C code generators. This abstract syntax preserves (i.e. does not abstract away) much of the information in the concrete syntax, in order to afford more control on the form of the code produced by those tools.
Currently this abstract syntax covers all of C after preprocessing,
but we plan to extend it to include
at least some forms of preprocessing constructs and comments,
to afford even more control on the code produced by tools.
Supporting all possible forms of preprocessing constructs and comments
would be challenging in an abstract syntax,
because preprocessing in C operates at the lexical level.
Nonetheless, certain constructs are relatively simple
(such as
We include some constructs for GCC extensions, which, as mentioned in the top-level topic of our library for C, are prevalent and important extensions needed for a practical tool. Ideally, eventually we should support all the GCC extensions, but we are adding them piece-wise, as needed. Our documentation always distinguishes between the C standard and the GCC extensions.
The idea of this tool-oriented abstract syntax is also discussed in c::abstract-syntax and c::atc-abstract-syntax. We plan to have ATC use this new tool-oriented abstract syntax.
We provide a macro tool defpred to concisely define predicates over the abstract syntax. This should be fairly easy to generalize to a more general tool for fixtypes.
Accompanying our abstract syntax, we provide a concrete syntax, based on an ABNF grammar. This is not a different syntax for C, but just a different formulation of the syntax of C, motivated by the fact that we want this tool-oriented syntax to be neither before nor after preprocessing, but to incorporate constructs from both forms of the C code; the grammar in [C] is organized differently, with preprocessing being a distinguished translation phase [C:5.1.1.2].
We have started defining an abstraction mapping from the concrete to the abstract syntax. This is still very much work in progress, but its main purpose is for specification and verification, and is not needed to run the tools described below (parser, printer, etc.).
We provide a parser that produces abstract syntax, which covers all of the C constructs after preprocessing. The syntax of C is notoriously ambiguous, requiring some semantic analysis to disambiguate it. Instead of performing this semantic analysis during parsing, our parser captures ambiguous constructs as such, and we provide a separate disambiguator that transforms the abstract syntax, after parsing, by disambiguating it via the necessary semantic analysis.
In order to process typical C code, we provide an ACL2 tool to invoke a C preprocessor. The tool can be run on headers and source files, to obtain preprocessed source files, which can be then parsed by our parser.
We provide a validator on the abstract syntax (after disambiguation) that checks the static constraints on C code (i.e. type checking etc.), which results in an elaboration of the abstract syntax, e.g. enhancing the abstract syntax with types and other information after successful validation.
We provide a (pretty-)printer that turns our abstract syntax into concrete syntax that is valid C code. Like the parser and the abstract syntax, our printer covers all the C constructs after preprocessing. This printer is an initial version; we plan to improve it in various respects, in particular by supporting printing options (e.g. for the right margin position).
We provide a collection of predicates that characterize unambiguous abstract syntax, i.e. abstract syntax without ambiguous constructs, as resulting after disambiguation.
We provide a collection of predicates that characterize annotated abstract syntax, i.e. abstract syntax enhanced with the information added by the validator.
We provide various operations on the abstract syntax.
We provide event macros input-files and output-files to read, preprocess, parse, disambiguate, print, and write files.
We plan to prove theorems connecting this tool-oriented syntax with the formal language definition in c::language. We already provide a (partial) mapping from the tool-oriented abstract syntax to the abstract syntax of the formal language definition. We also provide predicates to identify which subset of the abstract syntax not only maps to the language definition's abstract syntax, but is also covered by the formal semantics we have so far.
All the items described above form a sub-library of our ACL2 library for C,
in the directory