Disambiguator
Disambiguator of the C abstract syntax for tools.
As discussed in the abstract syntax, the syntax of C is inherently syntactically ambiguous,
in ways that can be disambiguated only via a (static) semantic analysis.
Here we define such semantic analysis, which we call `disambiguator'.
The disambiguator does not check the full static validity of the code,
but it performs sufficient checks to disambiguate
the abstract syntax trees generated by the parser.
The disambiguator transforms the abstract syntax trees,
generating unambiguous ones.
The abstract syntax trees generated by the parser
contains the following ambiguities,
which the disambiguator resolves as indicated:
- Identifiers used as expressions
are always classified as identifier expressions
(i.e. the :ident case of expr)
by the parser,
but some of them may be enumeration constants.
The disambiguator re-classifies the latter as required.
- Some constructs include ambiguous expressions or type names,
represented by the type amb-expr/tyname.
The disambiguator turns those constructs into unambiguous ones,
by choosing either an expression or a type name.
- Some constructs include ambiguous declarators or abstract declarators,
represented by the type amb-declor/absdeclor.
The disambiguator turns those constructs into unambiguous ones,
by choosing either a declarator or an abstract declarator.
- Some constructs include ambiguous declarations or statements,
represented by the type amb-decl/stmt.
The disambiguator turns those constructs into unambiguous ones,
by choosing either a declaration or a statement.
- Function declarators whose parameters are all identifiers
are always classified as parameter type lists by the parser,
but some of them may be identifier lists.
The disambiguator re-classifies the latter as required.
- Some expressions may be cast expressions or binary expressions,
represented by the :cast/...-ambig cases of expr.
The disambiguator turns these into
unambiguous case or binary expressions.
- The initializing part of a for loop may be ambiguously
a declaration or an expression followed by a semicolon,
represented by the :for-ambig case of stmt.
The disambiguator turns these ambiguous for loops
into unambiguous ones.
The disambiguator does not perform a full (static) semantic analysis,
but only a light-weight one, enough for disambiguation.
If the code is statically valid, the disambiguation must work.
If the code is not statically valid, the disambiguation may fail;
in some cases, the disambiguator reports the cause of invalidity,
but in other cases it may not have enough information.
The disambiguation is performed by scanning the code,
transforming it as applicable (re-classifying or choosing constructs).
This needs certain information about the identifiers in the code,
which we build and use.
For instance, when we encounter an enumeration specifier,
we need to record the information that the identifiers of the enumerators
are enumeration constants,
so that, if we later encounter an expression with that identifier,
we can re-classify from an identifier expression to an enumeration constant
(this is the first of the ambiguities listed above).
In essence, we need a symbol table of identifiers;
not a full one that would be needed to check the full validity of the code,
but one with sufficient information to disambiguate.
We need to take into account the scoping rules of C of course,
since the same identifier
may have different meaning in different scopes.
We call these symbol tables `disambiguation tables'.
We use error-value tuples to handle errors in the disambiguator.
The fixtype and functions in the implementation of the disambiguator
are prefixed by dimb, which stands for `DIsaMBiguator'.
For now we leave the GCC extensions unchanged,
i.e. we do not apply the disambiguation to them,
even though they may contain expression.
Some initial experiments reveal that we will need to
treat these GCC extensions in a more dedicated way.
For instance, the access attribute may include
the arguments read_only, write_only, and read_write.
Grammatically these are expressions,
but it seems that they have just a special meaning, like keywords,
in the context of the access attribute,
but they are likely not keywords elsewhere.
A naive treatment would attempt to resolve those arguments,
which are not in the disambiguation table.
Subtopics
- Dimb-exprs/decls/stmts
- Disambiguate expressions, declarations, statements,
and related artifacts.
- Dimb-make/adjust-expr-binary
- Build, and adjust if needed, a binary expression.
- Dimb-transunit
- Disambiguate a translation unit.
- Dimb-params-to-names
- Disambiguate a list of parameter declarations to a list of names,
if appropriate.
- Dimb-cast/call-to-call
- Disambiguate an ambiguous cast or call expression
to be a call expression.
- Dimb-make/adjust-expr-cast
- Build, and adjust if needed, a cast expression.
- Dimb-fundef
- Disambiguate a function definition.
- Dimb-make/adjust-expr-unary
- Build, and adjust if needed, a unary expression.
- Dimb-expr
- Disambiguate an expression.
- Dimb-dirdeclor
- Disambiguate a direct declarator.
- Dimb-amb-declor/absdeclor
- Disambiguate an ambiguous declarator or abstract declarator.
- Dimb-cast/call-to-cast
- Disambiguate an ambiguous cast or call expression
to be a cast expression.
- Dimb-cast/addsub-to-cast
- Disambiguate an ambiguous cast or addition/subtraction expression
to be a cast expression.
- Dimb-cast/addsub-to-addsub
- Disambiguate an ambiguous cast or addition/subtraction expression
to be an addition or subtraction expression.
- Dimb-transunit-ensemble
- Disambiguate a translation unit ensembles.
- Dimb-add-ident
- Add an identifier and its kind to the disambiguation table.
- Dimb-kind
- Fixtype of kinds of identifiers in disambiguation tables.
- Dimb-declor
- Disambiguate a declarator.
- Dimb-amb-expr/tyname
- Disambiguate an ambiguous expression or type name.
- Dimb-cast/mul-to-cast
- Disambiguate an ambiguous cast or multiplication expression
to be a cast expression.
- Dimb-cast/and-to-cast
- Disambiguate an ambiguous cast or conjunction expression
to be a cast expression.
- Dimb-cast/mul-to-mul
- Disambiguate an ambiguous cast or multiplication expression
to be a multiplication expression.
- Dimb-cast/and-to-and
- Disambiguate an ambiguous cast or conjunction expression
to be a conjunction expression.
- Dimb-kind-option
- Fixtype of optional kinds of identifiers in disambiguation tables.
- Dimb-type-spec
- Disambiguate a type specifier.
- Dimb-extdecl-list
- Disambiguate a list of external declarations.
- Dimb-extdecl
- Disambiguate an external declaration.
- Dimb-lookup-ident
- Look up an identifier in the disambiguation table.
- Dimb-paramdeclor
- Disambiguate a parameter declarator.
- Dimb-decl-spec
- Disambiguate a declaration specifier.
- Dimb-add-idents-objfun
- Add all the identifiers in a list to the disambiguation table,
with object or function kind.
- Dimb-paramdecl
- Disambiguate a parameter declaration.
- Dimb-add-ident-objfun
- Add an identifier to the disambiguation table,
with object or function kind.
- Dimb-table
- Fixtype of disambiguation tables.
- Dimb-amb-decl/stmt
- Disambiguate an ambiguous declaration or statement.
- Dimb-pop-scope
- Pop a scope from the disambiguation table.
- Dimb-push-scope
- Push a scope into the disambiguation table.
- Dimb-initdeclor
- Disambiguate an initializer declarator.
- Dimb-declor-option
- Disambiguate an optional declarator.
- Dimb-enumspec
- Disambiguate an enumeration specifier.
- Dimb-decl
- Disambiguate a declaration.
- Dimb-stmt
- Disambiguate a statement.
- Dimb-scope
- Fixtype of scopes in disambiguation tables.
- Dimb-structdeclor
- Disambiguate a structure declarator.
- Dimb-initdeclor-list
- Disambiguate a list of initializer declarators.
- Dimb-decl-spec-list
- Disambiguate a list of declaration specifiers.
- Dimb-absdeclor
- Disambiguate an abstract declarator.
- Dimb-init-table
- Initial disambiguation table.
- Dimb-dirabsdeclor
- Disambiguate a direct abstract declarator.
- Dimb-align-spec
- Disambiguate an alignment specifier.
- Dimb-strunispec
- Disambiguate a structure or union specifier.
- Irr-dimb-table
- An irrelevant disambiguation table.
- Irr-dimb-kind
- An irrelevant kind of identifiers in disambiguation tables.
- Dimb-spec/qual-list
- Disambiguate a list of specifiers and qualifiers.
- Dimb-spec/qual
- Disambiguate a specifier or qualifier.
- Dimb-paramdecl-list
- Disambiguate a list of parameter declarations.
- Dimb-label
- Disambiguate a label.
- Dimb-enumer-list
- Disambiguate a list of enumerators.
- Dimb-enumer
- Disambiguate an enumerator.
- Dimb-dirabsdeclor-option
- Disambiguate an optional direct abstract declarator.
- Dimb-block-item
- Disambiguate a block item.
- Dimb-structdeclor-list
- Disambiguate a list of structure declarators.
- Dimb-structdecl-list
- Disambiguate a list of structure declarations.
- Dimb-statassert
- Disambiguate a static assertion declaration.
- Dimb-desiniter-list
- Disambiguate a list of initializers with optional designations.
- Dimb-desiniter
- Disambiguate an initializer with optional designations.
- Dimb-decl-list
- Disambiguate a list of declarations.
- Dimb-const-expr-option
- Disambiguate an optional constant expression.
- Dimb-absdeclor-option
- Disambiguate an optional abstract declarator.
- Dimb-structdecl
- Disambiguate a structure declaration.
- Dimb-member-designor
- Disambiguate a member designator.
- Dimb-initer-option
- Disambiguate an optional initializer.
- Dimb-initer
- Disambiguate an initializer.
- Dimb-genassoc-list
- Disambiguate a list of generic associations.
- Dimb-genassoc
- Disambiguate a generic association.
- Dimb-expr-option
- Disambiguate an optional expression.
- Dimb-expr-list
- Disambiguate a list of expressions.
- Dimb-designor-list
- Disambiguate a list of designators.
- Dimb-designor
- Disambiguate a designator.
- Dimb-const-expr
- Disambiguate a constant expression.
- Dimb-block-item-list
- Disambiguate a list of block items.
- Dimb-tyname
- Disambiguate a type name.