Atc
ATC (ACL2 To C),
a proof-generating C code generator for ACL2.
Introduction
This manual page contains user-level reference documentation for ATC.
Users who are new to ATC should start with the tutorial, which provides user-level pedagogical information
on how ATC works and how to use ATC effectively.
This manual page refers to the official C standard
in the manner explained in the top-level XDOC topic of our C library.
General Form
(atc t1 ... tp
:output-dir ... ; default "."
:file-name ... ; no default
:header ... ; default nil
:pretty-printing ... ; default nil
:proofs ... ; default t
:const-name ... ; default :auto
:print ... ; default :result
)
Inputs
t1, ..., tp
Names of ACL2 targets to translate to C.
Each ti must be a symbol that satisfies
exactly one of the following conditions:
- It is the symbol of an ACL2 function.
- Its symbol-name is the name of
a C structure type shallowly embedded in ACL2,
introduced via defstruct.
- Its symbol-name is the name of
a C external object shallowly embedded in ACL2,
introduced via defobject.
It is an error if two or more of the above conditions hold,
because in that case the target does not unambiguously identify
a function or a structure type or an external object.
The symbol-names of the targets must be all distinct.
Even though C has different name spaces
for (i) structure types and (ii) functions and external objects
[C:6.2.3],
in C code generated by ATC these are treated
as if they were in the same name space,
so they must all have distinct names.
There must be at least one target function.
Each recursive target function must be called by
at least one recursive or non-recursive target function
that follows it in the list (t1 ... tp).
Each target function must satisfy the conditions discussed in
the section `Representation of C Code in ACL2'.
:output-dir — default "."
Path of the directory where the generated C code goes.
This must be an ACL2 string that is
a valid path to an existing directory in the file system;
the path may be absolute,
or relative to the connected book directory. The default is the connected book directory.
:file-name — no default
Name of the files that contain the generated C code,
without the .h or .c extension.
This must be a non-empty ACL2 string that consists of
ASCII letters, digits, underscores, and dashes.
The full names of the generated files are obtained
by appending the extension .h or .c to this string.
The files are generated
in the directory specified by :output-dir.
The files may or may not exist:
if they do not exist, they are created;
if they exist, they are overwritten.
:header — default nil
Specifies whether a header (i.e. a .h file)
should be generated or not.
This must be one of the following:
- t, to generate a header.
- nil, to not generate a header.
A source file (i.e. a .c file is always generated;
the :header input only affects the generation of the header.
If ATC is used to generate C code that is not standalone
but is meant to be called by external C code,
the :header input should be t,
so that the external C code can include the header.
If ATC is used to generate standalone C code,
presumably including a function called main with appropriate types,
then :header input should be nil.
:pretty-printing — default nil
Specifies options for how the generated C code is pretty-printed.
This must be a keyword-value list (opt-name1 opt-value1 opt-name2 opt-value2 ...)
where each opt-namek is a keyword among the ones described below,
and each opt-valuek is one of the allowed values
for the corresponding keyword as described below.
The following pretty-printing options are supported:
This is currently the only supported pretty-printing option.
More options,
to control additional aspects of the pretty-print of the C code,
may be added in the future.
:proofs — default t
Specifies whether proofs should be generated or not.
This must be one of the following:
- t, to generate proofs.
- nil, to not generate proofs.
While it is obviously recommended to generate proofs,
setting this to nil may be useful
in case proof generation is (temporarily) broken.
:const-name — default :auto
Name of the generated ACL2 named constant
that holds the abstract syntax tree of the generated C program.
This must be one of the following:
- :auto, to use the symbol *program*
in the "C" package.
- Any other symbol, to use as the name of the constant.
In the rest of this documentation page,
let *program* be the symbol specified by this input,
if applicable (i.e. when :proofs is t).
:print — default :result
Specifies what is printed on the screen.
It must be one of the following:
- :error, to print only error output (if any).
- :result, to print, besides any error output,
also the results of ATC. This is the default value of the :print input.
- :info, to print,
besides any error output and the results,
also some additional information about
the internal operation of ATC.
- :all, to print,
besides any error output,
the results,
and the additional information,
also ACL2's output in response to all the submitted events.
The errors are printed as error output. The results and the additional information are printed as comment output. The ACL2 output enabled by :print :all may consist of output of various kinds.
If :print is :error or :result or :info,
ATC suppresses all kinds of outputs (via with-output)
except for error and comment output.
Otherwise, ATC does not suppress any output.
However, the actual output depends on
which outputs are enabled or not prior to the call of ATC,
including any with-output that may wrap the call of ATC.
Representation of C Code in ACL2
Currently ATC supports the ACL2 representation of
a single source file (i.e. a file with extension .c),
optionally accompanied by a header (i.e. a file with extension .h),
based on the :header input.
If :header is nil,
the source file consists of
one or more C function definitions,
zero or more C external object declarations,
and zero or more C structure type declarations.
If :header is t,
the header consists of
one or more function declarations,
zero or more C external object declarations without initializers,
and zero or more C structure type declarations,
while the source file consists of
one or more function definitions
and zero or more C external object declarations with initializers,
corresponding to
the function declarations and the external object declarations
in the header.
Each C structure type declaration is represented by a defstruct,
whose name is passed as one of the targets ti to ATC.
The symbol name, which is a portable ASCII C identifier (this is enforced by defstruct),
represents the tag of the C structure type.
Each C external object declaration is represented by a defobject,
whose name is passed as one of the targets ti to ATC.
The symbol name, which is a portable ASCII C identifier (this is enforced by defobject),
represents the name of the C external object.
Each C function definition (and declaration in the header, if present)
is represented by an ACL2 function definition.
These are the non-recursive target ACL2 functions ti passed to ATC;
the recursive target ACL2 functions ti passed as inputs
represent loops in the C functions instead, as explained below.
The order of the C structure types and external objects and functions
in the files
is the same as the order of the corresponding targets
in the list (t1 ... tp) passed to ATC.
Each target function fn must be in logic mode and guard-verified.
The function must not occur in its own guard,
which is rare but allowed in ACL2.
The symbol name of each non-recursive function target fn
must be a portable ASCII C identifier. That symbol name is used as the name of the corresponding C function.
Therefore, the non-recursive target functions
must have all distinct symbol names;
even if they are in different packages,
they must have distinct symbol names
(the package names are ignored).
There is no restriction on
the symbol names of the recursive target functions:
these represent C loops, not C functions;
the names of the recursive target functions
are not represented at all in the C code.
The symbol name of each formal parameter of each function target fn,
both non-recursive and recursive,
must be a portable ASCII C identifier. When fn is non-recursive,
the symbol names of its parameters are used as
the names of the formal parameters of the corresponding C function,
in the same order,
except for the formal parameters that represent external objects,
as defined below.
Therefore, the formal parameters of each fn
must have all distinct symbol names;
even if they are in different packages,
they must have distinct symbol names
(the package names are ignored).
When fn is recursive,
the symbol names of its parameters are used as names of C variables,
as explained below.
When fn is recursive,
it must have at least one formal parameter.
When fn is non-recursive,
it may have any number of formal parameters, including none.
If fn is recursive,
it must be singly (not mutually) recursive,
its well-founded relation must be o<,
and its measure must yield a natural number.
The latter requirement is checked via an applicability condition,
as described in the `Applicability Conditions' section.
The guard of each fn must include,
for every formal parameter x,
exactly one conjunct of one of the following forms,
possibly wrapped with mbt,
which determines the C type of
the corresponding parameter of the C function,
or designates the formal parameter as representing
an access to an object defined by defobject:
- (scharp x), representing signed char.
- (ucharp x), representing unsigned char.
- (sshortp x), representing signed short.
- (ushortp x), representing unsigned short.
- (sintp x), representing signed int.
- (uintp x), representing unsigned int.
- (slongp x), representing signed long.
- (ulongp x), representing unsigned long.
- (sllongp x), representing signed long long.
- (ullongp x), representing unsigned long long.
- (star (scharp x)), representing signed char *.
- (star (ucharp x)), representing unsigned char *.
- (star (sshortp x)), representing signed short *.
- (star (ushortp x)), representing unsigned short *.
- (star (sintp x)), representing signed int *.
- (star (uintp x)), representing unsigned int *.
- (star (slongp x)), representing signed long *.
- (star (ulongp x)), representing unsigned long *.
- (star (sllongp x)), representing signed long long *.
- (star (ullongp x)), representing unsigned long long *.
- (schar-arrayp x), representing signed char [].
- (uchar-arrayp x), representing unsigned char [].
- (sshort-arrayp x), representing signed short [].
- (ushort-arrayp x), representing unsigned short [].
- (sint-arrayp x), representing signed int [].
- (uint-arrayp x), representing unsigned int [].
- (slong-arrayp x), representing signed long [].
- (ulong-arrayp x), representing unsigned long [].
- (sllong-arrayp x), representing signed long long [].
- (ullong-arrayp x), representing unsigned long long [].
- (struct-<tag>-p x),
where <tag> is one of the defstruct targets ti,
representing the corresponding C structure type,
struct <tag>.
The structure type must not have a flexible array member.
The <tag> target must precede fn
in the list of targets (t1 ... tp).
- (star (struct-<tag>-p x)),
where <tag> is one of the defstruct targets ti,
representing a pointer type to the corresponding C structure type,
struct <tag> *.
The <tag> target must precede fn
in the list of targets (t1 ... tp).
- (object-<name>-p x),
where <name> is one of the defobject targets ti,
representing an access to the external object,
which must be an explicit formal parameter in functional ACL2,
while the C function accesses it directly.
The <name> target must precede fn
in the list of targets (t1 ... tp).
The conjuncts may be at any level of nesting,
but must be extractable by flattening
the and structure of the guard term:
only conjuncts extractable this way count
for the purpose of determining
the C types of the formal parameters
and which formal parameters represent accesses to external objects.
The rest of the guard (i.e. other than the conjuncts above)
is not explicitly represented in the C code.
Note the distinction between pointer types <integer type> *
and array types <integer type> [] in the list above.
Even though these types are equivalent for function parameters,
and in fact array types are adjusted to pointer types
for function parameters according to [C],
pointed-to integers and integer arrays are different
in the ACL2 representation of C.
(More in general, even in handwritten C code,
using array types instead of pointer types for function parameters
is useful to convey the intention that something is
a pointer to (the first or some) element of an integer array
as opposed to a pointer to a single integer.)
The aforementioned option of wrapping the conjuncts with mbt
is useful for cases in which the function have stronger guards
that imply those conjuncts.
It would be inelegant to add redundant conjuncts to the guard
(presumably via external transformations)
for the sole purpose of communicating function parameter types to ATC.
By allowing mbt,
the redundancy can be explicated and proved
(as part of guard verification),
while making it easy for ATC to obtain the types.
The return type of
the C function corresponding to each non-recursive target function fn
is automatically determined from the body, as explained below.
The unnormalized body of each fn must be as follows:
- If fn is non-recursive, the unnormalized body must be
a statement term for fn with loop flag nil
returning type T and affecting variables vars
(as defined below),
where each variable in vars
is a formal parameter of fn with pointer or array type
and where T is not void if vars is nil.
(In general, vars is a subset of
the formal parameters of fn with pointer or array type;
that is, not all such formal parameters are necessarily affected.)
The return type of the C function represented by fn is T.
- If fn is recursive, the unnormalized body must be
a loop term for fn affecting variables vars
(as defined below),
where each variable in vars
is a formal parameter of fn.
The above-mentioned notions of
(i) statement term for fn with loop flag L
returning T and affecting vars and
(ii) loop term for fn affecting vars
are defined below, along with the notions of
(iii) expression term for fn
returning T and affecting vars,
(iv) pure expression term for fn returning T,
(v) expression term for fn returning boolean,
(vi) C type of a variable, and
(vii) assignable variable.
A statement term for fn with loop flag L
returning T and affecting vars,
where fn is a target function,
L is either t or nil,
T is a C type (possibly void),
and vars is a list of distinct symbols,
is inductively defined as one of the following:
- An expression term for fn
returning T and affecting vars,
when L is nil,
T is a non-void non-pointer non-array C type,
and vars is nil.
That is, an expression term returning a C value is also
a statement term returning that C value.
This represents a C return statement
whose expression is represented by the same term,
viewed as an expression term.
- A term (mv ret var1 ... varn),
when ret is an expression term for fn
returning T and affecting no variables,
L is nil,
T is a non-void non-pointer non-array type,
and vars is the list (var1 ... varn) with n ≥ 1.
This represents a C return statement
whose expression is represented by ret;
the mv and the variables in vars represent no actual C code:
they just represent variables that may have been modified
by preceding code.
- A term var,
when L is nil,
T is void,
and vars is the singleton list (var).
This represents no actual C code:
it just serves to conclude
preceding code that may modify var.
- A term (mv var1 ... varn),
when L is nil,
T is void,
and vars is the list (var1 ... varn) with n > 1.
This represents no actual C code:
it just serves to conclude
preceding code that may modify var1, ..., varn.
- A call of fn on variables identical to its formal parameters,
when the C types of the variables are
the same as the C types of the formal parameters,
L is t,
T is void,
and fn is recursive.
This represents no actual C code:
it just serves to conclude
the computation in the body of the loop represented by fn.
- A call of if on
(i) a test of the form (mbt ...),
(ii) a `then' branch that is
a statement term for fn with loop flag L
returning T and affecting vars, and
(iii) an `else' branch that may be any ACL2 term.
This represents the same C code represented by the `then' branch.
Both the test and the `else' branch are ignored,
because ATC generates C code under guard assumptions.
- A call of if on
(i) a test that is an expression term for fn returning boolean and
(ii) branches that are statement terms for fn with loop flag L
returning T and affecting vars.
This represents a C if conditional statement
whose test expression is represented by the test term
and whose branch blocks are represented by the branch terms.
- A term (let ((var (declar term))) body),
when the symbol name of var is a portable ASCII C identifier, the symbol name of var is distinct from
the symbol names of all the other ACL2 variables in scope,
term is an expression term for fn
returning a non-void non-pointer non-array C type
and affecting no variables, and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents, as indicated by the wrapper declar,
a declaration of a C local variable represented by var,
initialized with the C expression represented by term,
followed by the C code represented by body.
The C type of the variable is determined from the initializer.
- A term (let ((var (assign term))) body),
when var is assignable,
term is an expression term for fn
returning the same non-void non-pointer non-array C type
as the C type of var
and affecting no variables, and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents, as indicated by the wrapper assign,
an assignment to
the C local variable or function parameter represented by var,
with the C expression represented by term as right-hand side,
followed by the C code represented by body.
- A term
(let ((var (<type>-write term))) body),
when <type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
var is in scope,
var has a pointer type whose referenced type is
the C integer type corresponding to <type>,
var is one of the symbols in vars,
term is a pure expression term for fn
returning the C integer type corresponding to <type>,
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents a C assignment to
the pointer variable represented by var
with the value of the expression represented by term;
the wrapper <type>-write signifies
the writing to an integer by pointer. - A term
(let ((var (<type>-array-write var term1 term2))) body),
when <type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
var is in scope,
var has an array type whose element type is
the C integer type corresponding to <type>,
var is one of the symbols in vars,
term1 is a pure expression term for fn
returning a C integer type,
term2 is a pure expression term for fn
returning the C integer type corresponding to <type>,
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents a C assignment to
an element of the array represented by var
with the subscript expression represented by term1
with the new element expression represented by term2,
followed by the C code represented by body. - A term
(let ((var (struct-<tag>-write-<member> term var))) body),
when <tag> is a defstruct name,
<member> is the name of
one of the members of that defstruct,
<member> has an integer type in the defstruct,
var is assignable,
var has the C structure type represented by <tag>,
term is a pure expression term for fn
returning the C integer type of <member>,
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents a C assignment to
a member of the structure represented by var
by value (i.e. using .)
with the new value expression represented by term,
followed by the C code represented by body.
- A term
(let ((var (struct-<tag>-write-<member> term var))) body),
when <tag> is a defstruct name,
<member> is the name of
one of the members of that defstruct,
<member> has an integer type in the defstruct,
var is in scope,
var has a pointer type whose referenced type is
the C structure type represented by <tag>,
var is one of the symbols in vars,
term is a pure expression term for fn
returning the C integer type of <member>,
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents a C assignment to
a member of the structure represented by var
by pointer (i.e. using ->)
with the new value expression represented by term,
followed by the C code represented by body.
- A term
(let
((var (struct-<tag>-write-<member>-element term1 term2 var)))
body),
when <tag> is a defstruct name,
<member> is the name of
one of the members of that defstruct,
<member> has an integer array type in the defstruct,
var is assignable,
var has the C structure type represented by <tag>
or the pointer type to that C structure type,
term1 is a pure expression term for fn
returning a C integer type,
term2 is a pure expression term for fn
returning the C type corresponding to <type>,
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents a C assignment to
an element of a member of the structure represented by var
by value (i.e. using .) if var has structure type
or by pointer (i.e. using -> if var has pointer type,
using term1 as the index
with the new value expression represented by term2,
followed by the C code represented by body.
- A term (let ((var term)) body),
when var is assignable,
var is among vars
if it is a formal parameter of fn
that has pointer or array type if fn is non-recursive,
term is a statement term for fn with loop flag nil
returning void and affecting var
that is either a call of a target function
that precedes fn in the list of targets (t1 ... tp)
whose body term returns void and affects var
or an if whose test is an expression term returning boolean
(not a test (mbt ...)), and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents the C code represented by term,
which may modify the variable represented by var,
followed by the C code represented by body.
- A term (mv-let (var var1 ... varn) (declarn term) body),
when the symbol name of var is a portable ASCII C identifier, the symbol name of var is distinct from
the symbol names of all the other ACL2 variables in scope,
each vari is assignable,
each vari is among vars
if it is a formal parameter of fn
that has pointer or array type if fn is non-recursive,
term is an expression term for fn
returning a non-void non-pointer non-array C type
and affecting the variables (var1 ... varn), and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents, as indicated by the wrapper declarn,
a declaration of a C local variable represented by var,
initialized with the C expression represented by term,
followed by the C code represented by body.
Note that declarn stands for
declar1, declar2, etc.,
based on whether n is 1, 2, etc.;
this n is the number of
side-effected variables var1, ..., varn,
which is one less than the variables bound by the mv-let.
The C type of the variable is determined from the initializer.
- A term (mv-let (var var1 ... varn) (assignn term) body),
when var is assignable,
each vari is assignable,
each vari is among vars
if it is a formal parameter of fn
that has pointer or array type if fn is non-recursive,
term is an expression term for fn
returning the same non-void non-pointer non-array C type
as the C type of var
and affecting the variables (var1 ... varn), and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents, as indicated by the wrapper assignn,
an assignment to
the C local variable or function parameter represented by var,
with the C expression represented by term as right-hand side,
followed by the C code represented by body.
Note that assignn stands for
assign1, assign2, etc.,
based on whether n is 1, 2, etc.;
this n is the number of
side-effected variables var1, ..., varn,
which is one less than the variables bound by the mv-let.
- A term (mv-let (var1 ... varn) term body),
when n > 1,
each vari is assignable,
each vari is among vars if it is a formal parameter of fn
that has pointer or array type if fn is non-recursive,
term is a statement term for fn with loop flag nil
returning void and affecting (var1 ... varn)
that is either a call of a target function
that precedes fn in the list of targets (t1 ... tp)
whose body term returns void and affects (var1 ... varn)
or an if whose test is an expression term returning boolean
(not a test (mbt ...), and
body is a statement term for fn with loop flag L
returning T and affecting vars.
This represents the C code represented by term,
which may modify the variables represented by var1, ..., varn,
followed by the C code represented by body.
- A call of a recursive target function fn0
that precedes fn in the list of targets (t1 ... tp),
on variables identical to its formal parameters,
when the C types of the variables are
the same as the C types of the formal parameters of fn0,
L is nil,
T is void,
vars is not nil,
and the body of fn0 is
a loop term for fn0 affecting vars.
This represents the C while statement
represented by the body of fn0, as explained below.
- A call of a non-recursive target function fn0
that precedes fn in the list of targets (t1 ... tp),
on pure expression terms for fn returning non-void C types,
when the C types of the terms are
the same as the C types of the formal parameters,
each term of pointer or array type is a variable
identical to the corresponding formal parameter of fn0,
L is nil,
T is void,
and the body of fn0 is
a statement term for fn0
returning void and affecting vars.
This represents an expression statement
whose expression is a call of the C function corresponding to fn0.
A loop term for fn
affecting vars,
where fn is a target function
and vars is a non-empty list of distinct symbols,
is inductively defined as one of the following:
- A call of if on
(i) a test of the form (mbt ...),
(ii) a `then' branch that is
a loop term for fn affecting vars, and
(iii) an `else' branch that may be any ACL2 term.
This represents the same C code represented by the `then' branch.
Both the test and the `else' branch are ignored,
because ATC generates C code under guard assumptions.
- A call of if on
(i) a test that is an expression term for fn returning boolean,
(ii) a `then' branch that is
a statement term for fn with loop flag t
returning void and affecting vars, and
(iii) an `else' branch that is
either the variable var when vars is the singleton (var),
or the term (mv var1 ... varn)
when vars is the list (var1 ... varn) with n > 1.
This represents the C while statement
whose controlling expression is represented by the test
and whose body is represented by the `then' branch;
the `else' branch represents no actual C code,
because it just serves to complete the if.
An expression term for fn
returning T and
affecting vars,
where fn is a target function,
T is a non-void C type,
and vars is a list of distinct symbols,
is inductively defined as one of the following:
- A pure expression term for fn returning T,
when vars is nil.
- A call of a non-recursive target function fn0
that precedes fn in the list of targets (t1 ... tp),
on pure expression terms for fn returning C types,
when the types of the terms are equal to
the C types of the formal parameters of fn0,
each term of pointer or array type is a variable
identical to the corresponding formal parameter of fn0,
and the body of fn0 is
a statement term for fn0
returning T and affecting vars.
This represents a call of the corresponding C function.
A pure expression term for fn returning T,
where fn is a target function and
T is a non-void C type,
is inductively defined as one of the following:
- A formal parameter of fn,
when T is the type of the formal parameter.
This represents
either the corresponding C formal parameter (as an expression),
or the corresponding C external object (as an expression),
as explained earlier.
- A variable introduced by let with declar
(as described above),
when T is the type of the variable.
This represents the corresponding C local variable
(as an expression).
- A call of a function <type>-<base>-const on a quoted integer,
when <type> is among
- sint
- uint
- slong
- ulong
- sllong
- ullong
<base> is amongT is the C type corresponding to <type>
and the quoted integer is non-negative and in the range of T.
This represents a C integer constant
of the C type indicated by the name of the function,
expressed in decimal, octal, or hexadecimal base. - A call of a function <op>-<type> on
a pure expression term for fn returning U,
when <op> is among<type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
T is the C type corresponding to
the return type of <op>-<type>
and U is the C type corresponding to <type>.
This represents the C operator indicated by the name of the function
applied to a value of the type indicated by the name of the function.
The guard verification requirement ensures that
the operator yields a well-defined result.
These functions covers all the C unary operators
(using the nomenclature in [C]). - A call of a function <op>-<type1>-<type2>
on pure expression terms for fn returning U and V,
when <op> is among
- add
- sub
- mul
- div
- rem
- shl
- shr
- lt
- gt
- le
- ge
- eq
- ne
- bitand
- bitxor
- bitior
<type1> and <type2> are among- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
T is the C type corresponding to
the return type of <op>-<type1>-<type2>,
U is the C type corresponding to <type1>, and
V is the C type corresponding to <type2>.
This represents the C operator indicated by the name of the function
applied to values of the types indicated by the name of the function.
The guard verification requirement ensures that
the operator yields a well-defined result.
These functions covers all the C strict pure binary operators;
the non-strict operators && and ||,
and the non-pure operators =, +=, etc.,
are represented differently. - A call of a function <type1>-from-<type2>
on a pure expression term for fn returning U,
when <type1> and <type2> are among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
and also differ from each other,
T is the C type corresponding to <type1>
and U is the C type corresponding to <type2>.
This represents
a cast to the type indicated by the first part of the function name.
The guard verification requirement ensures that
the conversion yields a well-defined result.
Even though conversions
happen automatically in certain circumstances in C,
these functions always represent explicit casts;
implict conversions are represented implicitly,
e.g. via the function for a unary operator that promotes the operand. - A call of <type>-read
on a pure expression term for fn returning U,
when <type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
T is the C type corresponding to <type>,
and U is the pointer type to T.
This represents the application of the indirection operator *
to the expression represented by the argument of <type>-read. - A call of <type>-array-read
on pure expression terms for fn returning U and V,
when <type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
T is the C type correponding to <type>,
U is the array type of element type T, and
V is a C integer type.
This represents an array subscripting expression.
The guard verification requirement ensures that
the array access is well-defined. - A call of struct-<tag>-read-<member>
on a pure expression term for fn returning U
when <tag> is a defstruct name,
<member> is the name of
one of the members of that defstruct,
<member> has an integer type in the defstruct,
T is the C integer type of <member>, and
U is the C structure type represented by <tag>
or the pointer type to that C structure type.
This represents an access to a structure member,
by value if U is the C structure type
(i.e. using .)
or by pointer if U is the pointer type to the C structure type
(i.e. using ->).
- A call of struct-<tag>-read-<member>-element
on pure expression terms for fn returning U and V
when <tag> is a defstruct name,
<member> is the name of
one of the members of that defstruct,
<member> has an integer array type in the defstruct,
T is the C element type of the array type of <member>,
U is a C integer type, and
V is the C structure type represented by <tag>
or the pointer type to that C structure type.
This represents an access to an element of a structure member,
by value if V is the C structure type
(i.e. using .)
or by pointer if V is the pointer type to the C structure type
(i.e. using ->).
- A call of sint-from-boolean on
an expression term for fn returning boolean
that is a call of and or or
(not a call of any boolean-from-<type>),
when T is int.
This converts an expression term returning boolean
to a pure expression term returning int.
The restriction of the argument term of sint-from-boolean
to be a call of and and or is so that
such a call always represents the C int 0 or 1,
which is what sint-from-boolean returns;
the conversion from boolean to int is only in ACL2,
not present in the C code, which just has the represented expression.
- A call of condexpr on
a call of if on
(i) a test that is an expression term for fn returning boolean and
(ii) branches that are
pure expression terms for fn returning T,
where T is not
(signed or unsigned) char or short.
This represents a C ?: conditional expression
whose test expression is represented by the test term
and whose branch expressions are represented by the branch terms.
The restriction that T is the same for both branches
and that it is not (signed or unsigned) char or short
is so that the C ?: operator
does not perform conversions on the branches,
which would have to be represented explicitly
in the ACL2 code that represents the C code.
An expression term for fn returning boolean,
where fn is a target function,
is inductively defined as one of the following:
- A call of a function boolean-from-<type>
on a pure expression term for fn returning U,
when <type> is among
- schar
- uchar
- sshort
- ushort
- sint
- uint
- slong
- ulong
- sllong
- ullong
U is the C type corresponding to <type>.
This converts a pure expression term returning a C type
to an expression term returning boolean. - A call of one of the following macros
on expression terms for fn returning booleans:This represents the corresponding C logical operator
conjunction && or disjunction ||);
conjunctions and disjunctions are represented non-strictly.
The C type of a variable var is defined as follows:
- If var is a formal parameter
that does not represent an access to an external object,
the C type is the one derived from the guard as explained earlier.
- If var is a formal parameter
that represents an access to an external object,
the C type is derived from
the definition of the recognizer object-<name>-p
generated by defobject for that object.
This may be an integer type or an integer array type;
see defobject.
- If var is not a formal parameter,
it must be introduced by
a let with declar
or an mv-let with declarn,
and its C type is the one of the term argument of
declar or declarn.
The C type of a variable is never void.
A variable var bound in
a let or mv-let in a target function fn
is assignable when it is in scope,
i.e. it is identical to a function parameter
or to a variable bound in an enclosing let or mv-let,
and additionally any of the conditions given below holds.
The conditions make reference to the C scopes
represented by the ACL2 terms that
the let or mv-let is part of:
there is a C scope for the whole file,
then each function body is a nested C scope,
and then each if branch whose test is
an expression term returning a boolean
(i.e. whose test is not mbt
is a further nested C scope.
The conditions are the following:
- The function parameter or outer variable
is in the same C scope where var occurs.
- The variable var is among vars,
i.e. it is being affected.
- No variables are being affected, i.e. vars is nil.
Any of these conditions ensures that, in the ACL2 code,
the old value of the variable cannot be accessed after the new binding:
(i) if the first condition holds,
the new binding hides the old variable;
(ii) if the second condition holds,
it means that the outer binding will receive the value
at the end of the changes to the variables; and
(iii) if the third condition holds,
there is no subsequent code because there is no change to the variables.
These justify destructively updating the variable in the C code.
Statement terms represent C statements,
while expression terms represent C expressions.
The expression terms returning booleans return ACL2 boolean values,
while the statement terms,
including expression terms returning C values,
return ACL2 values that represent C values:
the distinction between boolean terms and other kinds of terms
stems from the need to represent C's non-strictness in ACL2:
C's non-strict constructs are
if statements,
?: expressions,
&& expressions, and
|| expressions;
ACL2's only non-strict construct is if
(which the macros and and or expand to).
Pure expression terms returning C values
represent C expressions without side effects;
C function calls may be side-effect-free,
but in general we do not consider them pure,
so they are represented by expression terms returning C values
that are not pure expression terms returning C values.
Expression terms returning booleans are always pure;
so they do not need the explicit designation `pure'
because they are the only expression terms returning booleans
handled by ATC.
Recursive ACL2 functions represent C loops,
where those recursive functions are called.
The loop flag L serves to ensure that
the body of a loop function ends with a recursive call
on all paths (i.e. at all the leaves of its if structure,
and that recursive calls of the loop function occur nowhere else.
The body of the C function represented by each non-recursive fn
is the compound statement consisting of
the block items (i.e. statements and declarations)
represented by the ACL2 function's body
(which is a statement term).
The restriction that each function fn may only call
a function that precedes it in the list of targets (t1 ... tp),
means that no (direct or indirect) recursion is allowed in the C code
and the target functions must be specified
in a topological order of their call graph.
The guard verification requirement ensures that
the constraints about C types described above holds,
for code reachable under guards.
Code unreachable under guards is rare but possible.
In order to generate C code that is always statically well-formed,
ATC independently checks the constraints about C types.
Translated Terms
The description of the representation of C code in ACL2 above
talks about untranslated terms, but ATC operates on translated terms, since it looks at unnormalized bodies of ACL2 functions.
This section describes how
the untranslated terms mentioned above
appear as translated terms:
these are the patterns that ATC looks for.
An untranslated term (and a b) is translated to
(if a b 'nil)
An untranslated term (or a b) it translated to
(if a a b)
An untranslated term (mbt x) is translated to
(return-last 'acl2::mbe1-raw 't x)
Since mbt$ expands to mbt, the former can be used too.
An untranslated term (mv var1 ... varn) is translated to
(cons var1 (cons ... (cons varn ' nil)...))
An untranslated term (let ((var (declar term))) body)
is translated to
((lambda (var) body) (declar term))
An untranslated term (let ((var (assign term))) body)
is translated to
((lambda (var) body) (assign term))
An untranslated term (let ((var term)) body)
is translated to
((lambda (var) body) term)
An untranslated term
(mv-let (var var1 ... varn) (declarn term) body)
is translated to
((lambda (mv)
((lambda (var var1 ... varn) body)
(mv-nth '0 mv)
(mv-nth '1 mv)
...
(mv-nth 'n mv)))
((lambda (mv)
((lambda (*0 *1 *2 ... *n)
(cons (declar *0) (cons *1 ... (cons *n 'nil))))
(mv-nth '0 mv)
(mv-nth '1 mv)
...
(mv-nth 'n mv)))
term))
An untranslated term
(mv-let (var var1 ... varn) (assignn term) body)
is translated to
((lambda (mv)
((lambda (var var1 ... varn) body)
(mv-nth '0 mv)
(mv-nth '1 mv)
...
(mv-nth 'n mv)))
((lambda (mv)
((lambda (*0 *1 *2 ... *n)
(cons (assign *0) (cons *1 ... (cons *n 'nil))))
(mv-nth '0 mv)
(mv-nth '1 mv)
...
(mv-nth 'n mv)))
term))
An untranslated term
(mv-let (var1 var2 ... varn) term body)
is translated to
((lambda (mv)
((lambda (var1 ... varn) body)
(mv-nth '0 mv)
(mv-nth '1 mv)
...
(mv-nth 'n-1 mv)))
term)
Since ATC operates on translated terms,
there is no direct restriction
on the untranslated bodies of the functions,
which in particular may use any macro or any named constant,
so long as their translated form
satisfies all the requirements stated in this ATC documentation;
the restrictions on the translated bodies thus impose
indirect restrictions on the untranslated bodies.
Note also that ATC treats, for instance,
a translated term of the form (if a b 'nil)
as if it were the translation of (and a b)
in an untranslated function body,
even though that untranslated function body
may include (if a b 'nil) directly,
or some other macro that expands to that:
this does not cause any problem,
because if two untranslated terms become the same translated term,
then they are equivalent for ATC's purposes.
Applicability Conditions
In addition to the requirements on the inputs stated above,
the following applicability conditions must be proved.
The proofs are attempted when ATC is called.
No hints can be supplied to ATC for these applicability conditions;
(possibly local) lemmas must be provided before calling ATC
that suffice for these applicability conditions
to be proved without hints.
:natp-of-measure-of-fn
The measure of the recursive target function fn
must yield a natural number:
(natp <fn-measure>)
where <fn-measure> is the measure of fn.
There is one such applicability condition
for each recursive target function fn.
These applicability conditions do not need to be proved
if :proofs is nil.
Generated Events
Constant
ATC generates an event
(defconst *program* ...)
where ... is the abstract syntax tree of the generated C files,
which ATC also pretty-prints and writes
to the path(s) specified by
the :output-dir and :file-name inputs.
If the :proofs input is nil,
this constant is not generated.
Theorems
ATC generates an event
(defthm *program*-well-formed ...)
where ... is an assertion about *program* stating that
the generated (abstract syntax tree of the) files
is statically well-formed,
i.e. it compiles according to [C].
If the :proofs input is nil,
this theorem is not generated.
For each target function fn, ATC generates an event
(defthm *program*-fn-correct ...)
where ... is an assertion about fn and *program*
stating that,
under the guard of fn,
executing the C dynamic semantics on
the C function generated from fn
yields the same result as the function fn.
That is,
the C function is functionally equivalent to the ACL2 function.
If the ACL2 function takes arrays or pointers as inputs,
the generated correctness theorem includes hypotheses
saying that the pointed objects are all at different addresses.
The formal model of C that the proofs rely on
assumes that objects do not overlap.
Thus, the guarantees provided by the generated theorems about the C code
hold only if pointers to distinct, non-overlapping objects
are passed to the generated C functions.
If the :proofs input is nil,
this theorem is not generated.
Generated C Code
ATC generates a single source file,
optionally accompanied by a header,
as explained in Section `Representation of C Code in ACL2'.
The guard verification requirement stated earlier for each function
implies that the generated C operations
never exhibit undefined behavior,
provided that they are called with values
whose ACL2 counterparts satisfy the guard of the function.
Compiling and Running the C Code
The generated C code can be compiled and run as any other C code.
For instance, one can use gcc on macOS or Linux.
As mention in the description of the :header input above,
if a header is generated, external code should include the header,
and the generated source file should be compiled and linked
with the external code to obtain a full application.
To test the generated code,
one can write a separate source file with a main function,
and compile all files together.
By default, an executable a.out is generated
(if using gcc on macOS or Linux).
The community books directory kestrel/c/atc/tests
contains some examples of C code generation
and handwritten C code to test the generated code.
Redundancy
A call of ATC is redundant if and only if
it is identical to a previous successful call of ATC.
Subtopics
- Atc-implementation
- Implementation of atc.
- Atc-tutorial
- ATC tutorial.