Atc-tutorial-int-representation
ATC tutorial: ACL2 representation of the C int type and operations.
As stated in atc-tutorial-approach,
ATC recognizes, and translates to C,
a subset of ACL2 that represents C code quite directly.
In other words, there are representations of C constructs in ACL2,
which the ATC user must know so that they can invoke ATC
on ACL2 code of the appropriate form.
This tutorial page describes how the C int types and operations
are represented in ACL2;
examples of their use, and of C code generated from them,
are given in other pages.
C int Type and Operations
According to [C:6.2.5/5] and [C:5.2.4.2.1/1],
the ``plain'' int type consists of
signed integers in a range from -32767 or less to +32767 or more
(both numbers are inclusive).
The exact range depends on the C implementation, as detailed below.
The (C, not ACL2) representation of int values in memory,
which may be visible to the C code via access as unsigned char[]
(as allowed by [C]),
consists of a sign bit, some value bits, and optionally some padding bits
[C:6.2.6.2/2].
The signed representation may be
two's complement, ones' complement, or sign and magnitude
[C:6.2.6.2/2].
All these choices are implementation-dependent,
and determine the range of int values,
subject to the minimum range requirement stated above.
Two's complement representations without padding bits seem the most common,
along with 8-bit bytes
(the exact number of bits in a byte is also implementation-dependent
[C:5.2.4.2.1/1] [C:6.2.6.1/3]).
Under these assumptions, int values must consist of at least 16 bits,
resulting in at least the range from -32768 to +32767.
[C:6.2.6.1/4] requires int to take a whole number of bytes,
and thus the possible bit sizes are 16, 24, 32, 40, 48, etc.
[C:6.2.5/5] states that the size is
the natural one suggested by the architecture of the execution environment.
For modern Macs and PCs, experiments suggest this to be 32 bits
(the experiment consists in printing sizeof(int) in a C program),
even though one might expect it to be 64 bits instead,
given that these are 64-bit machines with 64-bit operating systems.
(However, the C long type appears to be 64 bits on these platforms.)
C provides a variety of int operations,
i.e. operations on int values.
These operations also apply to other arithmetic types,
but in this tutorial page we focus on the int type.
C provides the following unary and binary int operations [C:6.5]:
- + (unary) — no value change, but mirrors unary -
- - (unary) — arithmetic negation
- ~ (unary) — bitwise complement
- ! (unary) — logical negation/complement
- + (binary) — addition
- - (binary) — subtraction
- * (binary) — multiplication
- / (binary) — division
- % (binary) — remainder
- << (binary) — left shift
- >> (binary) — right shift
- < (binary) — less-than
- > (binary) — greater-than
- <= (binary) — less-than-or-equal-to
- >= (binary) — greater-than-or-equal-to
- == (binary) — equality
- != (binary) — non-equality
- & (binary) — bitwise conjunction
- ^ (binary) — bitwise exclusive disjunction
- | (binary) — bitwise inclusive disjunction
- && (binary) — logical (short-circuit) conjunction
- || (binary) — logical (short-circuit) disjunction
- = (binary) — simple assignment
- += (binary) — compound assignment with +
- -= (binary) — compound assignment with -
- *= (binary) — compound assignment with *
- /= (binary) — compound assignment with /
- %= (binary) — compound assignment with %
- <<= (binary) — compound assignment with <<
- >>= (binary) — compound assignment with >>
- &= (binary) — compound assignment with &
- ^= (binary) — compound assignment with ^
- |= (binary) — compound assignment with |
These not only take, but also return, int values.
This uniformity is also due to the fact that C represents booleans
as the int values 0 (for false) and 1 (for true),
and thus the relational and equality operations,
as well as the logical conjunction and disjunction operations,
all return int results [C:6.5.13] [C:6.5.14].
Some of the above operations yield well-defined results,
specified by [C], only under certain conditions.
For instance, the addition operation + on int operands
is well-defined only if the exact result is representable as an int
[C:6.5/5].
An implementation may actually add definedness to these operations,
by relying on the (well-defined) behavior of the underlying hardware,
e.g. by keeping the low bits of the exact result that fit int
(which is the same result prescribed by the Java language specification).
The && and || operations
are non-strict at the expression level:
their second operand expression is not executed
if the result of the first operand expression
suffices to determine the final result
(0 for conjunction, 1 for disjunction).
The simple and compound assignment operations have side effects.
All the other operations are pure, i.e. without side effect.
ACL2 Representation
The ACL2 representation of the C int type and operations
is in the community books kestrel/c/atc/integers.lisp
and kestrel/c/atc/integer-operations.lisp.
These are automatically included when ATC is included,
but one may want to include those file as part of an APT derivation
that refines some specification to the ACL2 subset handled by ATC
(see atc-tutorial-approach),
and thus before including ATC itself,
which is only needed to generate code at the end of the derivation.
In line with typical C implementations on Macs and PCs mentioned above,
our ACL2 representation of int values
consists of signed two's complement 32-bit integers.
More precisely, we provide a fixtype sint (for signed int),
with associated predicate sintp and fixer sint-fix.
This fixtype wraps
ACL2 integers in the range from -2^{31} to 2^{31}-1.
The wrapping provides more abstraction:
(the ACL2 representation of) C int values must be manipulated
only via the provided ACL2 functions (see below)
in the ACL2 code that ATC translates to C.
We plan to generalize our ACL2 representation of C int values
to allow different sizes than 4 (8-bit) bytes.
This will be achieved via a static parameterization,
i.e. via (something like) a constrained nullary function
that specifies the size of int.
We may also further generalize the representation,
again via a static parameterization,
to cover more of the options allowed by [C].
We also provide ACL2 functions
corresponding to some of the operations listed above:
These are all the strict pure operations;
the non-strict or non-pure operations are excluded,
because they are represented differently in ACL2,
as described elsewhere in this tutorial.
These ACL2 functions take sint values as inputs,
as required by their guards.
They also return sint values as outputs,
as proved by their return type theorems.
Some of these functions have additional guard conditions
that capture the conditions under which
the result is well-defined according to the [C].
For instance, the guard of add-sint-sint includes the condition that
the exact integer result fits in the range of the ACL2 integers
that are wrapped to form sint values.
More precisely, these additional guard conditions
are captured by the following predicates,
whose association to the above functions should be obvious from the names:
The predicates for / and % include
the condition that the divisor is not 0.
Besides unary and binary int operations,
C includes int constants [C:6.4.4.1]
(more precisely, integer constants, some of which have type int),
which may be regarded as (a large number of nullary) int operations.
Our ACL2 representation in community book
kestrel/c/atc/integers.lisp provides functions
sint-dec-const,
sint-oct-const, and
sint-hex-const
whose calls on suitable ACL2 quoted integer constants
represent decimal, octal, and hexadecimal int constants.
The quoted integer constant arguments must be
natural numbers in the range of signed two's complement 32-bit integers:
this is enforced by the guard of the three aforementioned functions.
Note that C integer constants are always non-negative.
See the documentation of the fixtype and functions mentioned above
for more details.
In the future, we may generalize our representation of these operations
to allow for implementation-defined behaviors,
via suitable static parameterizations.
Previous: ACL2 Proofs Generated for the Generated C code
Next: ACL2 representation and generation of C int programs