Atj-tutorial-tests
ATJ tutorial: Generation of Tests.
This tutorial page describes the :tests option of ATJ,
which generates additional Java code to run tests specified by the user.
We illustrate this option via an example,
but also provide more general explanations.
While this tutorial page describes and exemplifies test generation for the deep embedding approach, most notions apply to the shallow embedding as well.
Defining Some Tests
Consider again the factorial function fact example
in atj-tutorial-deep.
Introduce a named constant as follows:
(defconst *tests*
'(("Test0" (fact 0))
("Test1" (fact 1))
("Test2" (fact 2))
("Test3" (fact 3))
("Test10" (fact 10))
("Test20" (fact 20))
("Test50" (fact 50))
("Test77" (fact 77))
("Test100" (fact 100))))
The name of the constant does not have to be *tests*.
Any name will do.
The above is a list of tests,
each of which is a doublet that consists of
a name (a string) and a ground call of the fact function.
The names in the list must be all distinct,
and may be in any order:
their purpose is to describe the associated tests
in a human-readable and identifiable way.
Each ground call in the list
specifies to check whether
executing the ground call in Java
yields the same result as executing it in ACL2.
The arguments of the ground call must be constants:
they must be or translate to quoted constants;
they cannot be just any terms that happen to be constant
(this requirement might be relaxed in the future).
Note that each such test does not explicitly specify
the expected result of the ground call.
At run time,
the test will simply compare the ACL2 result with the Java result.
Thus, the user can quickly define many tests
without specifying, or even knowing, the expected results.
Currently the names of the tests must be non-empty strings
that only contain (uppercase and lowercase) letters and digits.
This makes it easy to turn these names into (parts of) method names,
as explained below.
Supplying the Tests to ATJ
The tests defined above can be supplied to ATJ as follows:
(java::atj fact :deep t :guards nil :tests *tests*)
ATJ's :tests input is evaluated:
in the example above, ATJ thus receives
the list of doublets that *tests* evaluates to.
In general, one can supply any term as the :tests input,
so long as its evaluation yields
a true list of doublets in the format explained above.
For example, the quoted list that defines *tests*
could be supplied directly as the :tests input.
As another example, one could define
several named constants like *tests* above,
say *tests1*, *tests2*, etc.,
and supply (append *tests1* *tests2* ...) as :tests.
However, supplying a single named constant
(which may well be defined as the append of other constants,
each for a different group of tests)
may be the clearest approach.
The ground call in a test must be that of a target function.
Recall that, as described in atj-tutorial-translated,
the target functions are the ones explicitly specified to ATJ
(just fact in the example above).
Currently ATJ does not support tests that involve ground calls of
non-target functions directly or indirectly called by the target functions
(such as zp in the fact example above):
the rationale is that the target functions are the top-level ones,
and thus the ones to be tested.
This restriction can be relaxed if it turns out to be a burden.
Generated Test Code
As conveyed by the message shown on the screen by ATJ,
three Java files are generated, in the current directory.
The first two files, Acl2Code.java and Acl2CodeEnvironment.java,
are the same as before.
The third file, Acl2CodeTests.java, is new.
Opening the new file reveals that it contains
a single Java public class called Acl2CodeTests.
The file imports all the (public) AIJ classes,
which are in the edu.kestrel.acl2.aij Java package,
and a few classes from the Java standard library.
The class has a main() method,
and can be therefore run as a Java application.
The main() method accepts
either no inputs or a single integer input,
whose meaning is explained later.
After validating the input(s)
and calling Acl2Code.initialize()
(see atj-tutorial-deep for details on the latter),
the main() method
runs all the tests specified in :tests,
one after the other.
The class has a private method test_<name>()
for each test with name <name>;
more details on these private methods are given later.
After running all the test methods,
the main() method prints a summary message
saying whether there were test failures or not.
Compiling and Running the Code
All the files generated by ATJ can be compiled via
javac -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar \
Acl2Code.java Acl2CodeEnvironment.java Acl2CodeTests.java
where [books]/... must be replaced with
a proper path to the AIJ jar file
(see the documentation of AIJ for instructions on how to obtain that jar file.
After compiling, the code can be run, without argument, via
java -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar:. \
Acl2CodeTests
where again [books]/... must be replaced with a proper path.
As conveyed by the messages printed on the screen,
the tests are run one after the other, and they all pass.
The final message saying that all tests passed
is more useful when there is a large number of tests,
sparing the user from having to visually double-check every line.
Now try running the same code with a positive integer argument:
java -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar:. \
Acl2CodeTests 10
In addition to the messages printed before,
now 10 running times are reported for each test,
along with a minimum, an average, and a maximum.
The numeric argument indicates the number of times to run each test.
These tests run very quickly and thus it is likely that
all the reported time be 0.000 or perhaps just 0.001.
Adding and running longer tests such as (fact 10000)
will show more interesting numbers.
A Closer Look at the Test Methods
All the private static methods in the test class Acl2CodeTests,
each of which corresponds to one of the user-supplied tests,
have a very similar structure.
Each test method takes as input a non-negative integer,
which is the positive integer passed to the main() method, if any,
or 0 if no argument is passed to the main() method.
The value 0 means that no execution times should be measured and reported.
A positive value means that
execution times should be measured and reported,
with the positive value specifying how many times the test must be run.
As seen in the example above (when the value was 10),
the execution time of every run is measured and printed,
and minimum, average, and maximum are calculated
over those execution times.
Whether the test passes or not (aside from execution time considerations)
is determined by each method as follows.
The method first builds
the values of the arguments of the call being tested,
and the value of the expected result of the call.
This expected result value is determined by ATJ
when it processes the test in the :tests input:
ATJ invokes the ACL2 evaluator to obtain each result value,
e.g. it invokes the ACL2 evaluator on the term (fact 10)
to obtain the value 3,628,800,
and generates Java code, in the test method,
that builds the Java representation of that value.
For each repetition of the test,
the method calls the Java code for the function being tested
on the arguments, and compares the result with the expected one.
Each test method measures the execution time of each repetition
by calling System.currentTimeMillis() just before and just after
the call of the Java code generated for the function,
and by subtracting the two values.
Note that the arguments are built once before the loop
and stored into local variables, which are accessed at each call.
Thus we measure the real time,
which may be affected by various kinds of ``noise'',
in particular any other running processes.
Repeating the tests a number of times (via the numeric argument),
and perhaps by attempting to run the tests
with as few other processes as possible,
should mitigate the noise.
Previous: Guards in the Deep Embedding Approach
Next: Shallow Embedding Approach