Search-engine friendly clone of the ACL2 documentation.

Atj-tutorial

Atj-tutorial-tests

ATJ tutorial: Generation of Tests.

This tutorial page describes the :tests option of ATJ, which generates additional Java code to run tests specified by the user. We illustrate this option via an example, but also provide more general explanations. While this tutorial page describes and exemplifies test generation for the deep embedding approach, most notions apply to the shallow embedding as well.

Defining Some Tests

Consider again the factorial function fact example in atj-tutorial-deep. Introduce a named constant as follows:

(defconst *tests*
  '(("Test0" (fact 0))
    ("Test1" (fact 1))
    ("Test2" (fact 2))
    ("Test3" (fact 3))
    ("Test10" (fact 10))
    ("Test20" (fact 20))
    ("Test50" (fact 50))
    ("Test77" (fact 77))
    ("Test100" (fact 100))))

The name of the constant does not have to be *tests*. Any name will do.

The above is a list of tests, each of which is a doublet that consists of a name (a string) and a ground call of the fact function. The names in the list must be all distinct, and may be in any order: their purpose is to describe the associated tests in a human-readable and identifiable way. Each ground call in the list specifies to check whether executing the ground call in Java yields the same result as executing it in ACL2. The arguments of the ground call must be constants: they must be or translate to quoted constants; they cannot be just any terms that happen to be constant (this requirement might be relaxed in the future).

Note that each such test does not explicitly specify the expected result of the ground call. At run time, the test will simply compare the ACL2 result with the Java result. Thus, the user can quickly define many tests without specifying, or even knowing, the expected results.

Currently the names of the tests must be non-empty strings that only contain (uppercase and lowercase) letters and digits. This makes it easy to turn these names into (parts of) method names, as explained below.

Supplying the Tests to ATJ

The tests defined above can be supplied to ATJ as follows:

(java::atj fact :deep t :guards nil :tests *tests*)

ATJ's :tests input is evaluated: in the example above, ATJ thus receives the list of doublets that *tests* evaluates to. In general, one can supply any term as the :tests input, so long as its evaluation yields a true list of doublets in the format explained above. For example, the quoted list that defines *tests* could be supplied directly as the :tests input. As another example, one could define several named constants like *tests* above, say *tests1*, *tests2*, etc., and supply (append *tests1* *tests2* ...) as :tests. However, supplying a single named constant (which may well be defined as the append of other constants, each for a different group of tests) may be the clearest approach.

The ground call in a test must be that of a target function. Recall that, as described in atj-tutorial-translated, the target functions are the ones explicitly specified to ATJ (just fact in the example above). Currently ATJ does not support tests that involve ground calls of non-target functions directly or indirectly called by the target functions (such as zp in the fact example above): the rationale is that the target functions are the top-level ones, and thus the ones to be tested. This restriction can be relaxed if it turns out to be a burden.

Generated Test Code

As conveyed by the message shown on the screen by ATJ, three Java files are generated, in the current directory. The first two files, Acl2Code.java and Acl2CodeEnvironment.java, are the same as before. The third file, Acl2CodeTests.java, is new. Opening the new file reveals that it contains a single Java public class called Acl2CodeTests. The file imports all the (public) AIJ classes, which are in the edu.kestrel.acl2.aij Java package, and a few classes from the Java standard library.

The class has a main() method, and can be therefore run as a Java application. The main() method accepts either no inputs or a single integer input, whose meaning is explained later. After validating the input(s) and calling Acl2Code.initialize() (see atj-tutorial-deep for details on the latter), the main() method runs all the tests specified in :tests, one after the other. The class has a private method test_<name>() for each test with name <name>; more details on these private methods are given later. After running all the test methods, the main() method prints a summary message saying whether there were test failures or not.

Compiling and Running the Code

All the files generated by ATJ can be compiled via

javac -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar \
      Acl2Code.java Acl2CodeEnvironment.java Acl2CodeTests.java

where [books]/... must be replaced with a proper path to the AIJ jar file (see the documentation of AIJ for instructions on how to obtain that jar file.

After compiling, the code can be run, without argument, via

java -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar:. \
     Acl2CodeTests

where again [books]/... must be replaced with a proper path. As conveyed by the messages printed on the screen, the tests are run one after the other, and they all pass. The final message saying that all tests passed is more useful when there is a large number of tests, sparing the user from having to visually double-check every line.

Now try running the same code with a positive integer argument:

java -cp [books]/kestrel/java/aij/java/out/artifacts/AIJ_jar/AIJ.jar:. \
     Acl2CodeTests 10

In addition to the messages printed before, now 10 running times are reported for each test, along with a minimum, an average, and a maximum. The numeric argument indicates the number of times to run each test. These tests run very quickly and thus it is likely that all the reported time be 0.000 or perhaps just 0.001. Adding and running longer tests such as (fact 10000) will show more interesting numbers.

A Closer Look at the Test Methods

All the private static methods in the test class Acl2CodeTests, each of which corresponds to one of the user-supplied tests, have a very similar structure.

Each test method takes as input a non-negative integer, which is the positive integer passed to the main() method, if any, or 0 if no argument is passed to the main() method. The value 0 means that no execution times should be measured and reported. A positive value means that execution times should be measured and reported, with the positive value specifying how many times the test must be run. As seen in the example above (when the value was 10), the execution time of every run is measured and printed, and minimum, average, and maximum are calculated over those execution times.

Whether the test passes or not (aside from execution time considerations) is determined by each method as follows. The method first builds the values of the arguments of the call being tested, and the value of the expected result of the call. This expected result value is determined by ATJ when it processes the test in the :tests input: ATJ invokes the ACL2 evaluator to obtain each result value, e.g. it invokes the ACL2 evaluator on the term (fact 10) to obtain the value 3,628,800, and generates Java code, in the test method, that builds the Java representation of that value. For each repetition of the test, the method calls the Java code for the function being tested on the arguments, and compares the result with the expected one.

Each test method measures the execution time of each repetition by calling System.currentTimeMillis() just before and just after the call of the Java code generated for the function, and by subtracting the two values. Note that the arguments are built once before the loop and stored into local variables, which are accessed at each call. Thus we measure the real time, which may be affected by various kinds of ``noise'', in particular any other running processes. Repeating the tests a number of times (via the numeric argument), and perhaps by attempting to run the tests with as few other processes as possible, should mitigate the noise.

Previous: Guards in the Deep Embedding Approach

Next: Shallow Embedding Approach