Gather-benchmarks
A satlink plugin for saving problems and storing them in
satlink-benchmarks directories.
This Satlink plugin can be used to automatically collect up all
problems given to Satlink into satlink-benchmarks directories. These
files might be useful for benchmarking or testing SAT solvers.
To use the plugin, simply include the following book:
(include-book "centaur/satlink/benchmarks" :dir :system)
and invoke Satlink as normal. For instance, if you have a collection of
ACL2::books that use gl in Satlink mode to prove some
theorems, then adding the above include-book should allow you to gather up
all of the SAT problems that GL solves during the course of certifying your
books.
This plugin saves SAT problems into the ./satlink-benchmarks directory,
relative to wherever you are running the tests. That is, if you many different
directories with their own GL/Satlink-based proofs, you'll get multiple
satlink-benchmarks directories spread out throughout your file system.
The resulting files are in DIMACS format and have ugly names like:
- xzsVtfY_tkY72RMcmmAkOw.sat
- Y5fvIdHKoF_5ZPLjF75gCg.unsat
The extension says whether the problem was satisfiable or unsatisfiable.
The file names are generated using an md5sum of the contents of the file.
Although they are ugly, this scheme (practically speaking) ensures that:
- We don't gather multiple copies of identical problems. That is, if we
certify and then re-certify the same books multiple times, the problems that GL
gives to Satlink will be the same each time, so we'll overwrite the previous
benchmarks with copies of themselves.
- We don't have to coordinate between multiple machines that are solving
benchmarks in parallel and writing to the same NFS-mounted directory. That is,
we never have to ask a question like: what's the next free file name?
- We do gather any variants of the same proof over time. That is, as
the design evolves, or as the spec changes, or as GL/ESIM/etc. are improved, GL
may end up giving Satlink different problems when proving the "same" theorem.
Since the new problems will have new checksums, the old benchmarks won't be
lost.
This hook is largely implemented in Perl, so you can customize its behavior
by editing the books/centaur/satlink/add-benchmark.pl script.