(bits-0-31 offset x acc) → *
This is about 2.8x faster than bits-between, according to the following loops (on fv-1):
(progn (gc$) ;; 25.759 seconds (time (loop for x fixnum from 1 to 50000000 do (bits-between 0 32 x))) ;; 8.935 seconds (gc$) (time (loop for x fixnum from 1 to 50000000 do (bits-0-31 0 x nil))))
The inner loop is unrolled 8 times. Unrolling 16 times was a slightly better, but the case explosion in the equivalence proof ended up causing ACL2 a lot of problems. Maybe a better rewriting strategy would solve this, but just opening the functions is fine for 8 unrolls and it still gives us most of the benefit.
Function:
(defun bits-0-31 (offset x acc) (declare (type unsigned-byte offset) (type (unsigned-byte 32) x)) "Partially unrolled version of @(see bits-between) that collects the bits from a 32-bit unsigned @('x') and adds @('offset') to each." (let ((__function__ 'bits-0-31)) (declare (ignorable __function__)) (b* (((when (eql x 0)) acc) (acc (bits-0-7 (+ offset 24) (the (unsigned-byte 32) (ash x -24)) acc)) (acc (bits-0-7 (+ offset 16) (the (unsigned-byte 32) (ash x -16)) acc)) (acc (bits-0-7 (+ offset 8) (the (unsigned-byte 32) (ash x -8)) acc))) (bits-0-7 offset x acc))))
Theorem:
(defthm bits-0-31-redef (implies (natp offset) (equal (bits-0-31 offset x acc) (append (add-to-each offset (bits-between 0 32 x)) acc))))