Major Section: PARALLELISM
Some function calls are on arguments whose evaluation time is long enough to warrant parallel evaluation, while others are not. A granularity form can be used to make appropriate restrictions on the use of parallelism.
For example, consider the Fibonacci function. Experiments have suggested that execution time can be reduced if whenever the argument is less than 27, a serial version of the Fibonacci function is called. One way to utilize this information is to write two definitions of the Fibonacci function, one serial and one parallel.
(defun fib (x) (declare (xargs :guard (natp x))) (cond ((or (zp x) (<= x 0)) 0) ((= x 1) 1) (t (binary-+ (fib (- x 1)) (fib (- x 2))))))We realize quickly that writing both of these function definitions is cumbersome and redundant. This problem can be avoided by using a(defun pfib (x) (declare (xargs :guard (natp x))) (cond ((or (zp x) (<= x 0)) 0) ((= x 1) 1) ((< x 27) (binary-+ (fib (- x 1)) (fib (- x 2)))) (t (pargs (binary-+ (pfib (- x 1)) (pfib (- x 2)))))))
granularity
declaration with a parallelism primitive. This form ensures
that a call is parallelized only if resources are available and the
granularity form evaluates to a non-nil
value at the time of the call.
Below is a definition of the Fibonacci function using a granularity form.
(defun pfib (x) (declare (xargs :guard (natp x))) (cond ((or (zp x) (<= x 0)) 0) ((= x 1) 1) (t (pargs (declare (granularity (>= x 27))) (binary-+ (pfib (- x 1)) (pfib (- x 2)))))))
A granularity form can reference an extra formal parameter that describes the
call depth of the function the user is parallelizing. Consider for example
the following parallel mergesort
function, based on Davis's Ordered Sets
library. It splits the data into symmetric chunks for computation, so we
increment the depth
argument during the recursive call on both the
car
and cdr
.
(include-book "finite-set-theory/osets/sets" :dir :system) (defun sets::pmergesort-exec (x depth) (declare (xargs :mode :program)) (cond ((endp x) nil) ((endp (cdr x)) (sets::insert (car x) nil)) (t (mv-let (part1 part2) (sets::split-list x nil nil) (pargs (declare (granularity (< depth 2))) (sets::union (sets::pmergesort-exec part1 (1+ depth)) (sets::pmergesort-exec part2 (1+ depth))))))))
A less intrusive method (i.e., not requiring an extra formal parameter like
the depth
argument just above), which however can be less efficient,
involves analyzing the data itself for structural properties. For example:
(defun some-depth-exceeds (x n) (declare (xargs :guard (natp n))) (if (atom x) nil (if (zp n) t (or (some-depth-exceeds (car x) (1- n)) (some-depth-exceeds (cdr x) (1- n))))))If you experiment with calls of(defun valid-tip (x) (declare (xargs :guard t)) (or (eq x 'A) (eq x 'T) (eq x 'C) (eq x 'G)))
(defun pvalid-tree (x) (declare (xargs :guard t)) (if (atom x) (valid-tip x) (pand (declare (granularity (some-depth-exceeds x 3))) (pvalid-tree (car x)) (pvalid-tree (cdr x)))))
pvalid-tree
, you are likely to find that
the ``speedup'' it provides over a corresponding serial version is, in fact,
a slowdown! The problem is likely that some-depth-exceeds
is an
expensive function to run repeatedly. Instead of the approach above, it is
often handy to add an extra formal parameter in order to allow for more
efficient granularity forms, as we have done above in the definition of
SETS::pmergesort-exec
.