Bitsets
Bitsets library: uses bitmasks to represent finite sets of (small)
natural numbers.
Introduction
In this library, sets of natural numbers are represented as natural numbers
by saying a is a member of the set X when (logbitp a X). For
instance, the set {1, 2, 4} would be represented as the number 22. In binary,
this number is 10110, and you can see that bits 1, 2, and 4 are each
"true".
This representation enjoys certain efficiencies. In particular, operations
like union, intersection, difference, and subset testing can be carried out in
a "word at a time" fashion with bit operations.
But bitsets are only appropriate for sets whose elements are reasonably
small numbers since the space needed to represent a bitset is determined by its
maximal element. If your sets contain extremely small numbers---less than
28-60, depending on the Lisp---then they can be represented as fixnums and the
bitset operations will be remarkably efficient.
Beyond this, bignums are necessary. Even though bignum operations generally
involve memory allocation and are much slower than fixnum operations, using
bignums to represent sets can still be very efficient. For instance, on 64-bit
CCL, bignums are represented as a header/data pair where the data is an array
of 32-bit words; operations like logand can smash together the two data
arrays a word at a time.
Let's define the density of a set as its cardinality divided by its
maximal element plus one (to account for zero-indexing). Under this
definition, the set {1, 2, 4} has a density of 60%, whereas {96, 97, 98} has a
density of 3%.
Without destructive updates, you probably can't do much better than bitsets
for dense sets. But bitsets are not good at representing sparse sets. For
instance, you would need 8 KB of memory just to represent the singleton set
{65536}.
The sparse bitsets library is an
alternative to bitsets that uses lists of (offset . data) pairs instead of
ordinary natural numbers as the set representation. This representation is
perhaps somewhat less efficient for dense sets, but is far more efficient for
sparse sets (e.g., {65536} can be represented with a single cons of fixnums
instead of an 8 KB bignum.) and still provides word-at-a-time operations in
many cases.
Bitset Operations
Convention:
- a, b, ... denote set elements
- X, Y, ... denote sets.
There is no such explicit bitsetp recognizer; instead we just use
natp. Similarly there is no separate bitset-fixing function because we
just use nfix.
See reasoning for some notes about how to prove theorems about bitset
functions.
Bitset Constructors
- (bitset-singleton a)
- Constructs the set {a}
- Execution: (ash 1 a)
- (bitset-insert a X)
- Constructs the set X U {a}
- Execution: (logior (ash 1 a) x)
- (bitset-list a b ...)
- Constructs the set {a, b, ...}
- Execution: repeated bitset-inserts
- (bitset-list* a b ... X)
- Constructs the set X U {a, b, ...}
- Execution: repeated bitset-inserts
- (bitset-delete a X)
- Constructs the set X - {a}
- Execution: (logandc1 (ash 1 a) x)
- (bitset-union X Y ...)
- Constructs the set X U Y U ...
- Execution: (logior x (logior y ...))
- (bitset-intersect X Y ...)
- Constructs the set X \intersect Y \intersect ...
- Execution: (logand x (logand y ...))
- (bitset-difference X Y)
- Constructs the set X - Y
- Execution: (logandc1 y x)
Inspecting Bitsets
- (bitset-memberp a X)
- Determine whether a is a member of the set X
- Execution: (logbitp a x)
- (bitset-intersectp X Y)
- Determines whether X and Y have any common members
- Execution: (logtest x y)
- (bitset-subsetp X Y)
- Determines whether X is a subset of Y
- Execution: (= 0 (logandc2 y x))
- (bitset-cardinality X)
- Determines the cardinality of X
- Execution: (logcount x)
Enumerating Bitset Members
- (bitset-members X)
- Constructs an ordinary ordered set with the elements of X.
- Expensive: must cons together all of the set elements.
- Important in reasoning about bitsets.
Subtopics
- Bitset-members
- (bitset-members x) converts a bitset into an ordinary, ordered
set.
- Bitset-insert
- (bitset-insert a x) constructs the set X U {a}.
- Bitset-delete
- (bitset-delete a x) constructs the set X - {a}.
- Bitset-intersectp
- (bitset-intersectp x y) efficiently determines if X and Y
have any common members.
- Bitset-intersect
- (bitset-intersect X Y ...) constructs the set X \intersect Y
\intersect ....
- Bitset-difference
- (bitset-difference x y) constructs the set X - Y.
- Bitset-subsetp
- (bitset-subsetp x y) efficiently determines if X is a subset
of Y.
- Bitset-union
- (bitset-union X Y ...) constructs the set X U Y U ....
- Bitset-memberp
- (bitset-memberp a x) tests whether a is a member of the set
X.
- Bitset-singleton
- (bitset-singleton a) constructs the singleton set {a}.
- Bitset-list
- Macro to construct a bitset from particular members.
- Bitset-cardinality
- (bitset-cardinality x) determines the cardinality of the set
X.
- Bitset-list*
- Macro to repeatedly insert into a bitset.
- Utilities
- Utility functions.