cdr-
linked lists
In Lisp and Scheme, you don't typically string objects
together into a list by giving each one a "next" field that points
right to the next object. Instead, you create a list of pairs
whose car
fields hold the pointers to the objects, and whose
cdr
fields link the pairs together into a "spine."
There isn't really a special list
data type in Scheme. A list
is really just a sequence of pairs, ending with a null pointer. A
null pointer is a list, too--it's a sequence of zero pairs
ending in a null pointer. We sometimes talk about "the car of a list"
or "the cdr of a list," but what that really means is "the car
of the first pair in the list" and "the cdr of the first pair in
the list."
Suppose we have a variable foo
holding a pointer to a list
containing the integers 22
, 15
, and 6
.
Here's one way of drawing this situation.
+---------+ +---------+ +---------+ +---------+ | <PAIR> | | <PAIR> | | <PAIR> | foo | *----+--->+=========+ +-->+=========+ +-->+=========+ +---------+ | 22| / | 15| / | 6| +---------+ / +---------+ / +---------+ | *----+---+ | *----+-+ | * | +---------+ +---------+ +---------+
This shows something pretty close to the way things are likely to actually represented in memory. But there's usually a better way of drawing the list, which emphasizes the fact that number values are conceptually pointers to numbers, and which corresponds to the way we usually think about lists:
+---+ +---+---+ +---+---+ +---+---+ bar | *-+--->| * | *-+----->| * | *-+----->| * | * | +---+ +-+-+---+ +-+-+---+ +-+-+---+ | | | \|/ \|/ \|/ 22 15 6
I've left off the header fields of objects, which are not visible to a Scheme programmer.
I've also drawn pairs in a special way, with the car
and cdr
fields side-by-side. I've drawn the integers outside the pairs,
with pointers to them from the car
fields, because that's the
way things look at the language level.
This emphasizes the fact that lists are generally separate things from the items "in" the list.
A major advantage of this is that you don't have to modify an object to put it on a list--an object can easily be in many lists at once, because a list is really just a spine of pairs that holds pointers to the items in the list. This is much cleaner than the way people are typically taught to create simple lists in most beginning programming classes. (It's also very natural in a language where all values are pointers---of course lists of objects are really just lists of pointers to objects.)
For example, you can have two lists with the same elements, or some of the same elements, but perhaps in a different order.
+---+ +---+---+ +---+---+ +---+---+ bar | *-+--->| * | *-+----->| * | *-+----->| * | * | +---+ +-+-+---+ +-+-+---+ +-+-+---+ | | | \|/ \|/ \|/ 22 15 6 /|\ /|\ | | +---+ +-|-+---+ +-+-+---+ baz | *-+--->| * | *-+-------------------->| * | * | +---+ +---+---+ +---+---+
Here I've drawn two lists, bar
and baz
---that is, lists
that are the values of the variables bar
and baz
.
bar
holds the elements 22
, 15
, and 6
,
while baz
just holds the elements 22
and 6
.
Since these two lists are really just made up of pairs, and they're different pairs, we can modify one list without modifying the other, and without modifying the objects "in" the lists. For example, we can reverse the order of one of the lists without affecting the other.
(We also don't have to create a special kind of list node that has two next fields, so that something can be in two lists at a time. We can just have two separate lists of pairs, or three or four.)
Scheme has a standard way of writing a textual representation of
a list. Given the pictured situation, evaluating the expression
(display bar)
will print (22 15 6)
. Evaluating
the expression (display baz)
will print (22 6)
.
Notice that Scheme just writes out a pair of parentheses around
the items in the list--it doesn't represent the individual
pairs, but just their car
values.
Dynamic typing also helps make lists useful. A list of pairs can hold any type of object, or even a mixed bag of different types of objects. So, for example, a pair list can be a list of integers, a list of lists, a list of text characters, or a list of any of the kinds of objects we haven't gotten to yet. It can also be a mixed list of integers, other lists, and whatnot. A few list routines can therefore be useful in a variety of situations--a single list search routine can search any kind of list for a particular target object, for example.
This picture shows two variable bindings, for the variables bar
and foo
. bar
's binding holds a list (10 15 6)
,
while foo
's holds a list (22 15 6)
. We say that these
lists share structure, i.e., part of one list is also part of
the other.
+-------- + +---------+ | <PAIR> | bar | *----+--->+=========+ +---------+ | 10| +---------+ | *----+-+ +---------+ \ \ +---------+ \ +---------+ +---------+ +---------+ | <PAIR> | \ | <PAIR> | | <PAIR> | foo | *----+--->+=========+ +-->+=========+ +-->+=========+ +---------+ | 22| / | 15| / | 6| +---------+ / +---------+ / +---------+ | *----+---+ | *----+-+ | * | +---------+ +---------+ +---------+
This picture may correspond well to how things are represented in memory, but it's a little confusing.
The more common way of drawing this data structure is
+---+ +---+---+ bar | *-+--->| * | *-+------+ +---+ +-+-+---+ | | | \|/ | 10 | \|/ +---+ +---+---+ +---+---+ +---+---+ foo | *-+--->| * | *-+----->| * | *-+----->| * | * | +---+ +-+-+---+ +-+-+---+ +-+-+---+ | | | \|/ \|/ \|/ 22 15 6
Again, this emphasizes the idea that everything's a pointer--conceptually, the pairs hold pointers to the integers.
Generally, pairs are drawn as pairs of little boxes, and they're typically
drawn with the boxes side by side--that's just handy because pairs are
often used for linear lists, which you want to display horizontally--it's
easy to draw the spine of the list horizontally if the cdr
field
(used as a "next" pointer) is on the right. (Of course, when there's
shared structure, as in the above picture, you can't draw all cdrs going
directly to the right.)
We leave off the headers because they're a low-level detail anyway, because they're a hidden implementation detail that may vary from system to system, and because Scheme programmers immediately recognize this kind of two-box drawing of a pair.
In the above picture, we can talk about "the car of foo
", which
really means the value in the car
field of the pair pointed
at by the value stored in (the binding of) foo
. It's (a pointer
to) 10
. We would often call this "the car of the list foo
."
Notice that the cdr of foo
is also a list, and it's the
same list as the cdr of bar
---the cdrs of the first pairs in
each list point to the same list.
We can say that the cdr of foo and the cdr of bar "are eq?
,"
because the expression (eq? (cdr foo) (cdr bar))
returns
true. That is, (car foo)
and (cdr foo)
return
(pointers to) exactly the same object.