EWD316: A Short Introduction to the Art of Programming
by
prof.dr.Edsger W.Dijkstra
August 1971
0. Contents
1. Preface
2. Some Fundamental Notions
3. Programming Languages and their Implementation
4. Variables and relations between their values
5. Programs corresponding to recurrence relations
6. A first example of step-wise program composition
7. The shortest spanning subtree of a graph
8. The towers of Hanoi
9. The problem of eight queens
10. A rearranging routine
10. A rearranging routine
The following example has been inspired by the work of C.A.R.Hoare (Algorithm 64, C.A.C.M.).
The original problem was to rearrange the values of the elements of a given array A[1:N] and a given value of f (1 ≤ f ≤ N) such that after the rearrangement
for 1 ≤ k < f A[k] ≤ A[f] | |
for f < k ≤ N A[k] ≥ A[f] . | (1) |
As a result of this rearrangement A[f] equals the f-th value in order of non-decreasing magnitude. We call the array rearranged satisfying (1) "split around f"; we call the final value of A[f] "the splitting value". When the array has been split it is divided into two halves, the one half the "left-hand" half, say containing the "small" values and the other half the "right-hand" half, say containing the large values, with the splitting value sandwiched in between. The overall function of the algorithm is to move small values to the left and large values to the right. The difficulty is that for given f the final value of A[f], i.e. our criterion "small/large", is unknown to start with.
Hoare's invention is the following. Select some rather arbitrary criterion "small/large"; by moving small elements to the left and large elements to the right, a split will be established somewhere, around some position. If s happens to turn out = f, the original problem has been solved. The kernel of Hoare's invention is the observation that in the other cases the original problem can be reduced to the same problem, but now applied to one of the halves, viz. to the left-hand half if f lies to the left of the split and to the right-hand half if f lies to the right of the split.
Note. An alternative approach would be to sort the array completely: after that A[f] will equal the f-th value in the order of non-decreasing magnitude. But this can be expected to be rather expensive, for then we have established relations (1) for all values of f. As a matter of fact we will arrive at a rearranging routine which itself can be used for complete sorting, on account of the fact that, when a split around s has been established, A[s] has the value it is going to have in the completely sorted array, and that because all elements to the left of it are ≤ A[s] and those to the right of it are &ge A[s] completely sorting it could now be performed by sorting thereafter the two parts independently.
We now focus our attention on the rearranging routine which is to cause a split in the array section
A[m] ... A[n] with 1 ≤ m ≤ n ≤ N
When we try to make such a routine we are immediately faced with the choice of our criterion "small/large". One of the ways is to select an arbitrary element from the section, to call all elements larger than it "large", all elements smaller than it "small" and all elements equal to it either "large" or "small", just what is most convenient (in the final arrangement they may occur everywhere, either at the split or at either of its two sides). Let us therefore postpone the choice in this discussion for a moment, as there is a chance that we can use our freedom to some advantage.
We are going to select one of the values in the array as the "splitting value"; having chosen this value, its final position, i.e. the position of the split, is unknown before the algorithm starts; it is defined when the algorithm has been executed, in other words it is determined by the evolution of the computation. This suggests that we build up the collection of the small values, starting at the left-hand end, and that of the large values at the right-hand end, and continue doing so until the two collections meet somewhere in the middle. To be more precise, we introduce two pointers, "i" and "j" say, whose initial values will be "m" and "n" respectively, and rearrange values in such a fashion that, when we call the splitting value V, we ensure that
A[k] ≤ V for m ≤ k < i and
A[k] ≥ V for j < k ≤ n .
Having chosen the splitting value, the algorithm will have the duty of building up the collections of small and large values respectively from the outside inwards. The algorithm can start scanning, at the left-hand end say, until a large element is encountered. If this occurs, this value has to be removed from the collection of small values, i.e. it has to be added to the collection of large elements. It is, as a matter of fact, the first element whose "largeness" the algorithm has established: as we have decided to build up the collections from the outside inwards, this large value has to be assigned to A[n] . As we would like this position in the array to be "free" i.e. available to receive this first large value the original value of A[n] can be taken out of the array at the beginning and can be chosen as the splitting value V.
That is, we initialize i = m and j = n and "take out" A[n] by assigning it to the variable V thereby initializing the situation where scanning starts at element A[i] , while "j" points to the "hole" just made. When the upward scan (under control of increasing "i") finds a large element, i.e. when for the first time A[i] > V, this value is placed in the hole, now leaving the hole in the place pointed to by "i". From then onwards a downward scan (under control of decreasing "j") can operate until a small element has been encountered which will be placed in the hole at position "i", leaving the hole in the place pointed to by "j". Such upward and downward scans have to succeed eachother alternately until i = j, i.e. until both point to the hole at the position around which the split has been effectuated, Finally the hole receives the value V which had been taken out at the beginning.
The above sketch gives an informal description of the essential features of the algorithm, it by no means settles the structure of the sequential program that will embody it.
I have tried a program in which the core consists of the program part for the upward scan followed by the program part for the downward scan. The first part consists of a loop with "i:= i + 1
" in the repeatable statement; the second part consists of a loop with "j:= j - 1
" in the repeatable statement. The two parts together then form the repeatable statement of an outer loop. This program became very ugly and messy, the reason being that termination may occur either because the upward scan or because the downward scan is on the verge of scanning the hole. The reasoning needed to establish that the program did terminate properly became tortuous.
On account of that experience I have tried the alternative approach, one loop in which a single execution of the repeatable statement decreases the difference "j - i" (i. e. the length of the unscanned array section) by 1 , by doing a step of the appropriate scan.
The decision to control the steps of both scans by the same repeatable statement calls for the introduction of another variable; as we have to distinguish between only two cases, a boolean variable suffices, "up" say, with the meaning:
up = true means: |
the algorithm is in the state of upward scan and j points to the hole |
up = false means: |
the algorithm is in the state of downward scan and i points to the hole. |
The initialization has to be extended with the assignment "up:= true
"; after the initialization the program continues with
"while i < j do perform the appropriate step"
In the course of the action "perform the appropriate step", the value of "up" has to change its value whenever the hole is filled and the scanning direction has to reverse. Without any further detours I arrived at the following procedure:
integer procedure split(real array a, integer value m, n);
begin integer i, j; real V; boolean up;
i:= m; j:= n; V:= a[j]; up:= true;
while i < j do
begin if up then
if a[i] > V do begin a[j]:= a[i]; up:= false end
else
if V > a[j] do begin a[i]:= a[j]; up:= true end;
if up then i:= i + 1 else j:= j - 1
end;
a[j]:= V; split:= j
end
In its applications we shall only call the procedure "split" with m < n; as it stands it also caters for the case m = n,
Exercise. Show that in a version of split
that only needs to cater for m < n, its internal repetition could have been controlled by a repeat until clause as well.
Note. At the price of a larger number of subscriptions to be performed, the text of the procedure can be shortened by not introducing the separate variable V, but by storing this value "in the hole", i.e.
V = if up then a[j] else a[i] .
As a result the splitting value zigzags to its final position. With the above convention the tests "a[i] > V
" and "V > a[j]
" both become "a[i] > a[j]
", the assignments "a[j]:= a[i]
" and "a[i]:= a[j]
" both become the interchange
W:= a[i]; a[i]:= a[j]; a[j]:= W
and the assignments "up:= false
" and "up:= true
" can both be represented by
up:= non up
The above considerations allow us to condense the procedure text into
integer procedure split(real array a, integer value m, n);
begin integer i, j; real W; boolean up;
i:= m; j:= n; up:= true;
while i < j do
begin if a[i] > a[j] do
begin W:= a[i]; a[i]:= a[j]; a[j]:= W; up:= non up end;
if up then i:= i + 1 else j:= j - 1
end;
split:= j
end. (End of Note.)
We now return to our original problem: given an array A[1:N] and a value f (1 ≤ f ≤ N), rearrange the elements in such a way that
for 1 ≤ i < f A[i] ≤A[f] and
for f < i ≤ N A[i] ≥A[f] ;
as a result A[f] will equal the f-th element in the order of non-decreasing magnitude.
The idea is to apply the operator "split" first to the original array from 1 through N. The operator establishes the split somewhere, position s say. If the position of the split coincides with f (f = s), we have reached our goal, otherwise the operator "split" is applied to one of the halves, viz. to the left-hand half when f < s and to the right-hand half when f > s etc.
For this purpose we introduce variables p and q, satisfying
1 ≤ p ≤ f ≤ q ≤ N
such that
A[p] ... A[q]
will be the section of the array to which the split will be applied, as this section is certain to contain the (future) value of A[f].
If the split is found to the right of f (i.e. f < s) the operator has to be applied to the left-hand half, i.e. q has to be reset to s - 1 and p can be left unchanged; in the case f > s, p has to be reset to s + 1 and q can be left unchanged. We thus arrive at the routine
integer p, q, s;
p:= 1; q:= N;
repeat s:= split(A, p, q);
if f < s do q:= s - 1;
if f > s do p:= s + 1
until f = s .
(Note. This program can call the routine "split" with m = n.)
We may wish to improve upon this program: it is rather superfluous to call the operator "split" with p = q: if the section consists of a single element no (significant) rearrangement can take place; the split will be around its single element and both halves will be empty. The relation p < q gives as therefore another necessary criterion of continuation, and we can look to see whether we can make it the sole criterion for continuation. Because we want to stick to p ≤ f ≤ q, the termination via the luck of hitting f with the split, i.e. f = s, has to generate p = f = q. The following program would achieve that result.
integer p, q, s;
p:= 1; q:= N;
while p < q do
begin s:= split(A, p, q);
if f = s then begin p:= f; q:= f end
else if f < s then q:= s - 1 else p:= s + 1
end .
From the above program text it is obvious that the operator "split" only be applied to sections of the array containing at least twa elements.
A more intricate use of the operator "split" is in complete sorting of the array, observing that after application of the operator "split" at least one element (viz. A[s]) has reached its final destination, while all other elements, although not necessarily in their final position, will be in the correct half, so that complete sorting then consists of sorting both halves independently.
The naive approach is a recursive
procedure sort( real array a, integer value p, q);
begin integer s;
s:= split(a, p, q);
if p < s - 1 do sort(a, p, s - 1);
if s + 1 < q do sort(a, s + 1, q)
end
such that the call
sort(A, 1, N)
will sort the entire array. Again it has been exploited that sorting an array section is only a necessary operation if the section contains at least two elements. (The routine "sort" may be called with a section of only one element, but it will not generate such calls itself.)
We have called the above procedure naive and we have done so for the following reasons. The operator "split" may divide the section offered to it into two very inequal parts (e.g. when the originally rightmost element had a near maximum value); as a result the maximum dynamic depth of recursive calls may grow proportionally to N, the length of the array section. As recursive calls require an amount of storage space proportional to the dynamic depth, the given program may turn out to be prohibitively demanding in its storage requirements. This would lead to the conclusion that recursive sorting is impractical, but for the fact that a slight rearrangement of the procedure "sort" ensures that the maximum dynamic depth will not exceed log2N. In view of the existence of such a sorting procedure we call the previous one "naive".
We can guarantee that a sorting routine will not generate a dynamic depth exceeding log2N, if whenever it has called "split", it will only prescribe a recursive call on itself for the sorting of the smallest of the two halves. (In the case that the two halves are of equal length, the choice is immaterial.) Applying "sort" recursively to the smallest half only will leave the other half unsorted, but this can be remedied by repeatedly applying this only half-effective sorting effort to the still unsorted section. In the body of "sort", two integers "pu" and "qu" are introduced, pointing to the left- and right-hand end of the still unsorted section.
procedure sort(real array a, integer value p, q);
begin integer s, pu, qu;
pu:= p; qu:= q;
while pu < qu do
begin s:= split(a, pu, qu);
if qu - s < s - pu then
begin if s + 1 < qu do sort(a, s + 1, qu); qu:= s - 1 end
else
begin if pu < s - 1 do sort(a, pu, s - 1); pu:= s + 1 end
end
end
Again, sort may be called with a section of a single element, but will not generate such calls itself.
Exercise. Prove that termination of the loop is guaranteed to take place with pu = qu. (This is less obvious than you might think!)
Note. If, to start with, the elements of array A are ordered according to non-decreasing magnitude, excessive depth of recursive calls has been prevented, but the algorithm remains time-consuming (proportional to N2). This has given rise to refinements of the procedure "split": instead of blindly taking the right-most element of the array section as splitting value, some sort of small search for a probably better approximation of the median value can be inserted at the beginning of "split"; this element can be interchanged with the rightmost element and thereafter split can continue as described.