NOTE: This transcription was contributed by Martin P.M. van der Burgt. Although its markup is incomplete, we believe it serves a useful purpose by virtue of its searchability and its accessibility to text-reading software. It will be replaced by a fully marked-up version when time permits. —HR
On one-sided smoothing of event sequences.
We consider a finite number of events Ei (i from
0 through N) occurring at moments ti (i > j ⇒ ti > tj).
The problem I found myself faced with was how to
define a continuous function f(t) that could be
interpreted as “the event frequency at moment t.”
(I encountered this problem when trying to discover
a strategy for deciding how much primary store
should be allocated to independent programs in a
multiprogrammed environment when for each program
a “target page fault frequency” had been decided.)
A possible definition of the event frequency is
in terms of the Dirac function:
|
N |
|
f(t) = |
Σ δ (t - ti) | (1)
|
|
i=0 |
|
which has the obvious advantage that it satisfies
|
+∞ |
|
|
∫ f’(t).dt = N + 1 | (2)
|
|
-∞ |
|
but the equally obvious disadvantage of severe
discontinuities, which makes the comparison of f0(t)
and f1(t) — two different frequencies corresponding
to different values of the parameter I could control —
rather difficult. Hence our desire to smooth the
data. The smoothing, however, had to be one-sided
in the sense that the smoothed value f’(t) may
at any moment t only depend on the set of
values t
i with t
i ≤ t. In order to make a
sensible choice among the wealth of possibilities
for f(t) we should impose upon f(t) as many
“sensible” constraints as we can think of. To
start with, property (2) seems a good choice.
The next one is that we should realize that
frequencies —as usually understood— have a
dimension “time
-1”, i.e. if we consider at moments
t’
i = at
i and define the corresponding frequency
f’(t) for that second event sequence, then
should hold. Poperty (3) is not in conflict with
requirement (2), for
+∞ | +∞ | +∞ |
∫ f’(t’).dt’ = | ∫ f’(at).d(at) = | ∫ a.f’(at).dt = |
-∞ | -∞ | -∞ |
| +∞ | |
| ∫ f(t).dt = N + 1 | |
| -∞ | |
Reasonable as requirement (3) may seem it is
(for finite sequences) too much to hope for: if t
0 = 0
the relation (3) will not hold for t
0 ≤ t ≤ t
1, for how
are we going to guess “a”? Therefore we must loosen
it and can only require (3) to hold asymptotically
for t > t
i with sufficiently large i.
Relation (2) however, still stands rather firmly. Because
in (2) we have only used
We could consider functions w
i(t) such that
|
w(t) = 0 for t < 0 | (4)
|
|
∞ |
|
and |
∫ w(t).dt = 1 | (5)
|
|
0 |
|
| N |
and replace (1) by f(t) = | Σ wi (t - ti) |
| i=0 |
| (6) |
where (4) does credit to the one-sidedness of the
required smoothing , (5) guarantees (2) and (6) makes
the whole definition invariant for shifting of the
origin. The question is whether we can choose the w
i
in such a way — satisfying (4) and (5) — as to
approximate (3) at least asymptotically.
A reasonable first guess — corresponding to fading
out of the past in an exponential way — for wi(t) is
with ki > 0:
|
wi(t) = 0 for t < 0 |
|
|
= ki.e-ki.t for t ≥ 0 | (7)
|
where we can try to adapt k
i — on account of past
history, if any — towards a better approximation of (3).
I called (7) a first guess, because we also wanted
a continuous function f(t) and with (6) and (7),
f(t) is discontinuous for any t = t
i. A continuous
f would follow from the proper linear compositum
— proper in view of (4), (5) and w
i(t) = 0 —
|
wi(t) = 0 for t < 0 |
|
|
= (e-ki.t - e-hi.t).(hi.ki)/(hi - ki) for t ≥ 0 | (8)
|
and now we are free in the choice of both h
i and k
i,
provided that they are both positive and different
from each other. I am severely tempted to
restrict myself to
|
hi = c * ki with c ≠ 1 and c > 0 | (9)
|
the reason being that both h
i and k
i are of
dimension “time
-1” and that I cannot forget
my target (3).
* *
*
In order to come to grips with the constant c
I studied (omitting subscripts “i”) according to (8) & (9)
|
w(t) = k . c/(c-1) . (e-kt - e-ckt) |
and tried to determine c so as to minimize the
maximum value of
(I wanted my ripples as smooth as possible.) As this
led to the inacceptable value c = 1, I took the
limit for c → 1 and found — and this is my
final suggestion —
|
w(t) = 0 for t < 0 |
|
|
= ki2t.e-kit for t ≥ 0 | (10)
|
| ∞ |
Also (10) has the property that | ∫ wi(t).dt = 1 |
| 0 |
* *
*
The next thing to decide is, how to choose
the value ki. A constant value for ki seems
in view of (9) out of the question, because ki
has the dimension of a frequency. Apart from
initial difficulties it seems reasonable to choose
ki proportional to f(ti) i.e.
To get some feeling for a reasonable value
for d, we consider f(t) for t≥0 as results
from an
infinite sequence of events that have
occurred at t=0, -1, -2, -3, .... etc; for all passed
events we may assume the same value k
i = k
and we find for f(t), according to (6) and (10)
| ∞ |
f(t) = k. | Σ k.(t+i).e-k.(t+i) |
| i=0 |
which, indeed satisfies f(0) = f(1) as was to
be expected.
Note. I summed the sequence and found
| k2.e-k.t e-k
|
f(t) = | (————) . (t + ————)
|
| 1 - e-k 1 - e-k
|
(End of note)
The functions (10) have all one maximum,
viz. at t = ki-1. We would like to attribute
the maximum of f(t) for 0 ≤ t ≤ 1 to the term
with i = 0 (i.e. the last event). For i > 0 all the
terms should be decreasing functions of t for t > 0.
We can expect the smoothest maximum of f(t)
if it is in the neighbourhood of t = ½ and
this suggests k in the neighbourhood of 2 and
For, because in the average f(t) = 1, we know
that f(0) = f(1) < 1, therefore with d = 2 we
shall find k < 2 and the term with i = 0
will reach its maximum at t
0 > ½. Because
the contribution of the remaining terms is a
decreasing function of t, this will be compensated
and f(t) will find its maximum at t
f < t
0.
Note. Using I sliderule I have investigated the
consequences of the choice d = 2. Salvo errore et
omissione I found for k=1.616, f(0) = f(1) = 0.808
as minimum value for f(t) and f(0.41) = 1.105 as
maximum value. Anyone that would like to rely on
these figures should reestablish them in a more
reliable manner, I only mention these results because
they represent the outcome of an hour’s work and give
some impression of how smooth f(t) becomes
in the case of equally spaced events. If it were
of importance, I would try slightly smaller values of
d as well. (End of note)
19th February 1975 | prof.dr.Edsger W. Dijkstra
|
Plataanstraat 5 | Burroughs Research Fellow
|
NUENEN - 4565 |
|
The Netherlands |
|
Transcribed by Martin P.M. van der Burgt
Last revision
2014-12-08
.