Homework 11
Due 4/20/2012 [start of section]
Problem 1
Sun's network file system (NFS) protocol provides reliability via:
- at-most-once semantics
- at-least-once semantics
- two-phase commit
- transactions
Problem 2
Which is the best network on which to implement a remote-memory read
that sends a 100-byte packet from machine A to machine B and then sends an
8000-byte packet from machine B to machine A?
- A network with 200 microsecond overhead, 10 Mbyte/s bandwidth,
20 microsecond latency
- A network with 20 microsecond overhead, 10 Mbyte/s bandwidth, 200
microsecond latency
- A network with 20 microsecond overhead, 1 Mbyte/s bandwidth, 2
microsecond latency
- A network with 2 microsecond overhead, 1 Mbyte/s bandwidth,
20 microsecond latency
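One way to compare such networks is a simple additive cost model, sketched below. It is an assumption rather than part of the problem statement: each message is charged the overhead once, plus its transfer time at the given bandwidth, plus the latency, and 1 Mbyte/s is taken as 10^6 bytes per second. The function names and the example network are hypothetical; the example is deliberately not one of the four options.

```python
def one_way_time_us(size_bytes, overhead_us, bandwidth_mbyte_per_s, latency_us):
    """Microseconds to deliver one packet under a simple additive model."""
    # Assumption: 1 Mbyte/s = 10**6 bytes/s, so one byte takes
    # 1 / bandwidth_mbyte_per_s microseconds to transfer.
    transfer_us = size_bytes / bandwidth_mbyte_per_s
    return overhead_us + transfer_us + latency_us


def remote_read_time_us(request_bytes, reply_bytes, **network):
    """Request from A to B followed by the reply from B to A, with no overlap."""
    return (one_way_time_us(request_bytes, **network)
            + one_way_time_us(reply_bytes, **network))


# Hypothetical network used only to exercise the model.
example = dict(overhead_us=50, bandwidth_mbyte_per_s=5, latency_us=100)
total = remote_read_time_us(100, 8000, **example)
print(total)
```

Evaluating the same function for each option shows which term dominates for a small request followed by a large reply.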
Problem 3
In class, we discussed the fact that, if messages can be lost, it is impossible
to devise an algorithm that guarantees that two nodes can agree to do the
same thing at the same time (the two generals problem).
However, weaker forms of agreement may be possible.
Suppose two nodes, A and B, communicate via messages and that the
probability of receiving any message that is sent is P (0 < P < 1 ). You need
not consider any other types of failures.
- Is it possible for A and B to agree with certainty to perform some action (but
not necessarily perform it at the same time)? If not, explain why not. If so,
describe a protocol that provides this guarantee.
- Is it possible for both nodes to agree to do the same thing at the same time
with >99.99999% certainty (e.g. guarantee that there is less than a 0.00001%
risk that one or both will fail to make the appointment)? If not, explain
why not. If so, describe a protocol that provides this guarantee.
- Suppose that in addition to lost messages, either A or B may crash at any
time and, once crashed, recover at some arbitrary time in the future. Is it
possible for A and B to agree with certainty to perform some action (but not
necessarily perform it at the same time)? If not, explain why not. If so,
describe a protocol that provides this guarantee.
Problem 4
Suppose a server workload consists of
network clients sending 128-byte requests to a server, which reads a
random 50KB chunk from the server's file system and transmits that
50KB to the client. The server's file system is able to cache all
metadata, so that each read consists of a single 50KB sequential read
from a random location on disk. The server may have multiple disks
and multiple network interfaces.
Each disk rotates at 10000 RPM and takes 5
ms on an average random seek. There are on average 300 sectors per
track and each sector is 512 bytes. (In actuality, the number of
sectors per track will vary, but we'll ignore that. We'll also
assume that each request is entirely contained in one track and that
each starts at a random sector location on the track.)
To access disk, the CPU overhead is 30
microseconds to set up a disk access. The disk DMAs data directly
to memory, so there is no CPU per-byte cost for disk accesses.
Each network interface has a bandwidth of
100 Mbits/s (that's Mbits not MBytes!) and there is a 4 millisecond
one-way network latency between a client and the server. The
network interface is full-duplex: it can send and receive at the
same time at full bandwidth. The CPU has an overhead of 100
microseconds to send or receive a network packet. Additionally,
there is a CPU overhead of 0.01 microseconds per byte sent.
- How many requests per second can each disk satisfy?
- How many requests per second can each network interface satisfy?
- How many requests per second can the CPU satisfy (assuming the system has
a sufficient number of disks and network interfaces)?
- What is the latency from when a client begins to send the request until it
receives and processes the last byte of the reply (ignore any queuing
delays)?
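The questions above depend on a few unit conversions that are easy to get wrong. The sketch below covers only those conversions, using the figures given in the problem. The helper names are hypothetical, and it assumes 1 Mbit = 10^6 bits. Whether 50KB means 50,000 or 51,200 bytes is left open here.

```python
def rotation_time_ms(rpm):
    """Time for one full platter revolution, in milliseconds."""
    return 60_000 / rpm


def track_bytes(sectors_per_track, bytes_per_sector):
    """Capacity of one track, in bytes."""
    return sectors_per_track * bytes_per_sector


def mbits_to_bytes_per_s(mbits_per_s):
    """Convert Mbits/s to bytes/s, assuming 1 Mbit = 10**6 bits."""
    return mbits_per_s * 1_000_000 / 8


print(rotation_time_ms(10_000))      # 6.0 ms per revolution
print(track_bytes(300, 512))         # 153600 bytes per track
print(mbits_to_bytes_per_s(100))     # 12500000.0 bytes per second
```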
Problem 5
Consider a distributed system where there is a file server and a number of
client machines. To provide concurrency control, the file system includes a
lock manager that issues locks to client machines upon request. Locks can
be either shared or exclusive. Shared locks are useful only for file reads,
while exclusive locks are needed for file updates. The file server issues a
lock to a given client with a timed lease, such that when the lease expires,
the lock is revoked and the client machine must re-apply to reacquire the lock.
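The locking rules described above can be sketched as follows. This is only an illustration under stated assumptions. The names `Lease`, `can_grant`, and `LEASE_LENGTH` are hypothetical, and the lease length of 10 time units is taken from the scenario given later in this problem. Treating a lease as expired exactly at grant time plus lease length is a modeling choice. Early release, queued requests, and caching are not modeled.

```python
SHARED = "shared"
EXCLUSIVE = "exclusive"
LEASE_LENGTH = 10  # time units; assumed from the scenario in this problem


class Lease:
    """A lock held by one client, valid for LEASE_LENGTH time units."""

    def __init__(self, client, mode, granted_at):
        self.client = client
        self.mode = mode
        self.granted_at = granted_at

    def expired(self, now):
        # Assumption: the lease is no longer valid once `now` reaches
        # granted_at + LEASE_LENGTH.
        return now >= self.granted_at + LEASE_LENGTH


def can_grant(requested_mode, active_leases, now):
    """Return True if the request is compatible with every unexpired lease."""
    live = [lease for lease in active_leases if not lease.expired(now)]
    if not live:
        return True
    # Shared locks coexist only with other shared locks;
    # an exclusive lock coexists with nothing.
    return requested_mode == SHARED and all(l.mode == SHARED for l in live)


leases = [Lease("X", SHARED, 0)]  # "X" is a hypothetical client
print(can_grant(SHARED, leases, 3))
print(can_grant(EXCLUSIVE, leases, 3))
print(can_grant(EXCLUSIVE, leases, 10))
```

The last call shows the effect of expiry: once the only outstanding lease has lapsed, the server can grant a conflicting lock without hearing from the previous holder.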
Answer the following questions:
- Why are leases useful?
- Consider the following scenario in accessing a file F.
Machine | Request time | Request type | Duration until release |
A | 00:00 | Shared | 05 |
B | 00:05 | Shared | 10 |
C | 00:08 | Exclusive | 02 |
D | 00:10 | Shared | 05 |
B | 00:14 | Exclusive | 05 |
A | 00:20 | Shared | 05 |
Assuming that a lease is given for 10 time units, that clients cache the files
for performance, and that coherence is maintained by an update protocol, draw
figures showing the four machines and the file server as blocks (see example
below), identifying at each state transition which client machine holds
which lock, and the state of the cache at each client. A state transition
occurs when the state of the cache changes at one client, when a request is
received, when a lock is acquired, or when a lock is released.
Time: 00:00
Machine A Lock: Shared Cache: File F |
Machine B Lock: None Cache: Empty |
Machine C Lock: None Cache: Empty |
Machine D Lock: None Cache: Empty |
- If an "invalidate" protocol is used for coherence, would the efficiency of
the system increase or decrease? Why?
- Same as (b), but assume that machine C fails 1 time unit after it acquires the
lock. Show the state transition diagrams as instructed in part (b). State
clearly and precisely what precautions should be taken in writing the code
that updates the file at machine C.
Problem 6
Suppose we run the following program, with the code in the first
column running on one machine in a distributed system and the code on
the right running on another machine. The distributed system provides
a set of shared files with some consistency model. Initially A and B
are both 0.
write(A, 1); // Write the value "1" to file A | write(B, 1); // Write the value "1" to file B |
if(read(B) == 0) // read the value from file B | if(read(A) == 0) // read the value from file A |
print "A wins"; | print "B wins"; |
(a) What are the possible outputs assuming the system enforces
linearizability?
(b) For the program described in the previous question, what are the
possible outputs assuming the system enforces causal consistency?
(c) What are the possible outputs for the above program assuming
the system enforces sequential consistency?
Problem 7
Suppose a distributed file system implements linearizable
consistency using callbacks and does not use leases. Suppose that a
client $c1$ that is caching file $F$ becomes disconnected from the network. Which of the
following is true?
- (a) Other clients cannot read file $F$ until the client $c1$ reconnects.
- (b) Other clients cannot write file $F$ until the client $c1$
reconnects.
- (c) Other clients can write file $F$, but once $F$ has been written
by any client, no client can read $F$ until $c1$ reconnects.
- (d) More than one of the above
- (e) None of the above
Problem 8
Suppose programs on three machines are reading and writing
files using a file system that enforces causal
consistency.
Machine 1 runs the following code
int i1 = 0;
while(true){
overwriteFile("/foo", i1);
i1++;
}
The function overwriteFile() replaces previous contents
of the file with the specified value.
Machine 2 runs the following code
while(true){
int i2 = readValueFromFile("/foo");
overwriteFile("/bar", i2);
}
Machine 3 runs the following code
int i2 = readValueFromFile("/bar");
int i1 = readValueFromFile("/foo");
Suppose that machine 3 reads the value "10" on the first read (of
i2). Which
of the following is true of the value that machine 3 reads on the
second read (of i1)? (If multiple items are true, choose the
most precise/restrictive of the options. I.e., i1 < 10 is more
precise/restrictive than i1 ≤ 10.)
- (a) i1 < 10
- (b) i1 ≤ 10
- (c) i1 = 10
- (d) i1 ≥ 10
- (e) i1 > 10
- (f) None of the above or more than one of the above (i.e., none of the above
choices includes all possible values for i1.)