Abstract:
The overall correctness of large-scale systems composed of many groups
of replicas executing BFT protocols scales poorly with the number of
groups. This is because the probability of at least one group being
compromised (more than 1/3 faulty replicas) increases rapidly as the
number of groups increases. We address this problem with a simple
modification to Castro and Liskovs BFT replication that allows for
arbitrary choice of n (number of replicas) and f (failure
threshold). The modified BFT protocol does not lose safety even when
failures are many, performs very well when failures are few, but may
lose liveness in the intermediate cases. This features seem to address
realistic failure models which account for software errors or security
attacks. The price to pay is a more restrictive liveness requirement,
and we present the design of a large-scale BFT replicated system that
obviates this problem.
|