Building MAD Distributed Systems
The goal of this project is to develop the theory and practice
required for building cooperative services that span multiple
administrative domains (MADs).
MAD systems are attractive because
their diffused control structure may yield services that are
potentially less costly and more democratic than their more
centralized counterparts. Unfortunately, they are also particularly
problematic from a dependability standpoint as they challenge the
traditional distinction between correct and faulty nodes.
Nodes in a MAD system can, as always, deviate from their specification
because they are broken, on account of bugs, errors in software
configuration, or even malicious attacks. But MAD systems add a new
dimension: without a central administrator to ensure that all unbroken
nodes follow faithfully their assigned protocol, nodes may deviate
from their specification also because they are selfish and are intent
on maximizing their own utility.
Byzantine Fault Tolerance (BFT) handles the first class of deviations
well. However, the Byzantine model classifies all deviations as
faults and requires a bound on the number of faults in the system;
this bound is not tenable in MAD systems where all nodes may benefit
from selfish behavior and be motivated to deviate from the protocol.
Models based on traditional game theory only account for rational
behavior and are therefore brittle: they handle the second class of
selfish deviations, but may be vulnerable to arbitrary disruptions if
even a single node is broken and deviates from expected rational
behavior.
The challenge in developing a solid foundation for constructing MAD
services is then (at least) threefold: (1) to develop a model for MAD
services in which it is possible to reason and prove properties of MAD
services; (2) to understand how to simplify the development of MAD
services under the new model, (3) to demonstrate that MAD services
developed under this model can be practical by building and deploying
useful applications.
Our first steps towards addressing these issues appear in
- A.S. Ayer, L. Alvisi, A. Clement, M. Dahlin, J.P. Martin, and
C. Porth.
BAR Fault Tolerance for
Cooperative Services.
In Proceedings of the
20th ACM Symposium on Operating Systems Principles (SOSP 2005),
Brighton, United Kingdom, October 2005, pp. 45-58. Award Paper.
where we introduce the BAR model, named after the initials of the
three classes of nodes (Byzantine, Altruistic, and Rational) that it
explicitly considers. Byzantine nodes can deviate arbitrarily from
their specification, even if doing so is against their interest.
Altruistic nodes follow their specification faithfully, without
consideration of their self interest. Rational nodes behave selfishly
and deviate from a given protocol if doing so improves their own
utility. The paper illustrates the construction of a BAR peer-to-peer
backup service at whose core is a BAR-tolerant asynchronous replicated
state machine protocol that uses Rational and Byzantine nodes to
implement the abstraction of an Altruistic node.
To further demonstrate the practicality of BAR, the following two
OSDI papers:
- H. Li, A. Clement, E. Wong, J. Napper, L. Alvisi and M. Dahlin
BAR Gossip
In Proceedings of the 7th
Symposium on Operating System Design and Implementation (OSDI '06)
,
Seattle, WA, November 2006, pp. 191-204.
-
H. Li, A. Clement, M. Marchetti, E. Kapritsos,
L. Robison, L. Alvisi, and M. Dahlin
FlightPath: Obedience vs. Choice in Cooperative Services
In Proceedings of the 8th Symposium
on Operating System Design and Implementation (OSDI '08) ,
San
Diego, CA, December 2008, pp. 355-368.
investigate the design tradeoffs involved in the design and
implementation of a BAR-tolerant data streaming service.
The earlier paper uses a BAR-tolerant gossip protocol to achieve the first
peer-to-peer streaming media application that ensures high,
predictable throughput even when most or all of the nodes act
selfishly and the remainder act maliciously or malfunction in
arbitrary ways. The broader significance of this result consists in
demonstrating the possibility of designing efficient BAR-tolerant versions of
primitives, such as gossip, whose inner working relies heavily on
randomness. Randomness, as any source of non-determinism, is hard to
deal with in the BAR model because it gives opportunities for rational
users to hide selfish actions in the guise of legitimate,
non-deterministic behavior.
This page is being updated. Please come back soon for a discussion
of our more recent results.
Relevant Publications:
-
E. Wong, and L. Alvisi
What's a Little Collusion Between Friends?
In Proceedings of the 32nd ACM Symposium on Principles of
Distributed COmputing (PODC 2013),
Monteral, Canada, July 2013.
-
E. Wong, and L. Alvisi
Regret Freedom isn't Free
In Proceedings of the 15th International Conference on
Distributed Systems (OPODIS '11),
Toulouse, France, October 2011.
-
I. Abraham, L. Alvisi, and J. Halpern
Distributed Computing meets Game Theory: combining insights from two fields
In SIGACT News, 42(2),June 2011..
-
E. Wong, L. Alvisi, and J. Leners
It's
On Me! The Benefit of Altruism in BAR Environments
In Proceedings of the 24th International Symposium on
Distributed Computing (DISC 2010),
Boston, MA, September 2010.
An extended version can be found in:
E. L. Wong and L. Alvisi
It's
On Me! The Benefit of Altruism in BAR Environments
UT Austin Technical Report TR-10-08.
-
F. Mari, I. Melatti, I. Salvo, E. Tronci, L. Alvisi, A. Clement,
and H. Li
Model
Checking Coalition Nash Equilibria in MAD Distributed
Systems
In Proceedings of the 11th International
Symposium on Stabilization, Safety, and Security of Distributed
Systems (SSS 2009) ,
Lyon, France, November 2009, pp. 531-546.
-
H. Li, A. Clement, M. Marchetti, E. Kapritsos,
L. Robison, L. Alvisi, and M. Dahlin
FlightPath: Obedience vs. Choice in Cooperative Services
In Proceedings of the 8th Symposium
on Operating System Design and Implementation (OSDI '08) ,
San
Diego, CA, December 2008, pp. 355-368.
-
F. Mari,
I. Melatti, I. Salvo, E. Tronci, A. Clement, and H. Li
Model Checking Nash
Equilibria in MAD Distributed Systems
In
Proceedings of the 8th Conference on Formal Methods in Computer-Aided
Desing (FMCAD 08),
Portland, OR, November 2008, pp. 1-8.
-
A. Clement, H. Li, J. Napper,
J. P. Martin, L. Alvisi, and M. Dahlin
BAR Primer
In Proceedings of the International
Conference on Dependable Systems and Networks (DSN 2008), DCC
Symposium,
Anchorage, AK, June 2008, pp. 287-296.
-
H. Li, A. Clement, E. Wong,J. Napper, and M. Dahlin
BAR Gossip
In Proceedings of the 7th
Symposium on Operating System Design and Implementation (OSDI '06)
,
Seattle, WA, November 2006, pp. 191-204.
-
A.S. Ayer, L. Alvisi, A. Clement,
M. Dahlin, J.P. Martin, and C. Porth
BAR Fault Tolerance for
Cooperative Services.
In Proceedings of the
20th ACM Symposium on Operating Systems Principles (SOSP 2005),
Brighton, United Kingdom, October 2005, pp. 45-58. Award Paper.
Faculty
Students
Alumni
External collaborators
Sponsors
This material is based upon work supported by:
- The National Science Foundation under current Grant No. 0905625 and
under expired Grant No. 0509338.
Any opinions, findings, and conclusions or recommendations expressed
in this material are those of the author(s)
and do not necessarily
reflect the views of the National Science Foundation (NSF).
Back to Lorenzo
Alvisi's Home page.