UTCS Professor Vijay Chidambaram and undergraduate researcher Eric Lee co-authored a paper titled "Protocol-aware Recovery for Consensus-based Storage," which won the best paper prize at the 2018 USENIX Conference on File and Storage Technologies (FAST).
Their work, done jointly with researchers at the University of Wisconsin-Madison, shows how data stores in distributed systems can be made more resilient to failures.
Describing the work in informal terms, Chidambaram noted that large distributed systems have many machines that crash and recover constantly, so ensuring correct recovery is challenging. Although distributed systems are adept at recovering from server crashes, the team found that many systems do not recover properly from data corruption; in fact, under some failure conditions, corrupted data from a single node spreads to the entire system.
Their new approach, which they call "Protocol-Aware Recovery," makes distributed systems more reliable without a significant decrease in performance, permitting data to be recovered correctly even if machines in the distributed system crash or their storage becomes corrupted.