Sloppy Quorum And Eventual Consistency
Here is where we stand. Fisher-Lynch-Patterson has shown that consensus is not guaranteed in bounded time in a purely asynchronous network. The CAP theorem shows that from consistency, availability, and partition-resilience, we could only choose two. We have seen systems using ACID transactions and high powered consensus protocols such as Paxos, Viewstamp Replication, and Raft. We have been choosing to stand on the CP side, forgoing availability to ensure consistency under network partition.
After all, isn’t it nice for a read to always see the latest write?
But availability is quite a sacrifice. When you Google something, or check your Twitter NewsFeed, would you rather have some results/tweets show up, albeit a little stale, than be denied service just because you happen to be on the minority side of a network partition? Or when I add an item to shopping cart, I really do not expect it to reject such operation in any case. It is worthwhile to reconsider the tradeoff between availability and consistency under these scenarios.
Classic consensus algorithms serialize all operations at the primary / leader / master / coordinator that maintains the single-copy semantics to preserve consistency. Consistency could be generalized to a multi-coordinator case with NWR quorum where a write quorum intersects with a read quorum, or W + R > N. In the following example, N is 5 and W = R = 3.
We could even optimize read for lower latency by letting R = 2 and W = 4. The nodes might scatter across a wide area, and the less confirmations we have to wait, the sooner we could commit.
We could push further down for lower latency in that R + W < N + 1 where we have only eventual consistency. It is weaker since a write might not be immediately available for subsequent read, but eventually it will, as two quorums might be disjoint. In the next example, Alice first writes a new value to v and then tell Bob to read it, but Bob still observes a stale value.
Eventually, Bob will be able to read the latest write. But how eventual? How long does Bob have to wait? The answer is that we can give promises but we could provide expectations using probability bounded staleness as a way to quantify latency-consistency trade-offs. We define t-visibility as getting consistent reads with probability p after t seconds. Here is some interesting figure from LinkedIn.