CAP - rFronteddu/general_wiki GitHub Wiki

The CAP Theorem is a fundamental theorem in distributed systems that states any distributed system can have at most two of the following three properties.

  • Consistency: any read operation that begins after a write operation completes must return that value, or the result of a later write operation
  • Availability: every request received by a non-failing node in the system must result in a response
  • Partition tolerance: the network will be allowed to lose arbitrarily many messages sent from one node to another

The CAP theorem states that a distributed system cannot simultaneously be consistent, available, and partition tolerant.

In a consistent system, once a client writes a value to any server and gets a response, it expects to get that value (or a fresher value) back from any server it reads from.

In an available system, if our client sends a request to a server and the server has not crashed, then the server must eventually respond to the client. The server is not allowed to ignore the client's requests.

Our system has to be able to function correctly despite arbitrary network partitions in order to be partition tolerant.

Proof

  • Assume for contradiction that there does exist a system that is consistent, available, and partition tolerant. The first thing we do is partition our system. It looks like this.
  • Next, we have our client request that $v_1$ be written to $G_1$. Since our system is available, $G_1$ must respond. Since the network is partitioned, however, $G_1$ cannot replicate its data to $G_2$. Gilbert and Lynch call this phase of execution $\alpha_1$.
  • Next, we have our client issue a read request to $G_2$. Again, since our system is available, $G_2$ must respond. And since the network is partitioned, $G_2$ cannot update its value from $G_1$. It returns $v_0$. Gilbert and Lynch call this phase of execution $\alpha_2$.
  • $G_2 returns $v_0$ to our client after the client had already written $v_1$ to $G_1$. This is inconsistent.

We assumed a consistent, available, partition tolerant system existed, but we just showed that there exists an execution for any such system in which the system acts inconsistently. Thus, no such system exists.

The CAP theorem states that it is impossible for a distributed software system to simultaneously provide more than two out of three of the following guarantees:

  • CONSISTENCY: All nodes see the same data at the same time. Achieved by updating several nodes before allowing further reads.
  • AVAILABILITY: Every request gets a response on success/failure. Achieved by replicating the data across different servers.
  • PARITITIONING: The system continues to work despite message loss or partial failure. A system is partition-tolerant if it can sustain any amount of network failure that doesn't result in a failure of the entire network.

We cannot build a general data store that is continually available, sequentially consistent, and tolerant to any partition failures. We can only build a system that has any two of these three properties.

  • To be consistent, all nodes should see the same set of updates in the same order but in case of partition, updates in one partition may not make it to the other partitions before a client reads stale data. The only thing that can be done is to not serve request from the out-of-date partition but then the service is no longer 100% available.

  • Availability + Consistency: RDBMS

  • Availability + Partition Tolerance: Cassandra/CouchDB

  • Partition Tolerance + Consistency: BigTable, MongoDB, HBase.