SQLblog.com - The SQL Server blog spot on the web

Linchi Shea

Checking out SQL Server via empirical data points

The CAP Theorem

I always like simple frameworks that help put some order into chaos.

 

Last month while in Seattle, I ran into something that for whatever reason had managed to escape my radar screen. If there was a single thing that made the trip to Seattle worthwhile, that was it.

 

I’m talking about the CAP Theorem. I'm blogging about it here because after talking to a few people, I realized that not enough database people actually knew about it.

 

For any distributed system that shares data, there are three desirable properties: Consistency, Availability, and Partition tolerance (i.e. CAP). The CAP Theorem states that, out of these three desirable properties, you can have at most two simultaneously. The practical implication is that you must make trade-offs when building a system that provides distributed access to data.

 

The consistency property means that any part of the system will provide exactly correct data when responding to a request, as if the request were processed on a single node. The availability property means that the system is ‘online’ and a client of the system can expect to receive a response to its request. And partition tolerance means that the system should continue to function when the network is partitioned or disrupted (e.g. some nodes are offline or communication links are down).

 

So a scaled-up database at a single site, including one in a local failover cluster, has consistency and availability, but it’s meaningless to talk about network partition tolerance. With a distributed database that uses distributed locking, you can guarantee consistent data even when the network is partitioned, but you have to sacrifice availability: during the partition some update requests must be refused, because if you continued to allow the same data to be updated on different nodes, clients would receive inconsistent data.
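To make the trade-off concrete, here is a minimal toy sketch in Python. It is only an illustration, not how SQL Server or any real distributed database works: the `Cluster` class, its "CP"/"AP" modes, the `partitioned` flag, and the `Unavailable` exception are all hypothetical names made up for this example. It models two replicas of one key-value store and shows what each choice costs when the link between them is down.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to protect consistency."""


class Cluster:
    """Toy model: two replicas ('a' and 'b') of one key-value store.

    mode 'CP' keeps the replicas consistent and refuses updates during a partition.
    mode 'AP' keeps accepting updates during a partition and lets replicas diverge.
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = {"a": {}, "b": {}}
        self.partitioned = False  # True while the replicas cannot talk to each other

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the update.
            for store in self.replicas.values():
                store[key] = value
            return
        if self.mode == "CP":
            # The peer cannot be reached to coordinate, so give up availability.
            raise Unavailable(f"node {node!r} cannot reach its peer")
        # AP: stay available, but only the local replica sees the update.
        self.replicas[node][key] = value

    def read(self, node, key):
        # Reads are always served from the local replica in this toy model.
        return self.replicas[node].get(key)


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        cluster = Cluster(mode)
        cluster.write("a", "balance", 100)
        cluster.partitioned = True
        try:
            cluster.write("a", "balance", 50)
            cluster.write("b", "balance", 75)
        except Unavailable as exc:
            print(mode, "refused an update:", exc)
        print(mode, "reads:", cluster.read("a", "balance"), cluster.read("b", "balance"))
```

In "CP" mode both replicas keep returning the same value but updates are rejected while the partition lasts; in "AP" mode every request is served but the two replicas can return different values for the same key.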

 

It’s interesting to apply the CAP Theorem to various familiar systems and scenarios. But instead of my repeating it, you can read about the CAP Theorem yourself. Here are two references, and more can be found when you google for “CAP Theorem”.

 

  1. Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services
  2. Eric Brewer’s presentation
Published Wednesday, December 17, 2008 8:11 PM by Linchi Shea


Comments

 

Dave Jermy said:

How would this apply to databases in the cloud?

December 23, 2008 5:33 AM
 

anonymouse said:

Well, "in the cloud" you are less likely to scale-up, so you have to scale-out. The CAP theorem says "scale-out" is *DIFFERENT* from scale-up.  In scale-out, you are going to get partitions, so you have to choose which one to give up: availability (don't serve requests without a majority) or consistency (serve stale data).

This is different from a single-node database, where consistency is so trivial it's "assumed".

August 1, 2009 6:38 PM
 

Nathan Fiedler said:

It's worth noting that when designing a distributed storage system you do not have to entirely give up one of the properties to improve the other two (e.g. give up all hope of consistency to achieve high availability and partition tolerance). Instead, relaxing the consistency model can yield significant improvements in availability and partition tolerance, as seen in Amazon's Dynamo (http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html) and its "eventual consistency."

December 5, 2009 7:18 PM
