CAP Theorem Explained Simply

The Interview Question That Got Me

“Tell me about the CAP theorem.”

I had heard the term. I said something about consistency and availability. The interviewer nodded slowly. Asked a follow-up. I stumbled.

I did not get the job.

Later I looked it up properly. Read three articles. All of them used database jargon and math symbols. I had to read each one twice.

This post is what I wish those articles had said. In plain English. With examples you actually recognise.



🧱 The Three Words — Explained Simply

C — Consistency

Every read gets the most recent write.

You update a record. Someone else reads it immediately after. They get the updated version. Not the old one. Not a stale copy. The fresh one.

Think of a bank account. You transfer £500. Your friend checks the balance one second later. They see the new balance. Not the old one. That is consistency.

A — Availability

Every request that reaches a working server gets a response. Not an error. Not a timeout.

Maybe the response is slightly old data. Maybe one server is struggling. But the system never refuses to answer. It always responds.

Think of a social media like counter. You press like. The number goes up immediately. Was it exactly accurate in that microsecond? Maybe not. But you got a response. The system did not say “sorry, busy.”

P — Partition Tolerance

The system keeps working even when network messages between servers get lost or delayed.

A partition is when two servers in your cluster cannot talk to each other. Maybe a cable cut. Maybe a network blip. For a few seconds — Server A and Server B are isolated.

Partition tolerance means your system keeps running even when that happens.



⚡ Why You Cannot Have All Three

Here is the real question. Why not? Why can’t a system be consistent, available, AND partition tolerant?

Because network partitions happen. In any distributed system — any system with more than one server — the network between them will sometimes fail. A cable. A router. A cloud hiccup. You cannot prevent it.

So partition tolerance is not really optional. You have to handle partitions. That means you are really choosing between C and A.

When a partition happens — what does your system do?

Option 1: Stay consistent. Refuse writes on any server that cannot confirm them with the others, until the partition heals. Return an error to users. “System unavailable.” Safe data. Unhappy users.

Option 2: Stay available. Keep accepting writes on both sides of the partition. Risk having different data on each server until they sync back up. Happy users. Possibly stale or conflicting data.

You cannot do both. That is the theorem.

In practice — partition tolerance is mandatory. So the real choice is: during a network partition, do you sacrifice Consistency or Availability?
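The two options are easier to see in a toy model. This is a minimal sketch, not how any real database is built: two replicas, a flag that simulates the partition, and a mode switch. The names Node, Unavailable, make_pair and set_partition are made up for this example.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request to protect consistency."""


class Node:
    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP" (labels for this sketch only)
        self.value = None
        self.peer = None
        self.partitioned = False  # True when the peer is unreachable

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # Cannot confirm the write with the peer, so refuse it.
                raise Unavailable(f"{self.name}: cannot reach peer")
            # AP: accept locally and let the replicas diverge for now.
            self.value = value
            return
        # Healthy network: apply the write on both replicas.
        self.value = value
        self.peer.value = value

    def read(self):
        # In CP mode no side accepts writes during a partition,
        # so the last agreed value is still safe to return.
        return self.value


def make_pair(mode):
    a, b = Node("A", mode), Node("B", mode)
    a.peer, b.peer = b, a
    return a, b


def set_partition(a, b, broken):
    a.partitioned = b.partitioned = broken


# AP: both sides keep answering, but they disagree.
a, b = make_pair("AP")
a.write("v1")
set_partition(a, b, True)
a.write("v2")
print(a.read(), b.read())  # v2 v1

# CP: the write is refused, so both sides still agree.
c, d = make_pair("CP")
c.write("v1")
set_partition(c, d, True)
try:
    c.write("v2")
except Unavailable as err:
    print("refused:", err)
print(c.read(), d.read())  # v1 v1
```

Real systems are more subtle. A CP cluster with three or more nodes usually keeps working on whichever side still has a majority. But the shape of the trade is the same.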



🗄️ Real Databases — Which Choose What

CP Systems — Consistent but may go down during partition

MongoDB: Chooses consistency. During a partition, if the primary node cannot reach a majority of the replica set, it steps down and stops accepting writes. Writes fail until a primary that can see a majority is elected. Data is safe. But the system is unavailable for writes for a bit.
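The rule behind this is usually a strict majority: more than half of the voting members. Here is a minimal sketch of that counting rule. It is not MongoDB's election code, and the function names are hypothetical.

```python
def has_majority(reachable_members, total_members):
    """True if this side of the partition holds a strict majority.

    reachable_members counts the node itself plus every member it can reach.
    """
    return reachable_members > total_members // 2


def can_accept_writes(is_primary, reachable_members, total_members):
    # A primary cut off from the majority must refuse writes.
    return is_primary and has_majority(reachable_members, total_members)


# Three-node cluster split 2 / 1: only the side with two nodes may take writes.
print(can_accept_writes(True, 2, 3))  # True
print(can_accept_writes(True, 1, 3))  # False
# Four-node cluster split 2 / 2: neither side has a majority.
print(can_accept_writes(True, 2, 4))  # False
```

This is why clusters tend to have an odd number of members. An even split leaves nobody in charge.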

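Redis: Often listed here, but treat the label with care. In Redis Cluster, a primary cut off from the majority stops accepting writes after a timeout. But replication is asynchronous by default, so a failover can still lose writes that were already acknowledged. It leans CP without giving a strict guarantee.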

HBase: Big data system that has been used by Facebook and others. Reads of a row are strongly consistent. But if the server holding that row fails, the row is unavailable until another server takes over.

AP Systems — Always available but may serve stale data

Cassandra: Leans towards availability. Used by Netflix, Apple, Instagram. With a low consistency level, any reachable replica can accept a write, even during a partition. When the partition heals, nodes sync and reconcile differences.
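What does "reconcile" look like? Cassandra resolves conflicting values by timestamp: the last write wins. Here is a minimal sketch of that idea on plain dictionaries. The reconcile function and the data layout are made up for this example, and the tie-breaking rule is a simplification.

```python
def reconcile(replica_a, replica_b):
    """Merge two replicas; each maps key -> (timestamp, value).

    For every key, keep the entry with the larger timestamp.
    On a tie, this sketch keeps replica_a's entry.
    """
    merged = dict(replica_a)
    for key, (ts, value) in replica_b.items():
        if key not in merged or ts > merged[key][0]:
            merged[key] = (ts, value)
    return merged


# Both sides took writes during the partition.
side_a = {"name": (10, "Ann"), "city": (12, "Leeds")}
side_b = {"name": (15, "Anne"), "plan": (11, "pro")}

print(reconcile(side_a, side_b))
# {'name': (15, 'Anne'), 'city': (12, 'Leeds'), 'plan': (11, 'pro')}
```

Notice what got thrown away: the write that set name to "Ann". Nobody was told. That is the price of staying available.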

DynamoDB: Amazon’s flagship NoSQL database. Highly available. Reads are eventually consistent by default. You can ask for a strongly consistent read, but it costs more, can be slower, and is more likely to fail during a network problem.

CouchDB — Accepts writes everywhere. Conflicts resolved later. Always up.

CA Systems — Consistent and Available but cannot handle partitions

Traditional relational databases: PostgreSQL, MySQL. On a single node they are consistent and available, because there is no network between servers to fail. That is the catch. “CA” only holds while there is no partition. Once you split a MySQL cluster across datacentres, partitions can happen, and you are back to choosing between C and A.

This is why the SQL vs NoSQL choice connects to CAP. Many NoSQL databases, like Cassandra and CouchDB above, were built to be AP systems. To stay available at scale even during network issues. Not all of them, though. MongoDB and HBase are NoSQL and sit on the CP side.



🏢 How Real Companies Think About This

Netflix uses Cassandra — an AP system. When their servers have a network issue — Cassandra keeps serving data. A recommendation might be slightly stale. But the app keeps working. For Netflix — a brief stale recommendation is fine. An outage is not.

Banks use CP systems or CA systems. When you transfer money, both sides must agree. An inconsistency in a bank database is not a minor issue. It is money that appears or disappears. Banks accept that the system might be briefly unavailable rather than risk inconsistent data.

Amazon’s shopping cart is famously AP. You add items to your cart. Even if two servers disagree for a moment, your cart still works. When they sync, the two versions of the cart are merged. An added item is never lost, but an item you deleted can come back. Amazon decided: a deleted item reappearing in your cart is less bad than the cart not working at all.
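Syncing two versions of a cart can be sketched as a merge. This is a minimal illustration of the idea, not Amazon's code. The merge_carts function is hypothetical, and taking the larger quantity is one simple rule among many.

```python
def merge_carts(cart_a, cart_b):
    """Merge two diverged carts; each maps item -> quantity.

    Any item present in either cart survives the merge.
    If both carts hold the item, keep the larger quantity.
    """
    merged = dict(cart_a)
    for item, qty in cart_b.items():
        merged[item] = max(merged.get(item, 0), qty)
    return merged


# The same cart, updated on two servers that could not talk to each other.
on_server_a = {"book": 1, "kettle": 1}
on_server_b = {"book": 1, "socks": 2}

print(merge_carts(on_server_a, on_server_b))
# {'book': 1, 'kettle': 1, 'socks': 2}

# Side effect: an item removed on one server comes back after the merge.
print(merge_carts({"book": 1}, {}))
# {'book': 1}
```

No added item is ever lost. The cost is that a merge like this cannot tell "never added" from "deleted", so a removed item can reappear.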

And this is why understanding CAP matters when you design a system. Are you building a shopping cart or a bank transfer? That question determines your database choice.

The read replicas pattern is a direct consequence of CAP thinking. You are choosing to spread reads across replicas — accepting that replicas might be slightly behind the primary — in exchange for better availability and performance.
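Here is a minimal sketch of what "slightly behind" means. The primary keeps an ordered log of writes, and a replica applies that log when it gets the chance. The Primary and Replica classes are made up for this example. Real replication streams changes over the network.

```python
class Primary:
    def __init__(self):
        self.log = []  # ordered list of (key, value) writes

    def write(self, key, value):
        self.log.append((key, value))


class Replica:
    def __init__(self, primary):
        self.primary = primary
        self.applied = 0  # how many log entries have been applied so far
        self.data = {}

    def catch_up(self):
        """Apply every log entry this replica has not seen yet."""
        pending = self.primary.log[self.applied:]
        for key, value in pending:
            self.data[key] = value
        self.applied += len(pending)

    def lag(self):
        """Number of writes the replica is behind the primary."""
        return len(self.primary.log) - self.applied

    def read(self, key):
        return self.data.get(key)


primary = Primary()
replica = Replica(primary)

primary.write("likes", 1)
replica.catch_up()
primary.write("likes", 2)     # the replica has not seen this yet

print(replica.read("likes"))  # 1
print(replica.lag())          # 1

replica.catch_up()
print(replica.read("likes"))  # 2
```

The read that returned 1 was fast and always answered. It was also out of date. That is the trade you sign up for with read replicas.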



🤔 PACELC — The Extension Nobody Mentions

Once you understand CAP — there is one more idea worth knowing. PACELC.

It says: even when there is no partition — you still have a trade-off. Between latency and consistency.

A strongly consistent database is slower. It has to wait for enough replicas (often a majority) to confirm before responding. A faster database is slightly less consistent: it responds before the other replicas confirm.

So the full picture is:

  • During a Partition — choose between Availability and Consistency (CAP)
  • Else (normal operation) — choose between Latency and Consistency (ELC)

Most modern databases let you tune this. DynamoDB lets you choose strong consistency vs eventual consistency per read request. Cassandra lets you set a consistency level per query. You are not stuck with one mode.
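The tuning usually comes down to one piece of arithmetic. With N replicas, a write waits for W of them and a read asks R of them. If R + W > N, every read overlaps with the latest write on at least one replica. Here is a minimal sketch that checks the rule by brute force. The function names are hypothetical, and it only models the counting argument, not failures or repairs.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """The rule of thumb: reads see the latest write when r + w > n."""
    return r + w > n


def always_intersect(n, w, r):
    """Brute force: does every group of r replicas share at least one
    member with every group of w replicas?"""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


# Three replicas, as in a typical replication factor of 3.
for w, r in [(2, 2), (1, 1), (3, 1)]:
    print(w, r, quorums_overlap(3, w, r), always_intersect(3, w, r))
# 2 2 True True
# 1 1 False False
# 3 1 True True
```

Writing to 2 and reading from 2 is the classic "quorum" setting: consistent, but each request waits for two replicas. Writing to 1 and reading from 1 is the fast setting: lower latency, possibly stale reads. Same database. Different point on the dial.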



The One Question That Makes This Simple

Forget the theory for a second.

Ask yourself this about your system:

“What is worse for my users — wrong data or no response?”

If wrong data is worse — choose CP. Be consistent. Accept that sometimes the system will be unavailable.

If no response is worse — choose AP. Stay available. Accept that sometimes data might be slightly stale.

That is the CAP theorem. Everything else is detail.


Numbers and examples in this post are based on public data as of 2025.