🔄 Consensus Protocol in Distributed Systems

Understanding how distributed nodes reach agreement


1. What is a Distributed System?

A distributed system is a collection of independent computers (nodes) that work together as a single system. Examples include databases like Google Spanner or blockchain networks like Bitcoin and Ethereum.

Since these nodes communicate over potentially unreliable networks, keeping them synchronized is challenging.

2. What is Consensus?

Consensus means agreement. In distributed systems, it refers to getting all nodes to agree on a single value or state, even if some nodes fail or send incorrect information.

Goal: All non-faulty nodes should eventually agree on the same result.

Example: Multiple database servers must agree on the order of transactions or the identity of the leader node.

3. What is a Consensus Protocol?

A consensus protocol is an algorithm that ensures nodes agree on a common value (such as a transaction order, leader, or block) even in the presence of failures.

| Protocol | Main Use | Key Feature |
| --- | --- | --- |
| Paxos | Databases, distributed coordination | High fault tolerance |
| Raft | Cluster management (e.g., etcd, Kubernetes) | Simpler to understand and implement than Paxos |
| PBFT (Practical Byzantine Fault Tolerance) | Blockchain systems | Tolerates malicious (Byzantine) nodes |
| Proof of Work / Proof of Stake | Blockchains | Achieve consensus over open, untrusted networks |

4. Why is Consensus Needed?

Let's make it intuitive with an example: imagine two database servers that each hold a copy of the same account balance and receive updates concurrently.

If they update separately, their balances may diverge. Consensus ensures all servers agree on the same operation order, maintaining consistency.
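The divergence problem can be sketched in a few lines. This is a toy illustration (not a real consensus protocol): two replicas apply the same two operations but in different orders and end up with different balances, while a single agreed order keeps them identical. The operation names and amounts are made up for the example.

```python
def apply_ops(balance, ops):
    """Apply a sequence of (operation, amount) updates to a balance."""
    for op, amount in ops:
        if op == "deposit":
            balance += amount
        elif op == "set":  # e.g. an overwrite from a recalculation
            balance = amount
    return balance

ops = [("deposit", 100), ("set", 50)]

# Without an agreed order, each server applies the operations its own way.
server_a = apply_ops(0, ops)                  # deposit, then set  -> 50
server_b = apply_ops(0, list(reversed(ops)))  # set, then deposit  -> 150
assert server_a != server_b  # the replicas have diverged

# With consensus, every server applies the single agreed order.
assert apply_ops(0, ops) == apply_ops(0, ops) == 50
```

Note that the operations here do not commute, which is exactly why agreeing on the order (not just the set of operations) matters.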

Consensus provides:

- Consistency: all non-faulty nodes agree on the same value or state.
- Fault tolerance: the system keeps working even if some nodes fail.
- Ordering: operations are applied in a single agreed sequence.

5. Real-world Example: Raft in Kubernetes / etcd

System: Kubernetes uses etcd (a distributed key-value store) for storing cluster state.
Consensus Protocol: Raft

How it works:

  1. etcd runs on multiple nodes (e.g., 3 or 5).
  2. One node is elected as the leader.
  3. Configuration changes go to the leader.
  4. The leader replicates changes to followers.
  5. Once a majority acknowledges, the change is committed.
  6. If a node fails, the system remains consistent and operational.
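Step 5 above hinges on the majority rule, which can be sketched as a one-line check. This is a simplified illustration of the commit condition, not etcd's actual implementation; the function name and parameters are chosen for the example, and `acks` counts the leader itself plus acknowledging followers.

```python
def is_committed(acks, cluster_size):
    """An entry is committed once a strict majority of nodes holds it."""
    majority = cluster_size // 2 + 1
    return acks >= majority

# A 5-node etcd cluster needs 3 copies of an entry to commit it.
assert not is_committed(acks=2, cluster_size=5)  # leader + 1 follower: not enough
assert is_committed(acks=3, cluster_size=5)      # leader + 2 followers: committed
```

Requiring a majority is what makes the commit durable: any future majority must overlap with the one that acknowledged the entry, so a committed entry can never be lost by a later leader.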
Raft Consensus Protocol Diagram

Quick Recap

| Concept | Meaning |
| --- | --- |
| Consensus | Agreement among distributed nodes on a single value or state |
| Why Needed | Ensures data consistency and fault tolerance |
| Common Protocols | Paxos, Raft, PBFT, PoW/PoS |
| Example System | etcd (used by Kubernetes) uses Raft for consistent cluster state |

6. Raft Leader Failure Handling

Let's see what happens when the leader fails in Raft; this behavior is key to its reliability.

Step 1: Leader Fails

Imagine 5 nodes: A, B, C, D, E. B is the leader. If B crashes or loses network connectivity, followers stop receiving heartbeats from B.

Step 2: Detecting the Failure

Each follower expects regular heartbeats from the leader. If a follower stops hearing from it for a randomized timeout period (e.g., 150–300 ms), that follower (say C) assumes the leader is dead and becomes a candidate. As a candidate, C:

  1. Increments its term number (a logical clock).
  2. Requests votes from other nodes.
  3. If it receives a majority of votes, it becomes the new leader.
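The three election steps above can be condensed into a small sketch. This is a deliberately simplified model (single candidate, votes already counted), not a full Raft implementation; the function name and parameters are invented for illustration.

```python
def start_election(current_term, votes_granted, cluster_size):
    """A candidate bumps its term and wins on a strict majority of votes."""
    new_term = current_term + 1       # step 1: increment the logical clock
    votes = 1 + votes_granted         # step 2: vote for itself + granted votes
    won = votes > cluster_size // 2   # step 3: strict majority wins the term
    return new_term, won

# Node C times out during term 4 of a 5-node cluster and gets 2 more votes:
term, won = start_election(current_term=4, votes_granted=2, cluster_size=5)
assert (term, won) == (5, True)
```

The randomized election timeout mentioned above is what makes a lone candidate likely: nodes time out at different moments, so usually one candidate asks for votes before the others.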

Step 3: Resuming Operations

Once C is elected leader, it begins sending heartbeats to the other nodes and starts accepting client requests. Followers synchronize their logs with the new leader: uncommitted entries from the failed leader's term are discarded, while committed entries are retained.

Step 4: Fault Tolerance

Even if 1 or 2 nodes fail, the cluster continues to operate normally. Raft only needs a majority to function: in a 5-node cluster, 3 working nodes are enough.
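The fault-tolerance bound follows directly from the majority rule and is worth checking with a quick calculation. This helper is illustrative (not part of any Raft library): a cluster of n nodes needs n // 2 + 1 nodes for a quorum, so it can lose the rest.

```python
def max_failures(n):
    """Failures an n-node Raft cluster survives while keeping a quorum."""
    quorum = n // 2 + 1
    return n - quorum  # equivalently (n - 1) // 2

assert max_failures(3) == 1  # 3-node cluster survives 1 failure
assert max_failures(5) == 2  # 5-node cluster survives 2 failures
assert max_failures(4) == 1  # even sizes add no extra tolerance
```

This is why etcd clusters are typically deployed with an odd number of members: a 4-node cluster tolerates no more failures than a 3-node one but adds replication cost.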

Raft Leader Failure and Recovery Diagram

Leader Failure Handling Summary

| Stage | Description |
| --- | --- |
| Failure Detection | Followers stop receiving heartbeats from the leader |
| Leader Election | A follower becomes a candidate and requests votes |
| New Leader Elected | Majority votes decide the new leader |
| Log Synchronization | Uncommitted logs are removed; committed entries retained |
| System Recovery | Cluster resumes normal operation |