Leader Epoch - srivalligade04/ConfluentExamPreparationNotes GitHub Wiki

In Apache Kafka, the leader epoch checkpoint is a mechanism used to track the history of leadership changes for each partition. It plays a crucial role in ensuring data consistency and replication correctness across brokers.

What is a Leader Epoch?

A leader epoch is a monotonically increasing number that changes every time a new leader is elected for a partition. It helps Kafka distinguish between different generations of leaders.

What is the Leader Epoch Checkpoint File?

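  • Kafka stores leader epoch information in a file named leader-epoch-checkpoint inside each partition's directory under the broker's log directory, for example: /tmp/kafka-logs/<topic>-<partition>/leader-epoch-checkpoint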
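  • The /tmp/kafka-logs prefix is the broker's log directory (set by log.dirs); the checkpoint file sits inside the individual partition's directory, not at the top level of the log directory.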
  • This file contains entries like:
      0 0
      1 100
      2 200
  • Each line represents: <epoch> <start_offset>
  • Epoch: The leader epoch number.
  • Start offset: The first offset written under this epoch.
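
The entry format above can be read with a few lines of Python. This is only a minimal sketch: the names EpochEntry, parse_epoch_entries and epoch_for_offset are hypothetical helpers, not part of Kafka, and the input is assumed to contain only "<epoch> <start_offset>" lines like the sample entries. Real checkpoint files typically begin with header lines (a version and an entry count), which this sketch assumes have already been removed.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class EpochEntry(NamedTuple):
    epoch: int          # leader epoch number
    start_offset: int   # first offset written under that epoch


def parse_epoch_entries(text: str) -> List[EpochEntry]:
    """Parse lines of the form '<epoch> <start_offset>'.

    Assumption: `text` holds only entry lines. Blank lines are skipped;
    a line that is not exactly two integers raises ValueError.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        epoch_str, offset_str = line.split()
        entries.append(EpochEntry(int(epoch_str), int(offset_str)))
    return entries


def epoch_for_offset(entries: List[EpochEntry], offset: int) -> Optional[int]:
    """Return the epoch whose range contains `offset`, or None if the
    offset is below the first recorded start offset."""
    starts = [e.start_offset for e in entries]
    idx = bisect_right(starts, offset) - 1
    return entries[idx].epoch if idx >= 0 else None


sample = "0 0\n1 100\n2 200\n"
entries = parse_epoch_entries(sample)
print(entries[-1])                    # EpochEntry(epoch=2, start_offset=200)
print(epoch_for_offset(entries, 150)) # 1
```

Because the entries are ordered by start offset, a binary search is enough to tell which leader generation wrote a given offset.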

Why Is It Important?

Replica Synchronization

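Followers use the recorded epochs to check whether their log still matches the current leader's. After a restart or a leader change, a follower asks the leader where the follower's own latest epoch ended and compares that offset with its local log.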

Truncation Logic

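If a follower has data from a stale epoch that the current leader does not have, it truncates its log back to the offset where that epoch ends on the leader, so that it matches the leader’s state.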
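
The truncation decision can be illustrated with a simplified Python sketch. It is not Kafka's actual implementation: the function names are hypothetical, epoch histories are modelled as plain (epoch, start_offset) lists, and the sketch assumes the leader knows the follower's latest epoch. The idea is that an epoch ends on the leader where the next higher epoch starts, or at the leader's log end offset if it is the leader's latest epoch.

```python
from typing import List, Tuple

# (epoch, start_offset) pairs, oldest first, as in the checkpoint file.
EpochHistory = List[Tuple[int, int]]


def leader_end_offset(leader_history: EpochHistory,
                      leader_log_end: int,
                      requested_epoch: int) -> int:
    """End offset of `requested_epoch` as seen by the leader: the start
    offset of the first later epoch, or the leader's log end offset if
    no later epoch exists."""
    for epoch, start_offset in leader_history:
        if epoch > requested_epoch:
            return start_offset
    return leader_log_end


def truncation_offset(follower_latest_epoch: int,
                      follower_log_end: int,
                      leader_history: EpochHistory,
                      leader_log_end: int) -> int:
    """Offset the follower should truncate to before fetching again.
    If the result equals follower_log_end, nothing is removed."""
    end = leader_end_offset(leader_history, leader_log_end,
                            follower_latest_epoch)
    return min(follower_log_end, end)


# Hypothetical scenario: the follower led epoch 1 and wrote up to offset
# 230, but a new leader took over at epoch 2 starting from offset 200.
leader_history = [(0, 0), (1, 100), (2, 200)]
print(truncation_offset(1, 230, leader_history, leader_log_end=250))  # 200
```

In this scenario offsets 200 to 229 on the follower were written under epoch 1 but never existed on the new leader, so they are discarded before the follower resumes fetching from offset 200.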

Avoiding Data Divergence

This ensures that replicas don’t continue from an outdated state, which could lead to data inconsistency.

Related Configurations

  • Kafka handles this internally; you typically don’t need to modify it.
  • However, understanding it is useful for debugging replication issues or log divergence.