006. Kafka Theory. Consumer Offsets & Delivery Semantics

Consumer Offsets

  • Kafka stores the offsets up to which each consumer group has read, per topic partition
  • Committed offsets are stored in an internal Kafka topic named __consumer_offsets
  • After a consumer in a group has processed data received from Kafka, it should commit the offsets
  • If a consumer dies, it (or another consumer in the group that takes over its partitions) can resume reading from where it left off, thanks to the committed offsets
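The resume behaviour described above can be sketched with a small in-memory model. This is not Kafka client code: LOG, committed and consume are hypothetical names, a plain dict stands in for __consumer_offsets, and a group with no commit is assumed to start at offset 0 (in a real cluster that depends on consumer configuration). As in Kafka, the committed value is the offset of the next message to read.

```python
# In-memory model only: no Kafka client is involved, all names are hypothetical.
LOG = ["m0", "m1", "m2", "m3", "m4"]  # one partition; list index = offset
committed = {}                        # stands in for __consumer_offsets: group -> next offset


def consume(group, max_messages):
    """Read up to max_messages, starting at the group's committed offset."""
    position = committed.get(group, 0)  # assumption: no commit yet means start at 0
    seen = []
    for _ in range(max_messages):
        if position >= len(LOG):
            break
        seen.append(LOG[position])      # "process" the message
        position += 1
        committed[group] = position     # commit the offset of the next message to read
    return seen


first_run = consume("billing", 3)    # a consumer handles three messages, then dies
second_run = consume("billing", 10)  # its replacement resumes from the committed offset
print(first_run, second_run)
```

The second call never sees m0 to m2 again, because the lookup in committed plays the role of the group's stored position.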

Delivery semantics for consumers

  • Consumers choose when to commit offsets
  • There are 3 delivery semantics:
    • At most once:
      • Offsets are committed as soon as the message is received, before it is processed
      • If the processing goes wrong, the message will be lost (it won't be read again)
    • At least once (usually preferred):
      • Offsets are committed after the message is processed
      • If the processing goes wrong, the message will be read again
      • This can result in duplicate processing of messages. Make sure your processing is idempotent (i.e. processing the same message again won't change the state of your systems)
    • Exactly once:
      • Can be achieved for Kafka => Kafka workflows using the Kafka Streams API
      • For Kafka => External System workflows, use an idempotent consumer
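The difference between the semantics comes down to the order of two steps, commit and process, and what happens when a crash lands between them. The sketch below simulates that in memory. It is not Kafka client code: run, the sinks and the crash model (one crash, between the two steps for a chosen message) are all assumptions made for illustration. The idempotent sink keys writes by offset, which is one possible way to make reprocessing harmless.

```python
# In-memory simulation only: no Kafka client is involved, all names are hypothetical.
LOG = ["a", "b", "c"]  # one partition; list index = offset


def run(commit_first, sink, crash_on):
    """Consume LOG, crashing once between the two steps taken for `crash_on`."""
    committed = 0          # offset of the next message to read
    crash_pending = True
    while committed < len(LOG):
        offset = committed             # a restarted consumer resumes here
        message = LOG[offset]
        crash = crash_pending and message == crash_on
        if crash:
            crash_pending = False
        if commit_first:               # at most once: commit, then process
            committed = offset + 1
            if not crash:
                sink(offset, message)  # skipped on crash: the message is lost
        else:                          # at least once: process, then commit
            sink(offset, message)
            if not crash:
                committed = offset + 1  # skipped on crash: the message is re-read


at_most_once = []
run(True, lambda offset, message: at_most_once.append(message), "b")

at_least_once = []
run(False, lambda offset, message: at_least_once.append(message), "b")

idempotent_store = {}


def upsert(offset, message):
    """Idempotent sink: writing the same offset twice leaves one record."""
    idempotent_store[offset] = message


run(False, upsert, "b")

print(at_most_once)      # ['a', 'c']
print(at_least_once)     # ['a', 'b', 'b', 'c']
print(idempotent_store)  # {0: 'a', 1: 'b', 2: 'c'}
```

With an append-only sink, at-least-once delivery shows the duplicate "b". With the offset-keyed sink the same redelivery happens, but the external state ends up as if each message had been applied exactly once.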