Kafka

  • Distributed event streaming platform.
  • Distributed commit log.
  • Publish/Subscribe messaging system.

Architecture

Producer ---write---> Kafka Cluster ---read---> Consumer

Components

Cluster

  • Consists of one or more brokers (servers).

Topic

  • A named stream of messages in Kafka, stored as an immutable, append-only commit log.
  • Broken down into a number of partitions.
  • Loosely similar to a table in an RDBMS.

Partition

  • Each broker stores one or more partitions.
  • A partition can be replicated across brokers for fault tolerance.

Message

  • A unit of data within Kafka.
  • Consists of a key/value pair; the key is optional.
  • The key is used to select the partition to write to. Messages with the same key are written to the same partition (as long as the number of partitions does not change).
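
The key-to-partition mapping can be sketched as a hash of the key modulo the partition count. This is a minimal illustration, not Kafka's actual partitioner: `choose_partition` is a hypothetical helper, and `zlib.crc32` stands in for the murmur2 hash used by the Java client's default partitioner.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (illustrative only)."""
    # crc32 is an assumed stand-in for a stable hash function;
    # Kafka's own partitioner uses a different hash.
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition for a fixed partition count.
p1 = choose_partition(b"user-42", 6)
p2 = choose_partition(b"user-42", 6)
print(p1 == p2)
```

Because the result depends on `num_partitions`, adding partitions to a topic can send a key to a different partition than before.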

Producer

  • Creates messages and writes them to a topic.
  • By default the producer does not care which partition a message is written to: when a message has no key, it balances messages roughly evenly over all partitions of the topic.
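
One simple way to picture the even balancing of keyless messages is a round-robin counter over the partitions. This is a sketch under that assumption; `RoundRobinPartitioner` is a hypothetical class, and real clients may use other strategies (recent Java clients, for example, stick to one partition per batch and rotate between batches).

```python
from collections import Counter
from itertools import cycle


class RoundRobinPartitioner:
    """Hands out partition indexes in rotation (illustrative only)."""

    def __init__(self, num_partitions: int):
        self._next = cycle(range(num_partitions))

    def partition(self) -> int:
        return next(self._next)


partitioner = RoundRobinPartitioner(3)
# Count how many of 9 keyless messages each partition would receive.
counts = Counter(partitioner.partition() for _ in range(9))
print(dict(counts))
```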

Consumer

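  • Pulls messages from the brokers.
  • Subscribes to one or more topics and reads messages in the order in which they were written to each partition. Ordering is guaranteed only within a partition, not across a whole topic.
  • Can read messages from multiple partitions.
  • The consumer offset keeps track of the consumer's position in each partition, i.e. which messages it has already read.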
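
Offset tracking can be illustrated with an in-memory list standing in for a single partition. `PartitionLog` and `SimpleConsumer` are hypothetical names for this sketch, not part of any Kafka client; here the offset is stored as the position of the next message to read.

```python
class PartitionLog:
    """Append-only list standing in for one partition."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset, max_records):
        return self._records[offset:offset + max_records]


class SimpleConsumer:
    """Pulls records and remembers its own position in the log."""

    def __init__(self, log):
        self._log = log
        self.offset = 0  # assumption of this sketch: next offset to read

    def poll(self, max_records=2):
        batch = self._log.read_from(self.offset, max_records)
        self.offset += len(batch)  # advance past what was just read
        return batch


log = PartitionLog()
for value in ["a", "b", "c"]:
    log.append(value)

consumer = SimpleConsumer(log)
first = consumer.poll()
second = consumer.poll()
third = consumer.poll()  # nothing new, so the offset stays put
```

The log itself is never modified by reading; a second consumer with its own offset could read the same records again from the start.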

Consumer group

  • A group of consumers that work together to consume a topic.
  • The consumer group ensures that each partition is consumed by only one member of the group at a time.
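
The one-member-per-partition rule can be sketched as a simple round-robin assignment of partitions to group members. `assign_partitions` is a hypothetical helper; Kafka's real assignment strategies are configurable and more involved.

```python
def assign_partitions(partitions, members):
    """Give every partition to exactly one member (illustrative only)."""
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        owner = members[i % len(members)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by three consumers in one group.
assignment = assign_partitions([0, 1, 2, 3], ["c1", "c2", "c3"])

# More consumers than partitions: the extra consumer gets nothing.
sparse = assign_partitions([0, 1], ["c1", "c2", "c3"])
```

In this sketch a member beyond the partition count ends up with an empty list, which mirrors why adding more consumers than partitions to a group does not increase parallelism.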

Queue vs Pub/Sub

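  • Queue
    • One consumer group with multiple consumers subscribes to the topic, so each message is processed by only one consumer in the group.
  • Pub/Sub
    • Multiple consumer groups subscribe to the same topic, so each group receives every message.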
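
The difference between the two patterns can be shown with a small simulation. `deliver` is a hypothetical function and the group and consumer names are made up for illustration; each group maps every partition to the one consumer that owns it.

```python
def deliver(messages, groups):
    """Return what each (group, consumer) pair receives.

    messages: list of (partition, value) tuples
    groups:   {group_name: {partition: consumer_name}}
    """
    received = {}
    for group, owners in groups.items():
        for partition, value in messages:
            consumer = owners[partition]  # single owner per partition per group
            received.setdefault((group, consumer), []).append(value)
    return received


messages = [(0, "m0"), (1, "m1"), (0, "m2")]

# Queue: one group, work is split between its consumers.
queue_style = deliver(messages, {"workers": {0: "w1", 1: "w2"}})

# Pub/Sub: two groups, each group sees every message.
pubsub_style = deliver(
    messages,
    {"billing": {0: "b1", 1: "b1"}, "audit": {0: "a1", 1: "a1"}},
)
```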

Topic vs Consumer

  • One topic can be consumed by multiple consumer groups.
  • One partition can be consumed by only one consumer within a given consumer group.
