Kafka
- Distributed event streaming platform.
- Distributed commit log.
- Publish/Subscribe messaging system.
Architecture
Producer ---write---> Kafka Cluster ---read---> Consumer
Components
Cluster
- Consists of multiple brokers (servers).
Topic
- A named stream of messages in Kafka, stored as an immutable, append-only commit log.
- Broken down into a number of partitions.
- Loosely analogous to a table in an RDBMS.
Partition
- Each broker stores one or more partitions.
- A partition can be replicated across brokers.
Message
- A unit of data within Kafka.
- Consists of a key/value pair (the key is optional).
- The key is used to select the partition to write to. Messages with the same key are written to the same partition, as long as the number of partitions does not change.
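The key-to-partition rule above can be sketched as a hash of the key taken modulo the partition count. This is a minimal illustration, not the client's actual code: `choose_partition` is a hypothetical helper, and `zlib.crc32` stands in for the hash a real client uses (the Java client's default partitioner uses murmur2).

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index (illustrative only)."""
    # crc32 is an assumed stand-in for the client's real hash function.
    return zlib.crc32(key) % num_partitions


# The same key always maps to the same partition for a fixed partition count.
first = choose_partition(b"user-42", num_partitions=6)
second = choose_partition(b"user-42", num_partitions=6)
print(first == second)
```

Because the result depends on `num_partitions`, adding partitions to a topic can move a key to a different partition.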
Producer
- Creates messages.
- By default, when a message has no key, the producer does not care which partition it is written to and balances messages roughly evenly across all partitions of a topic. When a key is set, the key determines the partition.
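A rough sketch of how keyless messages could be spread evenly, using plain round-robin. `RoundRobinPartitioner` is a hypothetical class written for this note; real clients may behave differently (newer Java clients, for example, stick to one partition per batch and rotate between batches), so treat this only as an illustration of the balancing idea.

```python
from collections import Counter
from itertools import cycle


class RoundRobinPartitioner:
    """Hypothetical helper: hands out partition indexes in turn."""

    def __init__(self, num_partitions: int) -> None:
        # cycle() repeats 0, 1, ..., num_partitions - 1 forever.
        self._next_partition = cycle(range(num_partitions))

    def partition(self) -> int:
        return next(self._next_partition)


partitioner = RoundRobinPartitioner(num_partitions=3)
# Send nine keyless messages and count how many land on each partition.
counts = Counter(partitioner.partition() for _ in range(9))
print(dict(counts))  # {0: 3, 1: 3, 2: 3}
```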
Consumer
- Pulls messages from the brokers.
- A consumer subscribes to one or more topics and reads messages in the order in which they were produced within each partition. Ordering across partitions is not guaranteed.
- Can read messages from multiple partitions.
- The consumer offset keeps track of the last message read in each partition.
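The pull model and offset tracking can be sketched with an in-memory append-only list standing in for one partition. `PartitionLog` and `SketchConsumer` are hypothetical names for illustration, not a Kafka client API. The sketch stores the position of the next message to read, which is one past the last message read.

```python
from typing import List


class PartitionLog:
    """Append-only list of messages; the list index plays the role of the offset."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, message: str) -> int:
        self._messages.append(message)
        return len(self._messages) - 1  # offset assigned to the new message

    def read_from(self, offset: int, max_messages: int) -> List[str]:
        return self._messages[offset:offset + max_messages]


class SketchConsumer:
    """Pulls batches from one partition and remembers its position."""

    def __init__(self, log: PartitionLog) -> None:
        self._log = log
        self.offset = 0  # next offset to read

    def poll(self, max_messages: int = 2) -> List[str]:
        batch = self._log.read_from(self.offset, max_messages)
        self.offset += len(batch)  # advance past what was just read
        return batch


log = PartitionLog()
for message in ["a", "b", "c"]:
    log.append(message)

consumer = SketchConsumer(log)
print(consumer.poll(), consumer.offset)  # ['a', 'b'] 2
print(consumer.poll(), consumer.offset)  # ['c'] 3
```

Since the log itself is never modified by reads, a second consumer with its own offset could read the same messages independently.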
Consumer group
- A group of consumers that work together to consume a topic.
- Within a consumer group, each partition is consumed by only one member.
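One way to picture the "one partition, one member" rule is a simple round-robin assignment of partitions to group members. `assign_partitions` is a hypothetical function for this note; Kafka's real assignment strategies are configurable and more involved, so this is only a sketch of the invariant.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], members: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one member, in round-robin order."""
    assignment: Dict[str, List[int]] = {member: [] for member in members}
    for index, partition in enumerate(partitions):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


print(assign_partitions([0, 1, 2, 3], ["c1", "c2"]))
# {'c1': [0, 2], 'c2': [1, 3]}

# With more members than partitions, the extra member stays idle.
print(assign_partitions([0, 1], ["c1", "c2", "c3"]))
# {'c1': [0], 'c2': [1], 'c3': []}
```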
Queue vs Pub/Sub
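- Queue: one consumer group with multiple consumers subscribes to the topic, so each message is handled by only one consumer in the group.
- Pub/Sub: multiple consumer groups subscribe to the same topic, so each group receives every message.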
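The two patterns can be compared with a small in-memory simulation. `deliver` is a hypothetical function, and the round-robin split of partitions inside a group is an assumption made to keep the sketch short. Every group sees all partitions, while inside a group each partition goes to a single consumer.

```python
from typing import Dict, List, Tuple


def deliver(
    partitions: Dict[int, List[str]],
    groups: Dict[str, List[str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Return the messages each (group, consumer) pair would receive."""
    received: Dict[Tuple[str, str], List[str]] = {}
    for group, members in groups.items():
        for consumer in members:
            received[(group, consumer)] = []
        # Each group reads every partition; one member owns each partition.
        for index, partition in enumerate(sorted(partitions)):
            owner = members[index % len(members)]
            received[(group, owner)].extend(partitions[partition])
    return received


topic = {0: ["m0", "m2"], 1: ["m1", "m3"]}

# Queue: one group, two consumers; each message goes to one consumer.
queue = deliver(topic, {"workers": ["w1", "w2"]})
print(queue)

# Pub/Sub: two groups; each group receives every message.
pubsub = deliver(topic, {"billing": ["b1"], "audit": ["a1"]})
print(pubsub)
```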
Topic vs Consumer
- One topic can be consumed by multiple consumer groups.
- Within a consumer group, one partition is consumed by exactly one consumer.