Kafka Fundamental - Tuong-Nguyen/Angular-D3-Cometd GitHub Wiki
Example: LinkedIn Data Architecture
This image shows the data architecture of LinkedIn, which uses Kafka as its communication mechanism:
Producer - Consumer - Topic
Producer: An application that sends messages to the Kafka system.
Consumer: An application that receives messages from the Kafka system.
Topic: Messages are categorized into topics. A producer sends messages to a topic, and a consumer receives them from the same topic.
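The relationship between the three can be sketched with a small in-memory model. This is only an illustration and not the real Kafka client API: the names `topics`, `send`, and `receive` are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a Kafka system: topic name -> list of messages.
topics = defaultdict(list)

def send(topic, message):
    """Producer side: append a message to the named topic."""
    topics[topic].append(message)

def receive(topic, offset=0):
    """Consumer side: read every message in the topic starting at `offset`."""
    return topics[topic][offset:]

send("page-views", b"user-1 viewed /home")
send("page-views", b"user-2 viewed /jobs")
send("searches", b"user-1 searched 'kafka'")

# A consumer of "page-views" sees only the messages sent to that topic.
page_views = receive("page-views")
```

The producer and the consumer never talk to each other directly; the topic name is the only thing they share.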
Messages in a Topic
A topic stores messages with the following characteristics:
- append-only (i.e. a message cannot be inserted into the middle of the queue)
- ordered by time (i.e. messages are sorted)
- immutable (i.e. a message cannot be updated after it has been sent)
In the picture, message 3 contains incorrect information. The producer cannot update message 3 with the correct information; it must send message 6 as a correction instead.
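The correction scenario above can be sketched as follows. `TopicLog` is a hypothetical class written for this example, not part of any Kafka library, and the order data is invented for illustration. The class offers no way to insert or update, so the only way to fix message 3 is to append message 6.

```python
class TopicLog:
    """Hypothetical append-only log: messages can be added at the end and read, nothing else."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Add a message at the end and return its position in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, position):
        return self._messages[position]

    def __len__(self):
        return len(self._messages)

log = TopicLog()
# Messages are immutable tuples of (order name, amount).
log.append(("A", 10))    # message 1
log.append(("B", 20))    # message 2
log.append(("C", 999))   # message 3: incorrect amount
log.append(("D", 40))    # message 4
log.append(("E", 50))    # message 5
log.append(("C", 30))    # message 6: correction for message 3

# A consumer replays the log in order; the later message for "C" overrides the earlier one.
latest = {}
for position in range(len(log)):
    order, amount = log.read(position)
    latest[order] = amount
```

Message 3 is still in the log with its wrong value; it is the consumer that ends up with the corrected amount after reading message 6.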
Message information
A message carries the following information:
- Timestamp
- ID: the message ID, which a consumer uses to specify its offset
- Data: the message content, as binary data
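As a rough sketch, the three fields could be modelled like this. The class name `Message`, the field names, and the choice of milliseconds for the timestamp are assumptions made for this example, not the actual Kafka record format.

```python
from dataclasses import dataclass
import time

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Message:
    timestamp_ms: int   # creation time; milliseconds since the epoch is an assumption here
    message_id: int     # message ID that a consumer uses to specify its offset
    data: bytes         # message content as raw bytes

msg = Message(
    timestamp_ms=int(time.time() * 1000),
    message_id=0,
    data="hello".encode("utf-8"),  # the producer encodes text to bytes
)

# The consumer is responsible for decoding the bytes back into something meaningful.
text = msg.data.decode("utf-8")
```

Because the content is plain bytes, the producer and consumer have to agree on the encoding (UTF-8 text in this sketch).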
Broker - Zookeeper
Broker: A Kafka server that producers and consumers connect to in order to send and receive messages.
Zookeeper: A server that manages the brokers so that the system can scale.
Topic Partition
A topic can be split into partitions. Each partition is handled by one broker (a broker can handle multiple partitions). Messages sent to the topic are distributed across its partitions.
How messages are distributed into partitions is discussed later.
Each broker manages a partition.
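To make the idea concrete, here is a small sketch of a topic with three partitions spread over two brokers. The distribution rule used here (a stable hash of a message key, modulo the number of partitions) is only one possible strategy, assumed for illustration; it is not presented as the rule Kafka itself applies. All names (`partition_to_broker`, `choose_partition`, `send`) are hypothetical.

```python
import zlib

NUM_PARTITIONS = 3

# Hypothetical assignment: partition number -> broker name.
# "broker-1" handles two partitions, "broker-2" handles one.
partition_to_broker = {0: "broker-1", 1: "broker-2", 2: "broker-1"}

# Each partition is its own ordered list of messages.
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def choose_partition(key: bytes) -> int:
    """Assumed strategy: stable hash of the key, modulo the partition count."""
    return zlib.crc32(key) % NUM_PARTITIONS

def send(key: bytes, data: bytes) -> int:
    """Append the message to the chosen partition and return the partition number."""
    partition = choose_partition(key)
    partitions[partition].append(data)
    return partition

# Two messages with the same key land in the same partition, in order.
first = send(b"user-1", b"login")
second = send(b"user-1", b"logout")
broker = partition_to_broker[first]
```

With this rule, all messages that share a key stay together in one partition, while different keys can spread over different partitions and therefore different brokers.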
Topic Replication
A topic can be configured to be replicated across multiple brokers, which increases availability.
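The effect on availability can be sketched with a toy placement function. The round-robin placement, the function name `assign_replicas`, and the broker names are assumptions made for this example; they are not a description of how Kafka actually places replicas.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition on `replication_factor` distinct brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }

brokers = ["broker-1", "broker-2", "broker-3"]
assignment = assign_replicas(num_partitions=3, brokers=brokers, replication_factor=2)

# Simulate losing one broker and see which copies of each partition remain.
failed = "broker-2"
survivors = {
    p: [b for b in replicas if b != failed]
    for p, replicas in assignment.items()
}
```

With two copies of every partition, the loss of any single broker still leaves at least one copy of each partition on a surviving broker.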