Kafka

Kafka is a popular message broker built around producers publishing messages, called events, to topics. The events in a topic are split into partitions using a partition key, and FIFO ordering is maintained within each partition. Events can be streamed to consumers over a socket, or polled by consumers for a more decoupled approach. For consumers that don't want to maintain state, the concept of a consumer group applies, similar to Redis Streams. A consumer group is effectively a queue: every event posted to a topic is available for processing in each associated consumer group.
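
As a rough illustration of producers, partition keys, and consumer groups, here is a minimal sketch using the official Java client. The topic name `orders`, the group id `billing-service`, and the `localhost:9092` broker address are assumptions made for the example, not anything prescribed by Kafka itself:

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class OrdersExample {
    public static void main(String[] args) {
        // Producer: the key ("customer-42") determines the partition, so all
        // events for the same key keep FIFO ordering within that partition.
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
            producer.send(new ProducerRecord<>("orders", "customer-42", "order created"));
        }

        // Consumer: joining the "billing-service" group makes Kafka spread the
        // topic's partitions across all members of that group.
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "billing-service");
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d key=%s value=%s%n",
                            record.partition(), record.key(), record.value());
                }
            }
        }
    }
}
```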

Kafka is open source, but it is complicated to install and maintain, which makes it most suitable for larger projects and teams. It scales based on how well you split your events into partitions: the more partitions you have, the more Kafka can distribute work, and each partition has only as much capacity as the server in charge of managing it. Managed hosting options are available, but they tend to have high base costs compared to managed services like SNS+SQS, Pub/Sub, or RabbitMQ.

Here are some of the key features of Apache Kafka:

  • Distributed and fault-tolerant: Kafka’s distributed nature enables it to scale horizontally and provide high availability, ensuring no single point of failure
  • Durability: Kafka stores messages on disk, ensuring data durability in the event of node failures
  • Stream processing: Kafka Streams, a lightweight library for building stream processing applications, allows developers to process and analyze real-time data within their Kafka applications (a minimal sketch follows this list)
  • Connectors: Kafka Connect, a framework for connecting Kafka with external systems, simplifies the integration process with various data sources and sinks
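
To illustrate the Kafka Streams bullet above, here is a minimal sketch of a topology that filters one topic into another. The topic names `orders` and `created-orders` and the application id `order-filter` are assumptions made for the example:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class OrderFilterApp {
    public static void main(String[] args) {
        // Topology: read every event from "orders", keep only events whose
        // value mentions "created", and write those to "created-orders".
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> orders = builder.stream("orders");
        orders.filter((key, value) -> value != null && value.contains("created"))
              .to("created-orders");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-filter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the topology cleanly when the JVM shuts down.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```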

Integration - Publishing the schema to the Schema Registry
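
One common way to publish a schema is to let Confluent's Avro serializer register it automatically on the first produce. The sketch below assumes a Schema Registry at `http://localhost:8081`, the `orders` topic (so the schema is registered under the subject `orders-value` with the default subject naming strategy), and a hypothetical one-field `Order` record schema; none of these come from the original notes:

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class AvroOrderProducer {
    public static void main(String[] args) {
        // Hypothetical Avro record schema used only for this example.
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[" +
                "{\"name\":\"id\",\"type\":\"string\"}]}");

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // KafkaAvroSerializer registers the record's schema with the Schema
        // Registry on first use (auto.register.schemas is true by default).
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
            GenericRecord order = new GenericData.Record(schema);
            order.put("id", "42");
            producer.send(new ProducerRecord<>("orders", "customer-42", order));
        }
    }
}
```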