000. Foreword - MarkHuntDev/my-kafka-exercises GitHub Wiki

Why Apache Kafka?

  • Created at LinkedIn, now an open-source project maintained mainly by Confluent
  • Distributed, resilient architecture, fault tolerant
  • Horizontal scalability:
    • Can scale to 100s of brokers
    • Can scale to millions of messages per second
  • High performance (latency under 10 ms), i.e. near real time
  • Used by 2000+ firms, including 35% of the Fortune 500:
    • LinkedIn, Airbnb, Netflix, Uber, Walmart

Apache Kafka: Use cases

  • Messaging System
  • Activity Tracking
  • Metrics gathering from many different locations
  • Application log gathering
  • Stream processing (for example, with the Kafka Streams API or Spark)
  • Decoupling of system dependencies
  • Integration with Spark, Flink, Storm, Hadoop, and many other Big Data technologies

For example...

  • Netflix uses Kafka to apply recommendations in real time while you're watching TV shows
  • Uber uses Kafka to gather user, taxi, and trip data in real time, to compute and forecast demand, and to compute surge pricing
  • LinkedIn uses Kafka to prevent spam and to collect user interactions, which feed better connection recommendations in real time

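Remember that in all of these examples Kafka is only the transport mechanism! It moves the data between systems; the recommendation, pricing, and spam-detection logic lives in the applications that write to and read from it.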
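
To make the "transport only" point concrete, here is a minimal in-memory sketch. It does not use Kafka or any Kafka client library: `ToyTopic`, `publish`, `poll`, the group names, and the `demand_ratio` formula are all hypothetical names invented for illustration. It only assumes the general idea that a topic behaves like an append-only log that independent readers consume at their own pace.

```python
from collections import defaultdict


class ToyTopic:
    """Hypothetical in-memory stand-in for a topic: an append-only log."""

    def __init__(self):
        self._log = []                    # records in arrival order
        self._offsets = defaultdict(int)  # next offset to read, per reader group

    def publish(self, record):
        """Append a record and return its offset; no processing happens here."""
        self._log.append(record)
        return len(self._log) - 1

    def poll(self, group):
        """Return the records this group has not read yet and advance its offset."""
        start = self._offsets[group]
        records = self._log[start:]
        self._offsets[group] = len(self._log)
        return records


trips = ToyTopic()

# Producer side: it only knows the topic, not who will read from it.
trips.publish({"city": "Paris", "riders": 12, "drivers": 4})
trips.publish({"city": "Paris", "riders": 9, "drivers": 6})


def demand_ratio(records):
    """Made-up business logic; it lives in the consuming application."""
    riders = sum(r["riders"] for r in records)
    drivers = sum(r["drivers"] for r in records)
    return riders / drivers


# Consumer 1: a pricing application reads the records and applies its own logic.
pricing_batch = trips.poll("pricing")
ratio = demand_ratio(pricing_batch)

# Consumer 2: an independent reader receives the same records separately.
audit_batch = trips.poll("audit")
```

The topic object never computes anything: it stores and hands out records. Adding the second reader required no change to the producer, which is the decoupling listed among the use cases above.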