Run Kafka+Zookeeper from inside a Docker Container for Spindle - rpi-nsl/NSL-Spindle GitHub Wiki

Kafka+Zookeeper in Docker (Ref: Spotify/Kafka)

This repository provides everything you need to run Kafka in Docker.

For convenience, it also contains a packaged proxy that can be used to get data from a legacy Kafka 7 cluster into a dockerized Kafka 8.

Why?

The main hurdle of running Kafka in Docker is that it depends on Zookeeper. Compared to other Kafka docker images, this one runs both Zookeeper and Kafka in the same container. This means:

  • No dependency on an external Zookeeper host, or linking to another container
  • Zookeeper and Kafka are configured to work together out of the box

Run

docker run -p 2181:2181 -p 9092:9092 --env ADVERTISED_HOST=`docker-machine ip \`docker-machine active\`` --env ADVERTISED_PORT=9092 spotify/kafka
export KAFKA=`docker-machine ip \`docker-machine active\``:9092
kafka-console-producer.sh --broker-list $KAFKA --topic test
export ZOOKEEPER=`docker-machine ip \`docker-machine active\``:2181
kafka-console-consumer.sh --zookeeper $ZOOKEEPER --topic test
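
As a quick sanity check after starting the container, the sketch below tests whether the two published ports accept TCP connections. It is a minimal example and not part of the spotify/kafka image: the function names port_open and report_ports are hypothetical, and it assumes that bash (for the /dev/tcp pseudo-device) and the coreutils timeout command are available.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# The connection attempt runs in a child bash so that /dev/tcp is available.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Print the state of the Zookeeper (2181) and Kafka (9092) ports on a host.
report_ports() {
  local host="$1"
  local port
  for port in 2181 9092; do
    if port_open "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port closed"
    fi
  done
}
```

For example, report_ports "$(docker-machine ip "$(docker-machine active)")" prints one line per port. An open port only shows that something is listening there; it does not prove that Kafka or Zookeeper is healthy.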

Running the proxy (not needed for Spindle)

The proxy takes the same parameters as the spotify/kafka image, plus some new ones:

  • CONSUMER_THREADS - the number of threads to consume the source kafka 7 with
  • TOPICS - whitelist of topics to mirror
  • ZK_CONNECT - the zookeeper connect string of the source kafka 7
  • GROUP_ID - the group.id to use when consuming from kafka 7
docker run -p 2181:2181 -p 9092:9092 \
    --env ADVERTISED_HOST=`boot2docker ip` \
    --env ADVERTISED_PORT=9092 \
    --env CONSUMER_THREADS=1 \
    --env TOPICS=my-topic,some-other-topic \
    --env ZK_CONNECT=kafka7zookeeper:2181/root/path \
    --env GROUP_ID=mymirror \
    spotify/kafkaproxy

In the box

  • spotify/kafka

    The docker image with both Kafka and Zookeeper. Built from the kafka directory.

  • spotify/kafkaproxy (not needed for Spindle)

    The docker image with Kafka, Zookeeper and a Kafka 7 proxy that can be configured with a set of topics to mirror.

Public Builds

https://registry.hub.docker.com/u/spotify/kafka/

https://registry.hub.docker.com/u/spotify/kafkaproxy/ (not needed for Spindle)

Build from Source (not needed for Spindle)

docker build -t spotify/kafka kafka/
docker build -t spotify/kafkaproxy kafkaproxy/

Configuration for Spindle

The main reason to run Kafka+Zookeeper in a Docker container is to have it act as a 'middleware' for Spindle, while keeping it isolated from the Kafka+Zookeeper instance spawned by the Spindle Vehicle Node jar file.

Steps required:


  • Step 1: Make sure that Docker is installed on the machine (see the Docker Installation Guide).
  • Step 2: Run docker pull spotify/kafka on your local machine (see Spotify/Kafka). This downloads the Docker image to your local machine.
  • Step 3: Start the Docker container with the following configuration:
docker run -p 2181:2181 -p 9092:9092 --env ADVERTISED_HOST=$your_machine_public_ip --env ADVERTISED_PORT=9092 spotify/kafka

Kafka is now running at $your_machine_public_ip:9092 and Zookeeper at $your_machine_public_ip:2181.

  • Step 4: Export MIDDLEWARE_HOSTNAME=$your_machine_public_ip, CLUSTERHEAD_BROKER=$your_machine_public_ip:9093 and CLUSTERHEAD_ZK_STRING=$your_machine_public_ip:2182 (no spaces around the = signs).
  • Step 5: Run the Spindle jar.
  • Step 6: In Main.scala of the Test-Spark-Program, point the Zookeeper and Kafka settings at the Docker container, then start the program with sbt run.
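
The exports in Step 4 can be sketched as a small shell snippet. The address 192.0.2.10 is only a placeholder from the documentation range and must be replaced with your machine's public IP; the variable names and port numbers are copied from the steps above as-is.

```shell
# Placeholder address (documentation range); replace with your machine's public IP.
your_machine_public_ip="192.0.2.10"

# Host of the dockerized Kafka+Zookeeper middleware
# (Kafka on port 9092, Zookeeper on port 2181, as started in Step 3).
export MIDDLEWARE_HOSTNAME="${your_machine_public_ip}"

# Cluster head broker and Zookeeper connect string, with the ports given in Step 4.
export CLUSTERHEAD_BROKER="${your_machine_public_ip}:9093"
export CLUSTERHEAD_ZK_STRING="${your_machine_public_ip}:2182"

# Print the values so they can be checked before running the Spindle jar.
echo "MIDDLEWARE_HOSTNAME=${MIDDLEWARE_HOSTNAME}"
echo "CLUSTERHEAD_BROKER=${CLUSTERHEAD_BROKER}"
echo "CLUSTERHEAD_ZK_STRING=${CLUSTERHEAD_ZK_STRING}"
```

Exported variables are only visible to the shell that sets them and to its child processes, so the Spindle jar should be started from the same shell session.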

To Do

Look into easier methods.