Run multiple Nodes in Docker from DEV environment - rpi-nsl/NSL-Spindle GitHub Wiki

This page explains how to run multiple vehicle nodes in the DEV environment (middleware + cluster head + nodes on a single host).


For the DEV environment we need to simulate multiple copies of the Vehicle Node on a single machine, with the middleware setup and Apache Spark running on that same machine. We use Docker for this purpose: we build a docker image on top of the Spindle Vehicle jar file and then run multiple copies of that image. The running copies (docker containers) are completely isolated from each other yet run on the same machine. Alternatively, instead of building the image ourselves, we can use a pre-built one.

There are two ways to achieve this:

  1. Use a pre-existing docker image, then continue with the common setup and run steps.
  2. Build your own docker image, then continue with the same steps.

Use a pre-existing docker image (preferred method)

  1. Make sure Docker is installed on your computer; if not, refer to the Docker Installation Guide. Before you proceed, add yourself to the docker group so that you can run docker commands as a non-root user; refer to Manage Docker as a non-root user.

  2. Run docker pull nslrpi/spindle-node. This downloads the Spindle docker image onto your machine from the NSL repository on Docker Hub.
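
A quick way to confirm the pull succeeded is to list the image (docker images accepts a repository name as a filter):

> docker images nslrpi/spindle-node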

  3. Download Apache Kafka to a folder of your choice on your computer. You can download Kafka from the following page: Link. Then un-tar it into that folder:
> tar -xzf kafka_2.11-0.10.2.0.tgz
> mv kafka_2.11-0.10.2.0 kafka

In ~/kafka/config/server.properties, change advertised.listeners=PLAINTEXT://127.0.0.1:9092 to advertised.listeners=PLAINTEXT://$your_Public_IP:9092; then start Zookeeper and Kafka.
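
For example, with a (hypothetical) public IP of 203.0.113.7, the edited line would read:

    advertised.listeners=PLAINTEXT://203.0.113.7:9092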

To start Zookeeper + Kafka, do the following steps:

  • Start the Zookeeper server (we will start a single-node Zookeeper instance):
    > ./bin/zookeeper-server-start.sh config/zookeeper.properties
  • Start the Kafka Server:
    > ./bin/kafka-server-start.sh config/server.properties
    This starts the Kafka+Zookeeper setup on your local machine, which now acts as your middleware device. Make sure Kafka and Zookeeper are running on the middleware before you start any docker container.
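
To sanity-check that the broker is up, you can list topics with the scripts bundled with Kafka (this assumes Zookeeper is listening on its default port, 2181):

> ./bin/kafka-topics.sh --list --zookeeper localhost:2181
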
  4. Go inside the ~/NSL-Spindle/Vehicle/Vehicle-Node folder.

  5. Edit the runSimulations.sh file and assign $your_public_ip to all the middleware HOST_NAME variables inside the file. This tells the docker containers where the middleware's advertised listener is; see the sketch below.
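
A hypothetical excerpt of the edited file (the actual variable names in runSimulations.sh may differ; the point is that every middleware host variable gets your public IP):

    # hypothetical variable name; match whatever runSimulations.sh actually uses
    MIDDLEWARE_HOST_NAME=203.0.113.7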

  6. Run:

> ./runSimulations.sh $i

where $i is the number of node instances you want. The cluster head is an additional node that is spawned on top of these $i nodes.
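
For example, to spawn three vehicle nodes (plus the cluster head, four containers in total):

> ./runSimulations.sh 3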

  7. On executing this, the docker nodes should be up and running as separate docker containers. To check the running containers, run docker ps; the cluster head container is named SPINDLE-CLUSTERHEAD and the remaining nodes SPINDLE-NODE$i. (Sample output: images/docker_ps_output.jpg)
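
If you only want the container names, docker ps accepts a Go-template format string:

> docker ps --format '{{.Names}}'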

  8. To use Spark, go to ~/NSL-Spindle/Test-Spark-Application/ and point /src/main/scala/edu/rpi/cs/nsl/spindle/spark/test/Main.scala at the middleware IP. Refer to Configure Test Spark Program below.

  9. You should now be able to see results in the Spark output.

  10. To stop the simulation, see Stop the simulation below.

Build your own docker image. (Use only when you have changed the base jar file and need to rebuild the image)

  1. Set up Docker and the Kafka/Zookeeper middleware exactly as in steps 1 and 3 of the previous method (the docker pull step is not needed, since you will build the image yourself). Make sure Kafka and Zookeeper are running before you start any docker container.
  2. Go inside ~/NSL-Spindle/Vehicle/Vehicle-Node and run sbt assembly to build the base Spindle jar file.

  3. From the same folder, run ./build.sh. This builds a docker image on top of the jar and pushes it to the nslrpi Docker Hub account.

  4. The docker image is now on your machine. Continue with steps 4-10 of the previous method: edit runSimulations.sh, run the simulation, verify the containers, configure the test Spark program, and stop the simulation when done. The full rebuild flow is sketched below.

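The end-to-end rebuild flow, as a sketch (./build.sh's internals may differ, but per the steps above it builds the image and pushes it to Docker Hub):

> cd ~/NSL-Spindle/Vehicle/Vehicle-Node
> sbt assembly            # build the base Spindle jar
> ./build.sh              # build the docker image on top of the jar and push it
> ./runSimulations.sh 3   # example: spawn 3 nodes plus the cluster head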

Stop the simulation

To stop the simulation, run the following from inside the ~/NSL-Spindle/Vehicle/Vehicle-Node/ folder:

> ./stopSimulation.sh

This copies all log files from inside the containers to the host, then kills all the containers and stops the simulation. The log files will be available in ~/NSL-Spindle/Vehicle/Vehicle-Node/Node-Logs/, with a separate folder for each node. (Example contents of the Node-Logs folder: images/log_folders.jpg)
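
If you ever need to do this by hand, the equivalent docker commands look roughly like the following (the in-container log path is a hypothetical stand-in; check the actual path used by the node image):

> docker cp SPINDLE-NODE1:/var/log/spindle Node-Logs/SPINDLE-NODE1   # hypothetical log path
> docker rm -f SPINDLE-NODE1                                         # kill and remove the container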

Configure Test Spark Program

  • Go to the ~/NSL-Spindle/Test-Spark-Program folder and edit the /src/main/scala/edu/rpi/cs/nsl/spindle/spark/test/Main.scala file. Set the StreamConfig to point to the middleware, configuring it as:

     val stream = NSLUtils.createVStream(ssc, NSLUtils.StreamConfig("middleware_public_ip:zk_port", "middleware_public_ip:kafka_port", TOPIC), new MockQueryUidGenerator)
       .map(foo)         // foo: placeholder for your map function
       .reduceByKey{bar} // bar: placeholder for your reduce function
       .print()
    
  • Run sbt run from inside the Test-Spark-Program folder to start the test application.
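
A hypothetical completed version of the snippet, assuming the middleware's (made-up) public IP is 203.0.113.7 with Zookeeper and Kafka on their default ports (2181 and 9092); the map and reduce functions here are illustrative per-key counting, not the project's actual query:

     // Hypothetical example: key each record on its string form and count per key.
     val stream = NSLUtils.createVStream(ssc, NSLUtils.StreamConfig("203.0.113.7:2181", "203.0.113.7:9092", TOPIC), new MockQueryUidGenerator)
       .map(record => (record.toString, 1)) // illustrative map function
       .reduceByKey(_ + _)                  // illustrative reduce: count per key
       .print()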