Updated run in dev environment - rpi-nsl/NSL-Spindle GitHub Wiki
This page describes how to run multiple vehicle nodes in the DEV environment (middleware + clusterhead + nodes on a single host).
For the Dev environment, we need to simulate multiple copies of the Vehicle Node on a single machine, alongside the middleware setup and Apache Spark running on that same machine. We use Docker for this purpose: we build a docker image on top of the Spindle Vehicle Jar file and then run multiple copies of that image. These running copies (docker containers) are completely isolated from each other but run on the same machine. Instead of building the image yourself, you can also use a pre-built docker image.
There are two ways to achieve this:
- Use a pre-existing docker image, then continue with the remaining steps.
- Build your own docker image, then continue with the remaining steps.
Use a pre-existing docker image. (Preferable method)
- Make sure Docker is installed on your computer. If not, please refer to the Docker Installation Guide. Before you proceed further, add yourself to the docker group so that you can run docker commands as a non-root user. Please refer to Manage Docker as a non-root user.
After this, run
docker pull nslrpi/spindle-node
This downloads the Spindle docker image from the NSL repository on Docker Hub to your machine.
- Install Kafka+Zookeeper using a docker image or the provided script.
docker pull spotify/kafka
This docker image is a self-contained Kafka instance with Zookeeper. Optionally, you can install Kafka and Zookeeper yourself; see the Kafka steps under "Build your own docker image" below.
- Go inside the ~/NSL-Spindle/Vehicle/Vehicle-Node folder.
- Run
./setupSpindle.sh
This will pull the spotify kafka docker image and set up the spindle docker bridge. The bridge gives us greater control over how our containers run and makes it easier to resolve container addresses.
- Run
> ./runSimulations.sh $i
where $i is the number of node instances you want. The cluster head is spawned as an additional node on top of these $i nodes. Optionally, a second parameter can be used to set the number of clusters you want, each with $i nodes. Note that the clusterhead also acts as a regular node.
- On executing this, the docker nodes should be up and running as separate docker containers. To check the running containers, run
docker ps
The clusterhead should be named SPINDLE-CLUSTER0-CLUSTERHEAD and the rest of the nodes SPINDLE-CLUSTER0-NODE$i. The following is a sample output: images/docker_ps_output.jpg
- To use Spark, go to ~/NSL-Spindle/Test-Spark-Application/ and run
./runSpark.sh
This will automatically point Spark to the middleware. Optionally, refer to Configure Test Spark Program.
Note that you must have sbt installed. If you do not, you can install it here.
- You should be able to see results in the Spark output now. If not, you may want to set up forwarding for docker bridge networks.
- To stop the simulation, look at this documentation.
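The container naming scheme described above can be checked mechanically. The following is a minimal shell sketch, not part of the Spindle scripts: the helper names expected_names and missing_names are hypothetical, and it assumes node containers are numbered starting at 1, which this page does not state. Adjust the loop if runSimulations.sh numbers nodes from 0.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the NSL-Spindle repository.

# expected_names NODES [CLUSTER]
# Prints the container names expected for one cluster.
# Assumption: nodes are numbered 1..NODES.
expected_names() (
  nodes="$1"
  cluster="${2:-0}"
  echo "SPINDLE-CLUSTER${cluster}-CLUSTERHEAD"
  i=1
  while [ "$i" -le "$nodes" ]; do
    echo "SPINDLE-CLUSTER${cluster}-NODE${i}"
    i=$((i + 1))
  done
)

# missing_names NODES [CLUSTER]
# Reads running container names on stdin (one per line) and prints
# every expected name that is not among them.
missing_names() (
  running="$(cat)"
  expected_names "$1" "${2:-0}" | while read -r name; do
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
)

# Usage against a live simulation with 3 nodes (requires Docker):
#   docker ps --format '{{.Names}}' | missing_names 3
```

If the clusterhead and all nodes of cluster 0 are up, the usage line at the bottom should print nothing; any name it prints is a container that did not start.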
Build your own docker image. (Use only when you have changed something in the base Jar file and need to rebuild the docker image)
- Make sure Docker is installed on your computer. If not, please refer to the Docker Installation Guide. Before you proceed further, add yourself to the docker group so that you can run docker commands as a non-root user. Please refer to Manage Docker as a non-root user.
- Download Apache Kafka to a folder of your choice. You can download Kafka from the following page Link and un-tar it into the folder you want, i.e. run:
> tar -xzf kafka_2.11-0.10.2.0.tgz
> mv kafka_2.11-0.10.2.0 kafka
In the file ~/kafka/config/server.properties, change
advertised.listeners=PLAINTEXT://127.0.0.1:9092
to
advertised.listeners=PLAINTEXT://$your_Public_IP:9092
and then start Zookeeper and Kafka.
To start Zookeeper + Kafka, do the following steps:
- Start the Zookeeper Server (we will start a single node zookeeper instance):
bin/zookeeper-server-start.sh config/zookeeper.properties
- Start the Kafka Server:
bin/kafka-server-start.sh config/server.properties
This will start the Kafka+Zookeeper setup on your local machine, which now acts as your middleware device. Make sure that Kafka+Zookeeper is already running on the middleware before you start any docker container.
- Go inside ~/NSL-Spindle/Vehicle/Vehicle-Node and run
sbt assembly
to build the base Spindle Jar file.
- Still inside ~/NSL-Spindle/Vehicle/Vehicle-Node, run
./build.sh
This will build a docker image on top of the jar and deploy it to the nslrpi dockerhub account.
- You now have the docker image on your computer. Assuming Docker is already installed from the first step, go inside the ~/NSL-Spindle/Vehicle/Vehicle-Node folder.
- Edit the runSimulations.sh file and assign $your_public_ip to all the middle-ware_HOST_NAME variables inside the file. This tells the docker containers where the middleware's advertised listener is.
- Run
> ./runSimulations.sh $i
where $i is the number of node instances you want. The cluster head is spawned as an additional node on top of these $i nodes.
- On executing this, the docker nodes should be up and running as separate docker containers. To check the running containers, run
docker ps
The clusterhead should be named SPINDLE-CLUSTERHEAD and the rest of the nodes SPINDLE-NODE$i. The following is a sample output: images/docker_ps_output.jpg
- To use Spark, go to ~/NSL-Spindle/Test-Spark-Application/ and point /src/main/scala/edu/rpi/cs/nsl/spindle/spark/test/Main.scala to the middleware IP. Refer to Configure Test Spark Program.
- You should be able to see results in the Spark output now.
- To stop the simulation, look at this documentation.
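The advertised.listeners change in the Kafka step above can also be scripted instead of edited by hand. The following is a minimal sketch under stated assumptions: the function name set_advertised_listener is hypothetical, the port is fixed at 9092 as in the step above, and the line is expected to already exist in the file, either active or commented out with a leading #. A .bak copy of the original file is kept next to it.

```shell
#!/bin/sh
# Hypothetical helper; not part of the NSL-Spindle repository.

# set_advertised_listener FILE HOST
# Rewrites the advertised.listeners line in FILE (commented out or not)
# to PLAINTEXT://HOST:9092. Leaves FILE unchanged if no such line exists.
set_advertised_listener() (
  file="$1"
  host="$2"
  sed -i.bak -E \
    "s|^#?advertised\.listeners=.*|advertised.listeners=PLAINTEXT://${host}:9092|" \
    "$file"
)

# Usage (replace the address with your public IP):
#   set_advertised_listener ~/kafka/config/server.properties 203.0.113.7
```

Because the function only rewrites an existing line, it is worth running grep advertised.listeners on the file afterwards to confirm the new value is in place before starting Kafka.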
To STOP
To stop the simulation, assuming you are already inside the folder ~/NSL-Spindle/Vehicle/Vehicle-Node/, run:
> ./stopSimulation.sh
This will copy all log files from inside the containers to the host, then kill all the containers and stop the simulation. The log files will be available inside the folder ~/NSL-Spindle/Vehicle/Vehicle-Node/Node-Logs/ with a separate folder for each node. Example contents of the Node-Logs folder:
images/log_folders.jpg
Optionally, you can use the hard stop script to just kill all the running spindle nodes.
> ./stopSimulation.sh
If you want to pull the log files as before, you can use the catWeight script, which places the logs in LogOutput, overwriting any existing log files.
> ./catWeight.sh
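To confirm that logs were actually collected for every node, the per-node folders can be summarized with a few lines of shell. This is a sketch only: the function name summarize_logs is hypothetical, and it assumes nothing about the log file names beyond the one-folder-per-node layout described above.

```shell
#!/bin/sh
# Hypothetical helper; not part of the NSL-Spindle repository.

# summarize_logs [DIR]
# Prints "<node-folder> <file-count>" for every subfolder of DIR
# (default: Node-Logs in the current directory).
summarize_logs() (
  dir="${1:-Node-Logs}"
  for node in "$dir"/*/; do
    # Skip the unexpanded glob when DIR has no subfolders.
    [ -d "$node" ] || continue
    count="$(find "$node" -type f | wc -l | tr -d ' ')"
    printf '%s %s\n' "$(basename "$node")" "$count"
  done
)

# Usage from ~/NSL-Spindle/Vehicle/Vehicle-Node/:
#   summarize_logs Node-Logs
```

A node folder reported with a count of 0 suggests that the copy step found no logs in that container.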
Configure Test Spark Program
- Go to the ~/NSL-Spindle/Test-Spark-Program folder and edit the /src/main/scala/edu/rpi/cs/nsl/spindle/spark/test/Main.scala file. Set the StreamConfig to point to the middleware, i.e. configure it as:
val stream = NSLUtils.createVStream(ssc, NSLUtils.StreamConfig("middleware_public_ip:zk_port", "middleware_public_ip:kafka_port", TOPIC), new MockQueryUidGenerator)
  .map(foo)
  .reduceByKey{bar}
  .print()
- Run
sbt run
from inside Test-Spark-Program.
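The two address strings passed to StreamConfig share the middleware IP and differ only in the port. As a small sketch (the helper name stream_config_args is hypothetical), the following shell function prints both strings ready to paste into Main.scala. It assumes the stock defaults of 2181 for Zookeeper's client port and 9092 for the Kafka listener unless other ports are passed.

```shell
#!/bin/sh
# Hypothetical helper; not part of the NSL-Spindle repository.

# stream_config_args IP [ZK_PORT] [KAFKA_PORT]
# Prints the two quoted "host:port" strings for StreamConfig.
# Assumption: Zookeeper listens on 2181 and Kafka on 9092 by default.
stream_config_args() (
  ip="$1"
  zk_port="${2:-2181}"
  kafka_port="${3:-9092}"
  printf '"%s:%s", "%s:%s"\n' "$ip" "$zk_port" "$ip" "$kafka_port"
)

# Usage (replace the address with your middleware's public IP):
#   stream_config_args 203.0.113.7
```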