Getting Started with Distributed Version - PetcuAlexandru/openpixi GitHub Wiki
This manual is intended for more experienced users who want to build the distributed version of OpenPixi from source. Basic knowledge of Maven, or a willingness to learn it, is a great advantage. A short overview of the sections in this document follows:
- Restrictions
- How do I compile the distributed version?
- How do I run the distributed version?
- How do I test the distributed version?
- How do I profile the distributed version?
- How do I trace the particle movement?
Currently, the distributed version has a few restrictions: the number of grid cells in the x and y directions, as well as the number of nodes among which the computation is distributed, must be powers of two.
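The power-of-two restriction can be checked before starting a run. The following is a minimal sketch in POSIX shell; the function name is_power_of_two is hypothetical and not part of OpenPixi. It treats 1 (that is, 2^0) as a valid power of two and rejects values written with a leading zero.

```shell
# Hypothetical helper, not part of OpenPixi.
# Succeeds (exit status 0) when $1 is a positive power of two.
is_power_of_two() {
    case $1 in
        # Reject empty input, non-digits, zero, and leading zeros.
        ''|*[!0-9]*|0*) return 1 ;;
    esac
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Example: check a candidate grid size or node count.
for n in 64 100; do
    if is_power_of_two "$n"; then
        echo "$n is a power of two"
    else
        echo "$n is not a power of two"
    fi
done
```

The same check applies to the number of grid cells in each direction and to the number of nodes.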
First of all, you need to install the IBIS framework, which is used for communication. The IBIS framework is available on the download page of the openpixi project. To download and unzip the package on a Linux machine, run the following commands from a terminal:
wget --no-check-certificate https://github.com/downloads/openpixi/openpixi/ipl-standalone.zip
unzip ipl-standalone.zip
To install IBIS, navigate to the directory containing ipl-2.3-standalone.jar and run
mvn install:install-file -Dfile=ipl-2.3-standalone.jar -DgroupId=org.openpixi.pixi -DartifactId=ipl -Dversion=2.3 -Dpackaging=jar
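To confirm that the install step worked, you can look for the jar in your local Maven repository. The sketch below builds the expected path from the coordinates used above. It assumes the default repository location ~/.m2/repository (this can be changed in Maven's settings.xml), and the function name maven_artifact_path is hypothetical.

```shell
# Hypothetical helper: print where Maven stores an installed artifact,
# assuming the default local repository under $HOME/.m2/repository.
# Arguments: groupId artifactId version [packaging, default jar]
maven_artifact_path() {
    # Maven turns the dots of the groupId into directory separators.
    group_dir=$(printf '%s' "$1" | tr '.' '/')
    echo "$HOME/.m2/repository/$group_dir/$2/$3/$2-$3.${4:-jar}"
}

# Example: print the expected location of the IBIS jar installed above.
maven_artifact_path org.openpixi.pixi ipl 2.3
```

If the printed file exists (for example, check it with ls), the artifact is installed and the distributed profile should be able to resolve it.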
If you are developing in Eclipse, you might see errors in the pom file that need to be resolved first. If the errors concern build-helper-maven-plugin (the full message is: "Plugin execution not covered by lifecycle configuration: org.codehaus.mojo:build-helper-maven-plugin"), hover over the words underlined with the red error marker; Eclipse should offer a quick fix to install the "m2e connector for build-helper-maven-plugin".
By default, compilation of the distributed version is turned off in the pom file. The distributed version is compiled only under the Maven profile distributed. More about Maven profiles can be found on the Maven project site.
To use the distributed profile from the command line, run
mvn compile -P distributed
To use the distributed profile from Eclipse, go to Project -> Properties -> Maven and enter distributed in the field Active Maven Profiles. Afterwards, Eclipse will automatically compile the sources of the distributed version.
To run the distributed version of pixi, you first have to start the IBIS IPL server. This is done by running the ipl-server script, which is located in the scripts directory openpixi/pixi/scripts. The ipl server utility allows the different computing nodes to discover each other. If an error occurred in the middle of a calculation and some nodes did not disconnect from the IBIS IPL server, it is safer to kill the server and restart it so that no hanging nodes remain.
Once the ipl server is running, start pixi either in several terminals on one computer or on a cluster with several computers. To run distributed pixi, use the following command from the openpixi/pixi directory:
./scripts/run <numOfNodes> <iplServer>
where
- numOfNodes is the number of nodes taking part in the distributed calculation. (This is not necessarily the number of computers, as you can easily start multiple processes on one computer; it is the total number of distributed pixi processes.)
- iplServer is the address of the computer running the ipl server utility
If you would like to try the distributed version on your own computer, you can do so with the following steps:
- start three terminals and navigate to the directory openpixi/pixi
- run ipl-server in one of the terminals
- in the remaining two terminals, run the script ./scripts/run 2 localhost. Alternatively, you can run it from your IDE by running the main class org.openpixi.pixi.distributed.ui.MainProfile -numOfNodes 2 -iplServer localhost two times.
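If you prefer a single terminal, the steps above can be approximated with background jobs. The following is only a sketch under several assumptions: the function name start_local_cluster and the DRY_RUN switch are hypothetical, the two-second pause is a guess at how long the server needs to start, and the final kill may not stop a Java process that the ipl-server script launches as a child. It must be called from the openpixi/pixi directory.

```shell
# Hypothetical helper: start the ipl server and $1 local pixi processes.
# With DRY_RUN=1 the commands are only printed, not executed.
start_local_cluster() {
    num_nodes=$1

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "./scripts/ipl-server"
        i=0
        while [ "$i" -lt "$num_nodes" ]; do
            echo "./scripts/run $num_nodes localhost"
            i=$((i + 1))
        done
        return 0
    fi

    ./scripts/ipl-server &
    server_pid=$!
    # Assumption: a short pause is enough for the server to accept nodes.
    sleep 2

    node_pids=""
    i=0
    while [ "$i" -lt "$num_nodes" ]; do
        ./scripts/run "$num_nodes" localhost &
        node_pids="$node_pids $!"
        i=$((i + 1))
    done

    # Wait for the nodes only (word splitting of the pid list is intended),
    # then stop the server so that no hanging nodes stay registered.
    wait $node_pids
    kill "$server_pid"
}

# Example: print what would be started for two nodes.
DRY_RUN=1 start_local_cluster 2
```

Separate terminals remain the easier option when you want to read the output of each node on its own.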
To run on the VSC cluster, you first need to set the Maven and Java home variables in your .bashrc file. You can set them, for example, as follows:
# Verify the used paths on your own
export M2_HOME=/opt/sw/maven2
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$JAVA_HOME/bin:$PATH
For the variables to take effect, you have to log out and log in again. Secondly, install the IBIS framework as described above. Then download pixi with git clone git://github.com/openpixi/openpixi.git and compile it on the command line (for example with the script compile). Finally, to run the application on the VSC cluster, first start the ipl-server script and then run the following command from the openpixi/pixi directory (on VSC 2, run vsc2-run with the same parameters):
./scripts/vsc-run <numOfHosts> <numOfProcesses>
where
- numOfHosts is the number of computers you would like to use
- numOfProcesses is the number of processes you would like to start on each computer
Following the restrictions stated at the beginning, numOfHosts * numOfProcesses has to be a power of two.
For example, running vsc-run 2 4 would start 8 pixi processes, 4 on each of the 2 host computers.
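Before submitting a job, it can help to compute the total number of processes and confirm that it satisfies the restriction. The sketch below does this in POSIX shell; the function name check_vsc_layout is hypothetical and it does not call vsc-run itself.

```shell
# Hypothetical helper: print numOfHosts * numOfProcesses and fail
# when the product is not a power of two.
# Arguments: numOfHosts numOfProcesses (plain positive integers)
check_vsc_layout() {
    total=$(( $1 * $2 ))
    # A power of two has exactly one bit set, so total & (total - 1) is zero.
    if [ "$total" -gt 0 ] && [ $(( total & (total - 1) )) -eq 0 ]; then
        echo "$total"
    else
        echo "numOfHosts * numOfProcesses = $total is not a power of two" >&2
        return 1
    fi
}

# Example: the layout used above, 2 hosts with 4 processes each.
check_vsc_layout 2 4
```

A layout such as 3 hosts with 2 processes each gives 6 processes and is rejected.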
The results are collected in files named out.HOSTS.PROCESSES.JOB_ID.HOSTNAME.PROCESS where
- HOSTS is the number of computers used
- PROCESSES is the number of OpenPixi processes started on each computer
- JOB_ID is the id assigned to the job by VSC
- HOSTNAME is the name of the computer on which the calculation took place
- PROCESS distinguishes among the multiple processes started on one computer
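When collecting results from many files, the naming scheme above can be split back into its fields. The following sketch uses only POSIX parameter expansion. The function name parse_result_name and the sample file name are hypothetical, and the code assumes that HOSTNAME is the only field that may itself contain dots. It does not validate names with missing fields.

```shell
# Hypothetical helper: split out.HOSTS.PROCESSES.JOB_ID.HOSTNAME.PROCESS
# into its fields. Assumes only HOSTNAME may contain dots.
parse_result_name() {
    rest=${1#out.}
    # The name must start with the "out." prefix.
    [ "$rest" != "$1" ] || return 1

    # Peel off the first three fields from the left.
    hosts=${rest%%.*};  rest=${rest#*.}
    procs=${rest%%.*};  rest=${rest#*.}
    job_id=${rest%%.*}; rest=${rest#*.}

    # The last field is PROCESS; whatever remains is HOSTNAME.
    process=${rest##*.}
    host_name=${rest%.*}

    echo "hosts=$hosts processes=$procs job_id=$job_id hostname=$host_name process=$process"
}

# Example with a made-up file name.
parse_result_name out.2.4.12345.r10n05.3
# prints: hosts=2 processes=4 job_id=12345 hostname=r10n05 process=3
```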
If you want to run OpenPixi on a cluster other than VSC 1 or VSC 2, you will most probably need to adjust the variables used in the scripts vsc-run and vsc-distribute to match the setup of the cluster in question.
All tests of the distributed simulation compare its results with those of the non-distributed simulation. There are two ways to run the tests:
- Run the class org.openpixi.pixi.distributed.TrueDistSimTest -numOfNodes NUMBER -iplServer SERVER from NUMBER terminals, or from VSC by modifying the main class in the run script. (You also have to start the ipl server.)
- Run the class org.openpixi.pixi.distributed.ComplexDistSimTest without any parameters, which runs the distributed simulation using threads. The advantage of this test is that you can run it comfortably from your IDE. (You do not need to start the ipl server; the test starts it automatically.)
The above tests exercise the distributed version under a variety of settings. If you would like to test the application under your own settings, modify the settings specified in the class org.openpixi.pixi.distributed.ComplexDistSimTest and then run it.
You can profile the distributed version (collect time measurements) in the same way as the non-distributed version. The only difference is that at the end of the simulation you get the measurements from the class DistributedProfileInfo, which adds times specific to the distributed version, such as network waiting times. The measurements are displayed automatically if you run the class MainProfile.
As with the non-distributed version, you can get useful particle movement information by compiling the application with the "aspectj-debug" profile. The distributed version adds information about particles that are exchanged among neighboring nodes.