Getting Started with Distributed Version

This manual is intended for more experienced users who want to build the distributed version of OpenPixi from source. Basic knowledge of Maven, or a willingness to learn it, is a great advantage. The sections below cover the restrictions of the distributed version and how to compile, run, test, profile, and trace it.

Restrictions

Currently, there are a few restrictions on the distributed version: the number of grid cells in the x and y directions, as well as the number of nodes among which the computation is distributed, must be powers of two.
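
The restriction can be checked before a run is launched. The following is a minimal sketch and not part of OpenPixi: the function names is_power_of_two and check_settings are hypothetical, and the values in the usage note are made up.

```shell
# Hypothetical helper, not part of OpenPixi: succeeds if $1 is a power of two.
is_power_of_two() {
    # A positive integer is a power of two exactly when it has a single bit
    # set, in which case n & (n - 1) is zero.
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

# Takes the grid cells in x, the grid cells in y, and the number of nodes.
check_settings() {
    for value in "$@"; do
        if ! is_power_of_two "$value"; then
            echo "not a power of two: $value"
            return 1
        fi
    done
    echo "ok"
}
```

For example, check_settings 64 32 4 prints ok, while check_settings 64 48 4 reports that 48 is not a power of two.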

How do I compile the distributed version?

First, you need to install the IBIS framework, which is used for communication. The IBIS framework is available on the download page of the OpenPixi project. To download and unzip the package on a Linux machine, run the following commands in a terminal:

wget --no-check-certificate https://github.com/downloads/openpixi/openpixi/ipl-standalone.zip
unzip ipl-standalone.zip

To install IBIS, navigate to the directory that contains ipl-2.3-standalone.jar and run

mvn install:install-file -Dfile=ipl-2.3-standalone.jar -DgroupId=org.openpixi.pixi -DartifactId=ipl -Dversion=2.3 -Dpackaging=jar
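
To confirm that the install worked, you can look for the jar in the local Maven repository. Maven stores an installed artifact under the groupId (with dots turned into directories), then the artifactId and the version, by default below ~/.m2/repository. The sketch below only computes that relative path. The function name artifact_path is hypothetical, and the repository root may differ if your Maven settings override it.

```shell
# Hypothetical helper: prints the path of an artifact relative to the root of
# the local Maven repository (by default ~/.m2/repository).
artifact_path() {
    group_id=$1
    artifact_id=$2
    version=$3
    packaging=$4
    # Maven turns the dots of the groupId into directory separators.
    group_dir=$(printf '%s' "$group_id" | tr '.' '/')
    printf '%s/%s/%s/%s-%s.%s\n' \
        "$group_dir" "$artifact_id" "$version" "$artifact_id" "$version" "$packaging"
}

# Coordinates taken from the install command above.
artifact_path org.openpixi.pixi ipl 2.3 jar
```

With the default repository location, the jar should then be found at ~/.m2/repository/ followed by the printed path.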

If you are developing under Eclipse, you might see errors in the pom file that need to be resolved first. If the errors concern build-helper-maven-plugin (the full error message is: "Plugin execution not covered by lifecycle configuration: org.codehaus.mojo:build-helper-maven-plugin"), hover over the words underlined with the red error marker; Eclipse should offer a quick fix to install "m2e connector for build-helper-maven-plugin".

By default, compilation of the distributed version is turned off in the pom file. The distributed version is only compiled under the Maven profile distributed. More about Maven profiles can be found on the Maven project site.

To use the distributed profile from the command line, run

mvn compile -P distributed

To use the distributed profile from Eclipse, go to Project -> Properties -> Maven and enter distributed in the field Active Maven Profiles. Afterwards, Eclipse will automatically compile the sources of the distributed version.

How do I run the distributed version?

To run the distributed version of pixi, first start the IBIS IPL server by running the ipl-server script located in the scripts directory, openpixi/pixi/scripts. The IPL server lets the computing nodes discover each other. If an error occurs in the middle of a calculation and some nodes do not disconnect from the IBIS IPL server, it is safer to kill the server and restart it so that no hanging nodes remain.

Once the IPL server is running, start pixi either in several terminals on one computer or on a cluster with several computers. To run distributed pixi, use the following command from the openpixi/pixi directory:

./scripts/run <numOfNodes> <iplServer>

where

  • numOfNodes is the number of nodes taking part in the distributed calculation. (This is the total number of distributed pixi processes, not necessarily the number of computers, since you can start multiple processes on one computer.)
  • iplServer is the address of the computer running the IPL server

A) How do I run the distributed version on a single computer?

If you would like to try the distributed version on your own computer, follow these steps:

  • start three terminals and navigate to the directory openpixi/pixi in each
  • run the ipl-server script in one of the terminals
  • in each of the remaining two terminals, run ./scripts/run 2 localhost. Alternatively, you can run the main class org.openpixi.pixi.distributed.ui.MainProfile twice from your IDE with the arguments -numOfNodes 2 -iplServer localhost.
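
The steps above can be summarized by a small helper that prints the command for each terminal. This is a sketch under assumptions: the function name print_local_plan is hypothetical, and the server path ./scripts/ipl-server is inferred from the script location given earlier (openpixi/pixi/scripts). The function only prints commands and does not start anything.

```shell
# Hypothetical helper: prints one command per terminal for a local run with
# $1 nodes. It only prints; nothing is started.
print_local_plan() {
    num_of_nodes=$1
    # Terminal 1 runs the IPL server (path assumed from the scripts directory).
    echo "terminal 1: ./scripts/ipl-server"
    i=1
    while [ "$i" -le "$num_of_nodes" ]; do
        # Every node gets the same arguments: total node count and server address.
        echo "terminal $((i + 1)): ./scripts/run $num_of_nodes localhost"
        i=$((i + 1))
    done
}

print_local_plan 2
```

Every node is started with the same two arguments, so a run with more local processes differs only in the node count and the number of terminals.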

B) How do I run the distributed version on a VSC cluster?

First, you need to set the Maven and Java home variables in your .bashrc file. You can set them, for example, as follows:

# Verify the used paths on your own
export M2_HOME=/opt/sw/maven2
export M2=$M2_HOME/bin
export PATH=$M2:$PATH
export JAVA_HOME=/usr/lib/jvm/java
export PATH=$JAVA_HOME/bin:$PATH

For the variables to take effect, you have to log out and log in again. Secondly, install the IBIS framework as described above. Then download pixi with git clone git://github.com/openpixi/openpixi.git and compile it on the command line (for example, with the compile script). Finally, to run the application on the VSC cluster, first start the ipl-server script and then run the following command from the openpixi/pixi directory (on VSC 2, run vsc2-run with the same parameters):

./scripts/vsc-run <numOfHosts> <numOfProcesses>

where

  • numOfHosts is the number of computers you would like to use
  • numOfProcesses is the number of processes you would like to start on each host

In line with the restrictions stated at the beginning, numOfHosts * numOfProcesses has to be a power of two.

For example, running vsc-run 2 4 would start 8 pixi processes, 4 on each of the 2 host computers.
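
The arithmetic can be checked before a job is submitted. The sketch below is not part of the OpenPixi scripts, and the function name total_processes is hypothetical. It multiplies the two arguments and rejects totals that are not powers of two.

```shell
# Hypothetical helper: prints the total number of pixi processes started by
# "vsc-run <numOfHosts> <numOfProcesses>", or fails if the total is not a
# power of two.
total_processes() {
    total=$(( $1 * $2 ))
    # Power-of-two test: exactly one bit set means total & (total - 1) is zero.
    if [ "$total" -le 0 ] || [ $(( total & (total - 1) )) -ne 0 ]; then
        echo "invalid: $1 hosts x $2 processes = $total" >&2
        return 1
    fi
    echo "$total"
}

# The example above: 2 hosts with 4 processes each.
total_processes 2 4
```

A call such as total_processes 3 4 fails, because 12 processes cannot satisfy the power-of-two restriction.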

The results are collected in files named out.HOSTS.PROCESSES.JOB_ID.HOSTNAME.PROCESS where

  • HOSTS is the number of computers used
  • PROCESSES is the number of OpenPixi processes started on each computer
  • JOB_ID is the ID that VSC assigned to the job
  • HOSTNAME is the name of the computer on which the calculation took place
  • PROCESS distinguishes among the multiple processes started on one computer
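
When many result files accumulate, it can help to split a file name back into these fields. The following is a minimal sketch: the function name parse_result_name and the example file name are made up, and it assumes that only HOSTNAME may contain dots.

```shell
# Hypothetical helper: splits a result file name of the form
# out.HOSTS.PROCESSES.JOB_ID.HOSTNAME.PROCESS into labelled fields.
# Assumes HOSTS, PROCESSES, JOB_ID and PROCESS contain no dots;
# HOSTNAME may contain dots.
parse_result_name() {
    rest=${1#out.}
    hosts=${rest%%.*};     rest=${rest#*.}
    processes=${rest%%.*}; rest=${rest#*.}
    job_id=${rest%%.*};    rest=${rest#*.}
    # What remains is HOSTNAME.PROCESS: the last field is the process,
    # everything before it is the host name.
    process=${rest##*.}
    host_name=${rest%.*}
    echo "hosts=$hosts processes=$processes job_id=$job_id hostname=$host_name process=$process"
}

# Made-up file name, for illustration only.
parse_result_name out.2.4.12345.node07.3
```

The leading fields are peeled off from the left and the process index from the right, so a host name with dots in it stays intact.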

If you want to run OpenPixi on a cluster other than VSC 1 or VSC 2, you will most probably need to adjust the variables used in the scripts vsc-run and vsc-distribute to match the setup of that cluster.

How do I test the distributed version?

All tests of the distributed simulation compare its results with those of the non-distributed simulation. There are two ways to run the tests:

  1. Run the class org.openpixi.pixi.distributed.TrueDistSimTest -numOfNodes NUMBER -iplServer SERVER from NUMBER terminals, or from VSC by modifying the main class in the run script. (You also have to start the IPL server.)

  2. Run the class org.openpixi.pixi.distributed.ComplexDistSimTest without any parameters; it runs the distributed simulation using threads. The advantage of this test is that you can run it comfortably from your IDE. (You do not need to start the IPL server; the test starts it automatically.)

The tests above exercise the distributed version under a variety of settings. If you would like to test the application under your own settings, modify the settings specified in the class org.openpixi.pixi.distributed.ComplexDistSimTest and then run it.

How do I profile the distributed version?

You can profile the distributed version (collect time measurements) in the same way as the non-distributed version. The only difference is that at the end of the simulation you get the measurements from the class DistributedProfileInfo, which adds times specific to the distributed version, such as network waiting times. The measurements are displayed automatically if you run the class MainProfile.

How do I trace the particle movement?

As with the non-distributed version, you can get useful particle movement information by compiling the application with the "aspectj-debug" profile. The distributed version adds information about particles that are exchanged among neighboring nodes.
