Running HPX on Hermione - STEllAR-GROUP/hpx GitHub Wiki

Available Software

Hermione comes with a set of preinstalled software for use with HPX, most notably Boost and hwloc. You will find the installed libraries and headers in the default system directories or under /opt.

Running

To manage multiple users, Hermione uses the SLURM batch scheduler. Hermione is a heterogeneous cluster with a total of 39 compute nodes. To give convenient access to the different node types, the following partitions exist:

  • primary: All compute nodes
  • tycho: Special purpose node with a K20 and Intel Xeon Phi 7120P accelerator
  • ariel: Two nodes with two Intel Xeon CPU E5-2690s (16 cores in total) with 64 GB of RAM each
  • beowulf: 16 nodes with two Intel Xeon CPU X3430s (4 cores in total) with 12 GB of RAM each
  • lyra: Two nodes with eight AMD Opteron Processor 8431s (48 cores in total) with 96 GB of RAM each
  • marvin: 16 nodes (marvin00-marvin15) with two Intel Xeon CPU E5-2450s (16 cores in total) with 48 GB of RAM each. Half of these nodes have hyper-threading turned on; to select a specific hyper-threading setup, use the marv_ht or marv_noht partitions below.
  • marv_ht: 8 of the marvin nodes with hyper-threading on (marvin00-marvin07)
  • marv_noht: 8 of the marvin nodes with hyper-threading off (marvin08-marvin15)
  • trillian: Two nodes with four AMD Opteron Processor 6272s (64 cores in total) with 128 GB of RAM each
  • carson, reno: Two nodes with two Intel Ivy Bridge CPUs (20 cores in total) with 128 GB of RAM each; reno has an NVIDIA K40 and an Intel Xeon Phi 7120P, while carson has two Intel Xeon Phi 7120P accelerators installed. Hyper-threading is on for reno and off for carson.

For more information about the various node types, please visit http://stellar.cct.lsu.edu/resources/hermione-cluster/

Running HPX applications on the compute nodes

HPX applications are run on the compute nodes using the srun command. To run an HPX application, use:

$ srun -p <node-type> -N <number-of-nodes> hpx-application

where <node-type> is one of the partitions mentioned above and <number-of-nodes> is the number of compute nodes you want to use. By default, the HPX application is started with one locality per node and uses all available cores on that node. You can change the number of localities started per node (for example, to account for NUMA effects) by specifying the -n option of srun. We suggest always supplying -n <number-of-instances>, even if the number of instances is one (1). The number of cores per locality can be set with -c.

Examples:

The following examples assume that you have successfully built HPX on Hermione and that your current working directory is the HPX install directory.

Running hello_world on one of the trillian nodes on all cores:

$ srun -p trillian -N 1 ./bin/hello_world

Running hello_world on two marvin nodes on all cores:

$ srun -p marvin -N 2 ./bin/hello_world

Running hello_world on two marvin nodes, using one locality per NUMA domain:

$ srun -p marvin -N 2 -n 4 -c 8 ./bin/hello_world

Running the MPI version of hello_world on four marvin nodes, using two localities per NUMA domain:

$ salloc -p marvin -N 4 -n 8 -c 8 mpirun ./bin/hello_world

Interactive Shells

To get an interactive development shell on one of the nodes you can issue the following command:

$ srun -p <node-type> -N <number-of-nodes> --pty /bin/bash -l

After the shell has been acquired, you can run your HPX application; by default, it uses all available cores. Note that if you requested one node, you do not need to invoke srun again. However, if you requested more than one node and want to run a distributed application, use srun again from within the shell to start the distributed HPX application. It will use the resources that were requested for the interactive shell.
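For example, a distributed session might look like the following (the partition and node count are illustrative; adjust them to your needs):

$ srun -p marvin -N 2 --pty /bin/bash -l     # acquire a shell on two marvin nodes
$ srun ./bin/hello_world                     # inside that shell: launch across both allocated nodes

The inner srun needs no -p or -N arguments because it inherits the allocation made for the interactive shell.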

Note: Do not use the command above for MPI applications. If you need to test something use:

$ salloc -p <node-type> -N <number-of-nodes> bash -l

Unlike the srun command above, this shell keeps running on the head node; nothing executes on the allocated compute nodes until you issue the actual srun or mpirun. Thus, preferably just run your application directly with

$ salloc <slurm-args> mpirun <application-path>    # For MPI applications
$ salloc <slurm-args> srun <application-path>      # For anything else

Scheduling Batch Jobs

The methods described above are fine for development purposes. The disadvantage of srun is that it only returns once the application has finished, which may not be appropriate for longer-running applications (for example, benchmarks or larger-scale simulations). To cope with that limitation, you can use the sbatch command.

sbatch expects a script that it runs once the requested resources are available. To request resources, add #SBATCH comments to your script or pass the necessary parameters to sbatch directly. The parameters are the same as for srun. The commands in the script are the same ones you would use to start your application from an interactive shell.

Example batch script:

example.sbatch:

#!/bin/bash
#SBATCH -p marvin -N 2 
srun -p marvin -N 2 ./bin/hello_world

command to schedule example:

$ sbatch example.sbatch
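sbatch accepts further options beyond the partition and node count. A slightly fuller job script might look like the following sketch; the job name, output file name, and time limit are illustrative assumptions, not site requirements:

#!/bin/bash
#SBATCH -p marvin              # partition to run in
#SBATCH -N 2                   # number of nodes
#SBATCH -J hello_world         # job name (illustrative)
#SBATCH -o hello_world.%j.out  # stdout file; %j expands to the job id
#SBATCH -t 00:10:00            # wall-clock time limit (illustrative)
srun ./bin/hello_world

Once the job is scheduled, squeue shows its state in the queue and scancel <jobid> removes it.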

Module files

To handle the different compilers and libraries on Hermione, you can use module files to set the required environment variables. For more details, see http://modules.sourceforge.net/man/modulefile.html

Usage

  • module avail - Lists all available modules
  • module load gcc/5.2 - Loads the gcc 5.2 compiler in this terminal
  • module unload gcc/5.2 - Unloads the gcc 5.2 compiler in this terminal
  • module list - Lists all modules loaded in this terminal
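Putting these commands together, a session might start by loading a compiler before building or running anything. This is a sketch only: the module names and versions actually installed on Hermione may differ, so check module avail first.

$ module avail                 # see which compilers/libraries are installed
$ module load gcc/5.2          # load a compiler (version is an example)
$ module list                  # confirm what is loaded in this terminal

Because modules only affect the current terminal, remember to load the same modules inside batch scripts or interactive shells on the compute nodes.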