Setting up a new machine (with MPITrampoline) - CliMA/slurm-buildkite GitHub Wiki

While CliMA code should work with any MPI implementation, we have found that MPITrampoline gives the most stable results on clusters. MPITrampoline is a compatibility layer for MPI: it routes your Julia program's MPI calls to the underlying MPI library.

This how-to guide describes how to set up a new machine using MPITrampoline. Every machine is different, so treat this as a general guide; following it may not give the best possible performance on your system.

What is needed

  • An MPI implementation (e.g., OpenMPI)
  • CUDA
  • A C/Fortran toolchain
  • Julia

Sketch of steps

  1. If you already have an MPI implementation, check that the implementation is CUDA-aware. If using OpenMPI, you can run ompi_info | grep -i cuda. If your MPI implementation uses UCX, UCX has to be CUDA-aware too.
  2. If you don't have an MPI implementation, download one and build it from source, making sure CUDA support is enabled at configure time.
  3. Download MPIwrapper and compile it against your MPI implementation, following the build steps below.
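
The CUDA-awareness check from step 1 can be sketched as a small script. This is a minimal sketch, assuming OpenMPI: it looks for the mpi_built_with_cuda_support parameter that ompi_info reports, which is more specific than grepping for "cuda". The helper name has_cuda_support is illustrative, and the UCX part assumes ucx_info is on your PATH and lists its configure flags with -v.

```shell
#!/bin/sh
# Succeeds if the ompi_info output on stdin reports CUDA support.
has_cuda_support() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info --parsable --all | has_cuda_support; then
        echo "OpenMPI: built with CUDA support"
    else
        echo "OpenMPI: NOT built with CUDA support"
    fi
else
    echo "ompi_info not found; is OpenMPI on your PATH?"
fi

# Only relevant if your MPI uses UCX: look for CUDA in the UCX build flags.
if command -v ucx_info >/dev/null 2>&1; then
    ucx_info -v | grep -i cuda || echo "UCX: no CUDA flag found in build configuration"
fi
```

Other MPI implementations expose this information differently, so consult their documentation if you are not using OpenMPI.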

Steps for building MPIwrapper:

1. Clone:

git clone https://github.com/eschnett/MPIwrapper.git
cd MPIwrapper

2. Create and configure the build directory:

mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/mpiwrapper-install
  • You can change CMAKE_INSTALL_PREFIX to any install location.
  • By default, cmake will try to detect your system MPI. You can force one with -DMPI_C_COMPILER=/path/to/mpicc if needed.
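
If several MPI installations are visible on the machine, it may help to point every MPI-related hint at the same one. The sketch below assumes MPIwrapper locates MPI through CMake's standard FindMPI module (whose variables include MPI_C_COMPILER, MPI_CXX_COMPILER, MPI_Fortran_COMPILER, and MPIEXEC_EXECUTABLE). The helper name mpi_cmake_args and the directory /opt/openmpi/bin are hypothetical, and the compiler wrapper names may differ on your system.

```shell
#!/bin/sh
# Print one cmake argument per line, all pointing at the same MPI bin directory.
# $1: directory that contains mpicc, mpicxx, mpifort and mpiexec (hypothetical layout).
mpi_cmake_args() {
    mpi_bin_dir=$1
    printf '%s\n' \
        "-DMPI_C_COMPILER=$mpi_bin_dir/mpicc" \
        "-DMPI_CXX_COMPILER=$mpi_bin_dir/mpicxx" \
        "-DMPI_Fortran_COMPILER=$mpi_bin_dir/mpifort" \
        "-DMPIEXEC_EXECUTABLE=$mpi_bin_dir/mpiexec"
}

# Show the arguments that would be passed for an example location.
mpi_cmake_args /opt/openmpi/bin

# Possible use from the build directory (relies on word splitting,
# so the path must not contain spaces):
#   cmake .. -DCMAKE_INSTALL_PREFIX="$HOME/mpiwrapper-install" $(mpi_cmake_args /opt/openmpi/bin)
```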

3. Compile and install:

make -j
make install

This installs into $HOME/mpiwrapper-install (or the prefix you chose). After installation you should see:

$HOME/mpiwrapper-install/bin/mpiwrapperexec
$HOME/mpiwrapper-install/lib/libmpiwrapper.so   (or lib64/)

4. Set environment variables:

Set the following, where mpi_trampoline_root is the prefix where you installed MPIwrapper (use lib instead of lib64 if that is where libmpiwrapper.so was installed):

export JULIA_MPI_HAS_CUDA="true"
export MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
export MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"
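
Because the library may land in either lib/ or lib64/, the exports above can be wrapped in a small detection step. This is a minimal sketch, assuming the default install prefix used earlier in this guide; the helper name find_mpiwrapper_lib is illustrative.

```shell
#!/bin/sh
# Print the path to libmpiwrapper.so under the install prefix given as $1,
# checking lib64/ first and then lib/. Returns non-zero if neither exists.
find_mpiwrapper_lib() {
    for dir in lib64 lib; do
        if [ -f "$1/$dir/libmpiwrapper.so" ]; then
            printf '%s\n' "$1/$dir/libmpiwrapper.so"
            return 0
        fi
    done
    return 1
}

# Assumption: MPIwrapper was installed with the prefix used earlier in this guide.
mpi_trampoline_root="${mpi_trampoline_root:-$HOME/mpiwrapper-install}"

if lib=$(find_mpiwrapper_lib "$mpi_trampoline_root"); then
    export JULIA_MPI_HAS_CUDA="true"
    export MPITRAMPOLINE_LIB="$lib"
    export MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"
    echo "Using $MPITRAMPOLINE_LIB"
else
    echo "libmpiwrapper.so not found under $mpi_trampoline_root" >&2
fi
```

To keep the exported variables in your current shell, source the script rather than executing it.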

Add the following to your Julia preferences:

[preferences.MPIPreferences]
_format = "1.0"
binary = "MPItrampoline_jll"

This usually goes in ~/.julia/prefs/MPIPreferences.toml, which may need to be created if it doesn't exist.

5. Run and test:

When you use MPITrampoline, launch your code through the wrapper's mpiexec, where <np> is the number of MPI processes:

$MPITRAMPOLINE_MPIEXEC -n <np> julia ... 

Here is a test snippet that can be run from the command-line:

$MPITRAMPOLINE_MPIEXEC -n 2 julia -e 'using MPI; MPI.Init(); println("Hello from rank $(MPI.Comm_rank(MPI.COMM_WORLD))"); MPI.Finalize()'
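
A common symptom of a mismatched launcher and MPI library is that every process reports rank 0. The sketch below wraps the test above and checks that the number of distinct ranks matches the number of processes requested. It assumes MPITRAMPOLINE_MPIEXEC is set as described earlier and that julia is on your PATH; the helper name count_distinct_ranks is illustrative.

```shell
#!/bin/sh
# Read "Hello from rank N" lines on stdin and print how many distinct ranks appear.
count_distinct_ranks() {
    sed -n 's/^Hello from rank \([0-9][0-9]*\)$/\1/p' | sort -un | wc -l | tr -d ' '
}

np=2
if [ -n "${MPITRAMPOLINE_MPIEXEC:-}" ] && command -v julia >/dev/null 2>&1; then
    got=$("$MPITRAMPOLINE_MPIEXEC" -n "$np" julia -e \
        'using MPI; MPI.Init(); println("Hello from rank $(MPI.Comm_rank(MPI.COMM_WORLD))"); MPI.Finalize()' \
        | count_distinct_ranks)
    if [ "$got" = "$np" ]; then
        echo "OK: saw $got distinct ranks"
    else
        echo "Expected $np distinct ranks but saw $got; check the launcher and library settings" >&2
    fi
else
    echo "Skipping: MPITRAMPOLINE_MPIEXEC is unset or julia is not on PATH"
fi
```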