Setting up a new machine (with MPITrampoline) - CliMA/slurm-buildkite GitHub Wiki
Setting up a new machine (with MPITrampoline)
While CliMA codes should work with any implementation of MPI, we found that MPITrampoline provides the most stable results on clusters. MPITrampoline is a compatibility layer for MPI that routes your Julia program's MPI calls to the underlying MPI library.
This how-to guide describes how to set up a new machine using MPITrampoline. Each machine is unique, so treat this as a general guide; it might not lead to the best performance.
What is needed
- An MPI implementation (e.g., OpenMPI)
- CUDA
- A C/Fortran toolchain
- Julia
Sketch of steps
- If you already have an MPI implementation, check that it is CUDA-aware. If using OpenMPI, you can run
ompi_info | grep -i cuda
If your MPI implementation uses UCX, UCX has to be CUDA-aware too.
- If you don't have an MPI implementation, you can grab one and compile it, making sure that you are compiling a CUDA-aware implementation.
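The CUDA-awareness check can be wrapped in a small script. This is a minimal sketch, assuming OpenMPI: it relies on the `mpi_built_with_cuda_support` parameter reported by `ompi_info --parsable --all`, and on `ucx_info -v` printing the flags UCX was configured with. The helper name `ompi_output_is_cuda_aware` is hypothetical, and other MPI implementations will need a different check.

```shell
# Return 0 if the ompi_info output read from stdin reports CUDA support.
# Hypothetical helper name; the parameter name is the one OpenMPI prints.
ompi_output_is_cuda_aware() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

# Only query the tools if they are actually on PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info --parsable --all | ompi_output_is_cuda_aware; then
        echo "OpenMPI: CUDA-aware"
    else
        echo "OpenMPI: not CUDA-aware"
    fi
fi

if command -v ucx_info >/dev/null 2>&1; then
    # ucx_info -v prints the configure line; look for a CUDA flag in it.
    if ucx_info -v | grep -qi -- '--with-cuda'; then
        echo "UCX: configured with CUDA"
    else
        echo "UCX: no CUDA flag found in the configure line"
    fi
fi
```

If either check reports no CUDA support, rebuilding that component with CUDA enabled is likely needed before continuing.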
- Download MPIwrapper and compile it with your implementation of MPI:
Steps for building MPIwrapper:
1. Clone:
git clone https://github.com/eschnett/MPIwrapper.git
cd MPIwrapper
2. Create and configure the build directory:
mkdir build && cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/mpiwrapper-install
- You can change CMAKE_INSTALL_PREFIX to any install location.
- By default, cmake will try to detect your system MPI. You can force one with -DMPI_C_COMPILER=/path/to/mpicc if needed.
3. Compile and install:
make -j
make install
This will install into $HOME/mpiwrapper-install (or whichever prefix you set). After installation you should see:
$HOME/mpiwrapper-install/bin/mpiwrapperexec
$HOME/mpiwrapper-install/lib/libmpiwrapper.so (or lib64/)
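Because the library may land in either lib/ or lib64/, it can help to locate it programmatically instead of guessing. The following is a minimal sketch; the function name `find_mpiwrapper_lib` is hypothetical and not part of MPIwrapper.

```shell
# Print the path of libmpiwrapper.so under the given install prefix,
# checking lib64/ first and then lib/. Returns 1 if neither has it.
find_mpiwrapper_lib() {
    prefix="$1"
    for dir in lib64 lib; do
        if [ -f "$prefix/$dir/libmpiwrapper.so" ]; then
            echo "$prefix/$dir/libmpiwrapper.so"
            return 0
        fi
    done
    echo "libmpiwrapper.so not found under $prefix" >&2
    return 1
}
```

For example, `find_mpiwrapper_lib "$HOME/mpiwrapper-install"` prints the full path of the library, which can then be used as the value of MPITRAMPOLINE_LIB.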
4. Set environment variables:
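Set the following, where mpi_trampoline_root is the path where you installed MPIwrapper (for example, $HOME/mpiwrapper-install). Use lib instead of lib64 if that is where libmpiwrapper.so was installed: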
export JULIA_MPI_HAS_CUDA="true"
export MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
export MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"
Add the following to your Julia preferences:
[preferences.MPIPreferences]
_format = "1.0"
binary = "MPItrampoline_jll"
This usually goes in ~/.julia/prefs/MPIPreferences.toml, which may need to be created if it doesn't exist.
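Creating the preferences file can be scripted. The sketch below writes the snippet from this guide to a path you pass in and refuses to overwrite an existing file; the function name `write_mpi_prefs` is hypothetical, and the file contents are copied verbatim from above.

```shell
# Write the preferences snippet to the given file unless it already exists.
# Hypothetical helper; the TOML contents are the ones shown in this guide.
write_mpi_prefs() {
    prefs_file="$1"
    if [ -e "$prefs_file" ]; then
        echo "$prefs_file already exists, leaving it untouched" >&2
        return 0
    fi
    # Create the parent directory (e.g. ~/.julia/prefs) if it is missing.
    mkdir -p "$(dirname "$prefs_file")"
    cat > "$prefs_file" <<'EOF'
[preferences.MPIPreferences]
_format = "1.0"
binary = "MPItrampoline_jll"
EOF
}
```

A typical call would be `write_mpi_prefs "$HOME/.julia/prefs/MPIPreferences.toml"`. If the file already exists, merge the three lines into it by hand.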
5. Run and test:
When you use MPITrampoline, you can run your code with
$MPITRAMPOLINE_MPIEXEC -n <np> julia ...
Here is a test snippet that can be run from the command-line:
$MPITRAMPOLINE_MPIEXEC -n 2 julia -e 'using MPI; MPI.Init(); println("Hello from rank $(MPI.Comm_rank(MPI.COMM_WORLD))"); MPI.Finalize()'
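If the test fails to launch, a common cause is an environment variable pointing at the wrong location. The following pre-flight check is a minimal sketch; the function name `check_mpitrampoline_env` is hypothetical, and it only verifies that the two paths exist, not that the library actually works.

```shell
# Return 0 if both MPITrampoline variables point at usable files,
# printing a message to stderr for each one that does not.
check_mpitrampoline_env() {
    rc=0
    lib="${MPITRAMPOLINE_LIB:-}"
    launcher="${MPITRAMPOLINE_MPIEXEC:-}"
    if [ ! -f "$lib" ]; then
        echo "MPITRAMPOLINE_LIB is unset or not a file: '$lib'" >&2
        rc=1
    fi
    if [ ! -f "$launcher" ] || [ ! -x "$launcher" ]; then
        echo "MPITRAMPOLINE_MPIEXEC is unset or not executable: '$launcher'" >&2
        rc=1
    fi
    return "$rc"
}
```

Running `check_mpitrampoline_env` before the test command above gives a clearer error than a failed launch when a path is wrong.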