Marenostrum 5 GPU - loganoz/horses3d GitHub Wiki
Compiler:
**NVFORTRAN** (last checked: 01/06/2025)
We need the NVIDIA HPC SDK, which provides the nvfortran compiler, to generate code for the Fortran/OpenACC implementation in Horses3D.
List of modules needed to compile and run:
```shell
module purge
module load nvidia-hpc-sdk/24.11
module load metis/5.1.0-gcc
```
Then, to compile the NS (compressible Navier-Stokes) solver:

```shell
make ns COMPILER=nvfortran COMM=PARALLEL WITH_METIS=YES
```

or, to compile the MU (multiphase) solver:

```shell
make mu COMPILER=nvfortran COMM=PARALLEL WITH_METIS=YES
```
If a single-GPU run is to be used, the last two options (`COMM=PARALLEL` and `WITH_METIS=YES`) can be omitted.
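Before compiling, it can be worth verifying that the tools provided by the modules are actually on the `PATH`. A minimal sketch (the `check_tool` helper is our own illustration, not part of Horses3D or MN5):

```shell
# Hypothetical helper: report whether a required build tool is on PATH
check_tool() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

# Tools needed for the parallel GPU build described above
check_tool nvfortran
check_tool mpirun
```

If either tool is reported missing, re-check that the modules listed above loaded without errors.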
Slurm scripts
An example Slurm script is included below. Two scripts are needed to use MN5 properly: the first sets up the job, and the second sets the GPU affinity within the node. The queue and account details must be specified in the Slurm script for proper submission to MN5:
```shell
#!/bin/bash
### Job name on queue
#SBATCH --job-name=MN5_Horses3D_GPU
### Output and error files directory
#SBATCH -D .
### Output and error files
#SBATCH --output=out%j.out
#SBATCH --error=err%j.err
### Run configuration
#SBATCH --ntasks=1            # Total number of MPI ranks (one per GPU)
#SBATCH --ntasks-per-node=1   # Up to 4; more than 4 ranks requires multiple nodes
#SBATCH --cpus-per-task=20    # MN5 standard: 20 CPUs per GPU
#SBATCH --time=00:10:00
#SBATCH --gres=gpu:1          # Number of GPUs per node - up to 4
### Queue and account
#SBATCH --qos=acc_ehpc
#SBATCH --account=ehpc175

### MN5 modules
module purge
module load nvidia-hpc-sdk/24.11
module load metis/5.1.0-gcc

EXEC=PATH_TO_HORSES_EXECUTABLE

# For parallel runs
mpirun -np 16 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh $EXEC CASE_FILE.control

# For serial runs
#srun --unbuffered $EXEC CASE_FILE.control
```
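The `--map-by ppr:4:node:PE=20` option places 4 ranks per node and pins 20 cores to each rank, so the required node count follows from the total rank count. A quick sanity check of that arithmetic, using the values from the mpirun line above:

```shell
np=16   # total MPI ranks (mpirun -np)
ppr=4   # ranks per node (ppr:4:node)
pe=20   # cores pinned per rank (PE=20)

nodes=$(( (np + ppr - 1) / ppr ))   # round up to whole nodes
cores_per_node=$(( ppr * pe ))

echo "nodes=$nodes"                   # 16 ranks / 4 per node = 4 nodes
echo "cores_per_node=$cores_per_node" # 4 ranks * 20 cores = 80 cores per node
```

The `#SBATCH` resource requests should be kept consistent with these numbers (e.g. `--ntasks`, `--ntasks-per-node`, and `--gres=gpu:N` matching the ranks and GPUs actually used by mpirun).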
The `mn5_bind.sh` script controls the GPU, network, and memory affinity of each rank and should be defined as follows:
```shell
#!/bin/bash
case ${OMPI_COMM_WORLD_LOCAL_RANK} in
  0)
    export CUDA_VISIBLE_DEVICES=0
    export OMPI_MCA_btl_openib_if_include=mlx5_0:1
    export UCX_NET_DEVICES=mlx5_0:1
    numactl --membind=0 "$@"
    ;;
  1)
    export CUDA_VISIBLE_DEVICES=1
    export UCX_NET_DEVICES=mlx5_1:1
    export OMPI_MCA_btl_openib_if_include=mlx5_1:1
    numactl --membind=1 "$@"
    ;;
  2)
    export CUDA_VISIBLE_DEVICES=2
    export UCX_NET_DEVICES=mlx5_4:1
    export OMPI_MCA_btl_openib_if_include=mlx5_4:1
    numactl --membind=2 "$@"
    ;;
  3)
    export CUDA_VISIBLE_DEVICES=3
    export UCX_NET_DEVICES=mlx5_5:1
    export OMPI_MCA_btl_openib_if_include=mlx5_5:1
    numactl --membind=3 "$@"
    ;;
esac
```
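The same per-rank mapping can be written more compactly with a lookup array; a sketch of the idea (illustrative only, the case-based script above is the one referenced by the mpirun line):

```shell
#!/bin/bash
# Table-driven version of the mn5_bind.sh mapping (sketch).
# Local rank N gets GPU N, NUMA node N, and the NIC from the table below.
rank=${OMPI_COMM_WORLD_LOCAL_RANK:-0}
nics=(mlx5_0:1 mlx5_1:1 mlx5_4:1 mlx5_5:1)

export CUDA_VISIBLE_DEVICES=$rank
export UCX_NET_DEVICES=${nics[$rank]}
export OMPI_MCA_btl_openib_if_include=${nics[$rank]}

# In the real script this line would launch the solver:
#   numactl --membind=$rank "$@"
echo "rank $rank -> GPU $CUDA_VISIBLE_DEVICES, NIC $UCX_NET_DEVICES"
```

Whichever variant is used, remember to make the binding script executable (`chmod +x mn5_bind.sh`) so that mpirun can launch it.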
Adapted from: https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/Documentation/Running/mn5-run
To submit the job:

```shell
sbatch run.sh
```