Building WindNinja with Docker and Using It in an HPC Environment - firelab/windninja GitHub Wiki

Creating a WindNinja Docker Image

Get WindNinja Source

Clone the WindNinja source code to your local machine.

git clone https://github.com/firelab/windninja.git 

Install Docker

Install Docker on your system (the command below is for Debian/Ubuntu).

sudo apt install docker.io

Build the WindNinja Docker Image

Note: The most up-to-date Dockerfile is found in the master branch here. Additionally, if you do not need OpenFOAM (required only for the momentum solver), remove lines 82-97.

Navigate to the source directory and build the Docker image. This builds the image with the tag windninja:latest. See the Dockerfile for more.

cd ~/src/wind/windninja 
docker build -t windninja:latest .   

After the build is complete, confirm the image is available; it will be listed as windninja. (If your user is not in the docker group, prefix Docker commands with sudo.)

sudo docker images

Run the WindNinja Docker Image

Note: While the CLI works, the GUI will require additional configuration as it needs access to a display.
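If you do need the GUI, one common approach is to forward the host's X11 display into the container. This is only a sketch, assuming a Linux host running an X server; the exact flags vary by setup and desktop environment.

```shell
# Allow local containers to connect to the host X server
xhost +local:docker

# Pass the display and the X11 socket through to the container
sudo docker run -it \
  -e DISPLAY="$DISPLAY" \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  windninja /bin/bash
```

On Wayland-only systems or remote hosts, additional configuration (such as X forwarding over SSH) is needed.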

Run the Docker image with an interactive bash shell.

sudo docker run -it windninja /bin/bash

Ensure the CLI was built properly. For more information on using the CLI, see here.

WindNinja_cli

Running WindNinja on an HPC Cluster

Build Singularity Image

Convert the Docker image to a Singularity image.

singularity build windninja_latest.sif docker-daemon://windninja:latest
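If the build host cannot reach a local Docker daemon (for example, when Docker runs on a different machine than Singularity), a common workaround is to go through a tar archive instead. A sketch; the archive file name is an assumption:

```shell
# On the Docker machine: export the image to a tar archive
sudo docker save -o windninja_latest.tar windninja:latest

# Copy the archive to the build host, then build the .sif from it
singularity build windninja_latest.sif docker-archive://windninja_latest.tar
```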

Example Slurm Script (windninja_minimal.sbatch)

This script dynamically handles the task distribution using a loop inside srun.

#!/bin/bash
#SBATCH --job-name=WindNinja_Minimal
#SBATCH --output=windninja_%j.log
#SBATCH --error=windninja_%j.err
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=10
#SBATCH --mem-per-cpu=2000

### === USER-DEFINED PATHS === ###
WINDNINJA_SIF="/mnt/store2/temp/windninja_latest.sif"    # Path to .sif container
SHARED_STORAGE="/mnt/store2/temp/CONUS"                  # Input/Output directory (shared across nodes)
LOCAL_DIR="/data"                                        # Temporary working directory (local to node)

### === CREATE LOCAL DIRS ON ALL NODES === ###
srun --exclusive mkdir -p ${LOCAL_DIR}
srun --exclusive rsync -av --progress ${WINDNINJA_SIF} ${LOCAL_DIR}/
srun --exclusive rsync -av --progress ${SHARED_STORAGE}/run.sh ${LOCAL_DIR}/
srun --exclusive rsync -av --progress ${SHARED_STORAGE}/base_cli.cfg ${LOCAL_DIR}/
echo ".sif file and scripts copied to all compute nodes."

### === DEFINE WHICH FOLDERS TO RUN === ###
task_queue=()
FOLDERS=(0 1 2 3)  # <<< USER: Update this list based on folder names

for folder in "${FOLDERS[@]}"; do
    if [ -d "${SHARED_STORAGE}/${folder}" ]; then
        task_queue+=("$folder")
    fi
done

total_jobs=${#task_queue[@]}
echo "Total folders to process: $total_jobs"

### === PROCESS TASK QUEUE === ###
running_jobs=0
job_index=0

while [ $job_index -lt $total_jobs ] || [ $running_jobs -gt 0 ]; do
    if [ $running_jobs -lt $((SLURM_NNODES * 4)) ] && [ $job_index -lt $total_jobs ]; then
        echo "Launching jobs..."
        for i in {0..3}; do
            if [ $job_index -lt $total_jobs ]; then
                folder="${task_queue[$job_index]}"
                job_index=$((job_index + 1))

                srun --exclusive -N1 -n1 bash -c "
                    cp -r \"${SHARED_STORAGE}/${folder}\" \"${LOCAL_DIR}/\" &&
                    singularity exec -B ${LOCAL_DIR}/${folder}:/output ${LOCAL_DIR}/windninja_latest.sif ${LOCAL_DIR}/run.sh ${folder}
                " &
                ((running_jobs+=1))
            fi
        done
    else
        wait -n
        ((running_jobs-=1))
    fi
done
wait
echo "All tasks completed."

To run this script, use the command:

sbatch windninja_minimal.sbatch

Note: Each WindNinja run is set to use 9 threads in the configuration file (base_cli.cfg), so each simulation requires 10 cores (9 threads + 1 for SLURM overhead). On a compute node with 40 cores, the maximum number of simultaneous simulations is therefore 40 cores ÷ 10 cores per simulation = 4 simulations. Make sure to adjust --ntasks-per-node according to the cores available on your system.
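The sizing rule above can be computed directly. A small sketch, assuming 40 cores per node and the 9-thread setting from base_cli.cfg:

```shell
CORES_PER_NODE=40        # cores available on one compute node
THREADS_PER_SIM=9        # num_threads in base_cli.cfg

CPUS_PER_SIM=$((THREADS_PER_SIM + 1))            # +1 core for SLURM overhead
MAX_PARALLEL=$((CORES_PER_NODE / CPUS_PER_SIM))  # simulations per node

echo "Run up to ${MAX_PARALLEL} simulations per node (--ntasks-per-node=${MAX_PARALLEL})"
```

Recompute MAX_PARALLEL whenever you change num_threads or move to nodes with a different core count.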

Example Run Script (run.sh)

We need to retain all SLURM environment variables and unset only specific ones where needed. The OpenFOAM environment must be sourced for each run (otherwise the momentum solver will not work), and the SLURM_ variables are unset temporarily because they conflict with OpenFOAM, then restored afterwards.

#!/bin/bash

### === USER-DEFINED VARIABLES === ###
FOLDER=$1
LOCAL_DIR="/data"                                # << Must match LOCAL_DIR in the sbatch file
SHARED_STORAGE="/mnt/store2/temp/CONUS"          # << Must match SHARED_STORAGE in the sbatch file
OUTPUT_FOLDER="/output"                          # << Do not change
CONFIG_FILE="${LOCAL_DIR}/base_cli.cfg"
LOG_FILE="${OUTPUT_FOLDER}/simulation.log"

### === ENV SETUP (FOR MOMENTUM SOLVER) === ###
export CPL_DEBUG=NINJAFOAM
source /opt/openfoam8/etc/bashrc
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
export FOAM_USER_LIBBIN=/usr/local/lib/

### === TEMP UNSET SLURM ENV TO AVOID FOAM CONFLICT === ###
SLURM_ENV=$(env | grep ^SLURM_)
unset $(env | grep ^SLURM_ | cut -d= -f1)

echo "Starting WindNinja CLI run for folder ${FOLDER}..."
WindNinja_cli ${CONFIG_FILE} > ${LOG_FILE} 2>&1
EXIT_CODE=$?

# Restore the SLURM environment saved above (line by line, so values containing spaces survive)
while IFS= read -r line; do export "$line"; done <<< "${SLURM_ENV}"

if [ $EXIT_CODE -eq 0 ]; then
    echo "WindNinja CLI run completed for folder ${FOLDER}."
else
    echo "WindNinja CLI run failed for folder ${FOLDER} with exit code $EXIT_CODE."
    exit 1
fi

### === SYNC AND CLEANUP === ###
start_time=$(date +%s)
cp -r "${LOCAL_DIR}/${FOLDER}" "${SHARED_STORAGE}/"
cp "${LOCAL_DIR}/${FOLDER}/simulation.log" "${SHARED_STORAGE}/${FOLDER}/simulation.log"

end_time=$(date +%s)
echo "Execution time = $((end_time - start_time)) seconds"
rm -rf "${LOCAL_DIR}/${FOLDER}"

Example Configuration File (base_cli.cfg)

This is a basic CLI configuration file for WindNinja.

# For the momentum solver, momentum_flag must be uncommented and set to true.
# Values marked with a $ symbol are placeholders replaced per run by a custom Python script;
# set all other values as desired.

num_threads                = 9
momentum_flag              = true
elevation_file             = /output/dem0.tif
initialization_method      = domainAverageInitialization
input_speed                = 5.0
input_speed_units          = mph
input_direction            = 0
input_wind_height          = 10.0
units_input_wind_height    = m
output_wind_height         = 10.0
units_output_wind_height   = m
mesh_resolution            = 120.0
units_mesh_resolution      = m
write_ascii_output         = true
output_path                = /output
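The $-placeholder convention mentioned in the comments can be sketched with a plain sed substitution. This is a hypothetical illustration; the actual replacement script is not shown in this wiki, and the $SPEED/$DIR placeholder names are assumptions.

```shell
# Hypothetical template: $SPEED and $DIR are per-run placeholders
cat > /tmp/base_template.cfg <<'EOF'
num_threads = 9
input_speed = $SPEED
input_direction = $DIR
EOF

# Generate a concrete config for one run by substituting the placeholders
sed -e 's/\$SPEED/5.0/' -e 's/\$DIR/0/' /tmp/base_template.cfg > /tmp/run0.cfg

cat /tmp/run0.cfg
```

In the sbatch workflow above, a per-folder config generated this way would be copied into each folder before the srun launch.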