Running Subjects on a general HPC cluster

These instructions are a stripped-down version of the Running Subjects on Compute Canada instructions. It may be helpful to have a look at that page for inspiration on configuring the pipeline for your HPC system. Note that you will likely need to customize the pipeline to your dataset.

Loading the Dependencies and Singularity Image

Load your dependencies (e.g. AFNI, FreeSurfer) and your Singularity image into your environment. Again, see the Compute Canada documentation for examples from an existing HPC system; Compute Canada uses the module system to load software packages and dependencies.

If you are using the Cam-CAN_dev_CPU or ADNI3_dev_CPU branches, you should not load AFNI and FreeSurfer. These are already included in the corresponding container.
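As a rough sketch only (module names and versions are hypothetical and will differ on your cluster; check module avail for the real ones), loading dependencies on a module-based system might look like:

module load singularity        # or apptainer, depending on your cluster
module load afni               # skip on the *_dev_CPU branches (bundled in the container)
module load freesurfer         # skip on the *_dev_CPU branches (bundled in the container)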

Submitting a Single Subject

Once the Singularity Image has been loaded, there are three options for submitting a single subject:

  1. Running submit_subject.sh with SLURM
  2. Running singularity run
  3. Running bb_pipeline.py from inside the Singularity Image

Please note that if you'd like to re-process a previously processed subject (including a subject whose processing failed), you will need to clear the subject's folder so that it only contains the rawdata subdirectory. The exception to this requirement is reparcellation, in which case the subject folder can be left intact.
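For example, a minimal way to clear a subject folder while keeping only rawdata (assuming the layout described above; double-check the path before running anything destructive) would be:

cd <path/to/dir/containing/subjects>/<subjDir>
find . -mindepth 1 -maxdepth 1 ! -name rawdata -exec rm -rf {} +    # removes everything except rawdata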


1) Running submit_subject.sh on a system with SLURM installed

cd to the directory containing your subjects before submitting your subject with the sbatch command. <subject> and <subjDir> should both be the name of your subject's subdirectory (e.g. sub-0001); the -J flag simply sets the SLURM job name.

cd <path/to/dir/containing/subjects>
sbatch -J <subject> <path/to/submit_subject.sh> <subjDir>
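
For example, for a subject directory named sub-0001 (paths here are hypothetical):

cd /scratch/<user>/subjects
sbatch -J sub-0001 /home/<user>/submission_scripts/submit_subject.sh sub-0001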

The sbatch command makes use of a submit_subject.sh script. Here is an example submit_subject.sh script (modify file paths accordingly):

#!/bin/bash
#SBATCH --account=<account_name>
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=6
#SBATCH --mem=32G
#SBATCH --time=0-9:00
#SBATCH --output=log_%x_%j.o
#SBATCH --error=log_%x_%j.e
	
# ${1} is the subject directory name passed to sbatch; adjust the bind paths (-B) for your system
singularity run --nv -B /scratch -B /cvmfs -B /project <path/to/singularity/image/file> ${1} <path/to/tvb-ukbb_pipeline>

Note: the TVB-UKBB pipeline requires a P100-type GPU to properly run the CUDA-enabled versions of BEDPOSTX, PROBTRACKX, and EDDY. These are upstream requirements due to the way these FSL libraries are implemented. Please check your HPC system's available nodes and their specifications to see if these are available. If they aren't, consider using the batched CPU version instead.

NOTE: You may need to add additional -B <file_path> arguments to bind the top-level directory containing your Singularity image file. For example, if your image is /home/adife/tvb-ukbb_new.sif, then you should add a -B /home argument to the singularity run command.
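With that hypothetical image path, the run line inside submit_subject.sh would become:

singularity run --nv -B /scratch -B /cvmfs -B /project -B /home /home/adife/tvb-ukbb_new.sif ${1} <path/to/tvb-ukbb_pipeline>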



2) Running singularity run

This method allows you to execute the container directly on a subject without going through SLURM. It will run on the machine you're logged into, so be careful if you're on a login node of your HPC system.

singularity run --nv -B /scratch -B /cvmfs -B /project <path/to/singularity/image/file> <subjDir> <path/to/tvb-ukbb/>

NOTE: As in method 1, you may need to add additional -B <file_path> arguments to bind the top-level directory containing your Singularity image file (e.g. -B /home if your image is /home/adife/tvb-ukbb_new.sif).
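As a concrete (hypothetical) example, running a subject named sub-0001 from the directory that contains it:

cd /scratch/<user>/subjects
singularity run --nv -B /scratch -B /cvmfs -B /project -B /home /home/<user>/tvb-ukbb.sif sub-0001 /home/<user>/tvb-ukbb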


3) Running bb_pipeline.py from inside the Singularity Image

Enter the Singularity shell: this method opens an interactive session in which you use the container as a terminal. From inside the container you can run subjects, test out changes, etc. Keep in mind that an interactive session will likely not complete a full subject if you're running in a timed live allocation.

singularity shell --nv -B /scratch -B /cvmfs -B /project -B ~/.Xauthority <path/to/singularity/image/file>

Note: remove the --nv flag if you do not have or do not plan to use a GPU.

Initialize environment variables with:

. <path/to/tvb-ukbb/init_vars>

cd to the directory containing your subject's directory. <subjDir> below must be the name of the subject's directory only, not the full file path to that directory.

Run the subject with:

python <path/to/tvb-ukbb>/bb_pipeline_tools/bb_pipeline.py <subjDir>
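
Putting the three steps together, an interactive session might look like this (paths and subject name are hypothetical):

singularity shell --nv -B /scratch -B /cvmfs -B /project -B ~/.Xauthority /home/<user>/tvb-ukbb.sif
. /home/<user>/tvb-ukbb/init_vars
cd /scratch/<user>/subjects
python /home/<user>/tvb-ukbb/bb_pipeline_tools/bb_pipeline.py sub-0001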

Batching Multiple Subjects with SLURM

Your best bet for submitting multiple subjects is to combine a simple loop with approach 1), the submit_subject.sh script and SLURM. A script like this:

#!/bin/bash
#
#  This script submits every subject listed in subject_list.txt.
#  Run it from the directory containing all of your subject folders.
#
#  Usage:   ./batch_subjs.sh  <path>/<to>/subject_list.txt
#

# Submit one SLURM job per subject name read from the list file.
while IFS= read -r subjname; do
	sbatch -J "$subjname" /<path>/<to>/submission_scripts/submit_subject.sh "$subjname"		# TO BE MODIFIED BY USER
done < "$1"

would allow you to iterate over a .txt file containing one subject name per line and submit each subject as its own job. Make sure the Singularity image path and pipeline arguments in submit_subject.sh are set correctly, and remember to cd to the directory containing all your subject folders before running the batch script.
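
If your subject folders follow a common naming pattern (sub-* is assumed here; adjust to your data), one simple way to generate subject_list.txt and launch the batch is:

cd <path/to/dir/containing/subjects>
ls -d sub-* > subject_list.txt        # one subject folder name per line
./batch_subjs.sh subject_list.txt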
