# Other Common Analysis Tools

You’ll probably have to learn all of these, but they can be learnt gradually, on an as-needed basis.

## FreeSurfer

  • The FreeSurfer wiki has many tutorials for the different commands in FreeSurfer
  • The most important command is `recon-all`, which manages the full cortical surface reconstruction pipeline
    • This can be broken down into `autorecon1`, `autorecon2`, and `autorecon3` depending on which stage of the pipeline you want to run (or re-run), but `recon-all -all` will run everything (see the sketch after this list)
  • FreeView
    • Visualisation software for FreeSurfer surfaces and outputs
    • Requires prefixing commands with `vglrun` when using the command-line interface
    • Manual edits to cortical surfaces can be conducted within FreeSurfer through the edit mode in FreeView
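
As a minimal sketch of what a typical FreeSurfer session might look like (the subject ID `sub-01` and all paths are hypothetical placeholders):

```bash
# Point FreeSurfer at the directory holding your subjects (path is a placeholder)
export SUBJECTS_DIR=/path/to/freesurfer_subjects

# Run the full cortical surface reconstruction pipeline for one subject
recon-all -subject sub-01 -i /path/to/sub-01_T1w.nii.gz -all

# Re-run only the final stage of the pipeline (e.g., after manual edits)
recon-all -subject sub-01 -autorecon3

# Inspect the outputs in FreeView (note the vglrun prefix on the command line)
vglrun freeview -v $SUBJECTS_DIR/sub-01/mri/T1.mgz \
                -f $SUBJECTS_DIR/sub-01/surf/lh.pial $SUBJECTS_DIR/sub-01/surf/rh.pial
```
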
## FSL

  • The FSL wiki has many tutorials for all the different tools within the library
  • Key tools include:
    • fMRI - FEAT, MELODIC, FABBER, BASIL
    • sMRI - BET, FAST, FIRST, FLIRT, FNIRT
    • dMRI - FDT, TBSS, EDDY, TOPUP
  • Other important software includes FIX and AROMA
    • These aren’t currently included in FSL because they have other dependencies, but they are installed as modules on MASSIVE (see the module-loading sketch after this list)
  • MSM is an important tool used in cortical registration
    • Vanilla MSM is included in FSL, but MSM_HOCR (a revised version of MSM) is not currently included in FSL and needs to be loaded as a separate module
  • FSLeyes
    • Visualisation software for FSL and general neuroimaging outputs
    • Required when inspecting FEAT/MELODIC ICA components for FSL-FIX
  • HCP pipelines
    • A commonly used series of neuroimaging pipelines built around many functions within wb_command and FSL
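
Since tools like FIX, AROMA, and MSM_HOCR live outside FSL as separate modules on MASSIVE, they need to be loaded before use. Loading them looks roughly like the sketch below; the module names and versions are illustrative, so check what is actually available with `module avail`:

```bash
# See which versions of a tool are installed before loading it
module avail fsl

# Load the modules you need for a session (names/versions are illustrative)
module load fsl
module load fix
module load msm
```
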
## SLURM and Job Submission

### Principles and Philosophy

The SLURM Workload Manager coordinates the scheduling of jobs across the cluster to ensure that resources are allocated efficiently on each node. These jobs can range from simple, short commands to complex, long-running scripts.

Sometimes, you might have a particular command or shell script that you want to run without needing a desktop open (e.g., you want it to run overnight, you want to run multiple jobs at once, or you just need more power). To do this, we can send shell scripts as batch jobs directly to the cluster via SLURM.

### Introduction

The command `sbatch <your_job.sh>` will submit the script titled `<your_job.sh>` as a batch job to SLURM. The command `squeue --user $USER` will display the current job queue with the status of your running and pending jobs.
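
For example (the script name `my_job.sh` is just a placeholder):

```bash
# Submit a job script to the cluster; SLURM replies with the assigned job ID
sbatch my_job.sh

# List your running and pending jobs
squeue --user $USER
```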

A basic introduction and list of SLURM commands can be found here.

Providing certain arguments within your job script can also help SLURM allocate the job. These should be placed at the start of the shell script as `#SBATCH --[argument]`. A full list of SBATCH options can be found here, but a few common ones are listed below (a full example script follows the list):

`#SBATCH --account=`

  • This manages which project the job is submitted to. We use kg98

`#SBATCH --job-name=`

  • Naming your jobs can be helpful if you’re submitting multiple jobs at once that do different things

`#SBATCH --time=`

  • The maximum time your job is allowed to run. The job will end normally if your script finishes in less time than specified, but it will be killed if it goes beyond this time

`#SBATCH --ntasks=`

`#SBATCH --cpus-per-task=`

`#SBATCH --mem-per-cpu=`

  • These options control how many CPUs and how much memory are allocated to your job. Requesting higher values can make your job spend more time in the queue, but not requesting enough can make your job fail (e.g., with a segmentation fault or an out-of-memory error)

`#SBATCH --mail-user=`

`#SBATCH --mail-type=`

  • These options will notify you by email when your job successfully finishes or if your job fails
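
Putting these together, a minimal job script might look something like the sketch below; the job name, time limit, resource values, email address, module name, and the final command are all illustrative placeholders (only the kg98 account comes from above):

```bash
#!/bin/bash
#SBATCH --account=kg98
#SBATCH --job-name=recon_sub01
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=8G
#SBATCH --mail-user=your.email@example.com
#SBATCH --mail-type=END,FAIL

# Load whatever software the job needs (module name is illustrative)
module load freesurfer

# The actual work to run on the compute node
recon-all -subject sub-01 -all
```
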
### Advanced techniques

Sometimes you might want to run a particular job for each subject in a dataset. While you can do this with a for loop in bash, the subjects will run serially (one at a time, in sequence), which can be slow in big samples. Instead, you can parallelise these jobs so they run together using a job array (a full example script follows the list below).
  • This can be done in SBATCH with `#SBATCH --array=`, where you specify the indices of the tasks within your array
    • e.g., #SBATCH --array=1-11 if you have 11 subjects to include in your array
  • Next, include a text file containing all of your subject IDs (one per line)
    • subject_list="/path/to/sublist.txt"
  • Finally, allocate each string in sublist.txt to a job within your array
    • subject=$(sed -n "${SLURM_ARRAY_TASK_ID}p" ${subject_list})
    • SLURM sets the SLURM_ARRAY_TASK_ID environment variable automatically for each task, so each task picks out a different line of sublist.txt (keeping a commented-out line like #SLURM_ARRAY_TASK_ID=1 is handy for testing the script outside of an array)

This functions the same as a for loop over your sublist file (e.g., for subject in `cat /path/to/sublist.txt`); however, submitting each subject as a separate batch job will run much faster in most instances.
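
Combining these pieces, a minimal array job script might look like the following sketch; the resource values, paths, and the per-subject command are placeholders:

```bash
#!/bin/bash
#SBATCH --account=kg98
#SBATCH --job-name=recon_array
#SBATCH --time=12:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=8G
#SBATCH --array=1-11

# Text file with one subject ID per line (path is a placeholder)
subject_list="/path/to/sublist.txt"

# SLURM sets SLURM_ARRAY_TASK_ID to a different index (1-11) for each task,
# so each task pulls out a different subject from the list
subject=$(sed -n "${SLURM_ARRAY_TASK_ID}p" ${subject_list})

echo "Processing ${subject} (array task ${SLURM_ARRAY_TASK_ID})"

# Run the per-subject command (illustrative)
recon-all -subject ${subject} -all
```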

### Other notes

Only 1000 jobs per user can be submitted at any one time (this includes currently pending and running jobs, as well as desktop sessions).