Running the Pipeline - JoshLoecker/MAPT GitHub Wiki

Two avenues exist for running the pipeline: Interactive Runs and SLURM Jobs.

Interactive Runs

Interactive runs are useful if you would like to watch the pipeline's output as it runs. However, if the SSH connection to SciNet drops, the pipeline exits immediately. For short jobs, an interactive run is still a convenient way to confirm that the pipeline is running successfully.

Follow this guide from SciNet for help setting up an interactive run. A list of available partitions and queues can be found here.

In short, use the following structure: `srun --pty -p [QUEUE_CHOICE] -t hh:mm:ss -n [TASKS] -N [NODES] /bin/bash -l`. Once you are logged in to a compute node, simply run `snakemake -j all` in the MAPT directory that was downloaded from GitHub. Ensure you have modified all required parameters first.
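For example, a one-hour interactive session might look like the following. The partition name `short` is an assumption for illustration; substitute a queue from SciNet's list:

```shell
# Request an interactive shell: 1 task on 1 node for 1 hour
# on an assumed "short" partition -- replace with your queue choice.
srun --pty -p short -t 01:00:00 -n 1 -N 1 /bin/bash -l

# Once on the compute node, move into the MAPT directory
# and start the pipeline using all cores allocated to the job.
cd MAPT
snakemake -j all
```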

SLURM Jobs

SLURM jobs are the most versatile way to run the pipeline. They allow the SSH connection to close while SLURM manages your jobs.

Several SciNet guides are available on how to create a SLURM file.

For more SBATCH options, see this guide.
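As a minimal sketch, a SLURM submission script for the pipeline could look like the one below. The job name, partition, time limit, and core count are placeholder assumptions, not values from this wiki; adjust them for your allocation:

```shell
#!/bin/bash
#SBATCH --job-name=mapt-pipeline
#SBATCH --partition=short        # assumed partition; see SciNet's queue list
#SBATCH --time=24:00:00          # wall-clock limit for the job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16       # cores handed to snakemake below

# Run from the directory the job was submitted from
# (assumed to be the MAPT directory downloaded from GitHub).
cd "$SLURM_SUBMIT_DIR"

# Use exactly the cores SLURM allocated to this job.
snakemake -j "$SLURM_CPUS_PER_TASK"
```

Submit the script with `sbatch your_script.sh`, and check its status with `squeue -u $USER`. Because SLURM owns the job, you can safely close your SSH session afterwards.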

Dry Runs

Dry runs can be performed in either Interactive Runs or SLURM Jobs.
A dry run shows which steps in the pipeline still need to be completed and verifies that the preliminary configuration is set up correctly.
A dry run does not execute any component of the pipeline.

To perform a dry run, follow these steps:

  1. Activate your conda environment (Step 4)
  2. Call snakemake with the dry-run flag: `snakemake -j 1 -n`
    a. `snakemake`: start Snakemake
    b. `-j 1` (or `--cores 1`): use one core for the dry run. We do not want to bog down the login node on the cluster, and more cores will not speed this process up
    c. `-n` (or `--dry-run`): the dry-run flag for Snakemake
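Together, the steps above amount to two commands. The environment name `mapt` is an assumption for illustration; activate whatever environment Step 4 created:

```shell
# Step 1: activate the conda environment (name "mapt" is assumed)
conda activate mapt

# Step 2: dry run -- list the rules that would execute, without running them
snakemake -j 1 -n
```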

This will output a fair amount of information showing which rules need to be completed. The pipeline can then be run with SLURM by following the example scripts above.

Return to Wiki Homepage
Continue to Updating the Pipeline