Toil - umccr/aws_parallel_cluster GitHub Wiki
Toil
This guide assumes that you have read the shared file systems page and the Slurm page. You may also wish to read the official Toil docs for more information.
The variables set below only exist in the shell where you define them, so run all of the following steps inside a single screen session.
Create Toil directories on the shared file system
Create the following directories on your shared filesystem mount (probably /efs, but it could also be /fsx). Your SHARED_DIR environment variable should already be set in your ~/.bashrc.
TOIL_JOB_STORE="${SHARED_DIR}/toil/job-store"
TOIL_WORKDIR="${SHARED_DIR}/toil/workdir"
TOIL_TMPDIR="${SHARED_DIR}/toil/tmpdir"
TOIL_LOG_DIR="${SHARED_DIR}/toil/logs"
TOIL_OUTPUTS="${SHARED_DIR}/toil/outputs"
mkdir -p "${TOIL_JOB_STORE}"
mkdir -p "${TOIL_WORKDIR}"
mkdir -p "${TOIL_TMPDIR}"
mkdir -p "${TOIL_LOG_DIR}"
mkdir -p "${TOIL_OUTPUTS}"
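If SHARED_DIR is unset, the commands above silently create /toil/... paths at the filesystem root (or fail on permissions). As a minimal sketch, the same steps can be wrapped in a function that refuses to run without SHARED_DIR. The name setup_toil_dirs is hypothetical and not part of Toil.

```shell
# setup_toil_dirs: hypothetical helper, not part of Toil.
# Sets the TOIL_* variables used later in this guide and creates the
# directories, but only when SHARED_DIR is set and non-empty.
setup_toil_dirs() {
  if [ -z "${SHARED_DIR:-}" ]; then
    echo "SHARED_DIR is not set; check your ~/.bashrc" >&2
    return 1
  fi
  TOIL_JOB_STORE="${SHARED_DIR}/toil/job-store"
  TOIL_WORKDIR="${SHARED_DIR}/toil/workdir"
  TOIL_TMPDIR="${SHARED_DIR}/toil/tmpdir"
  TOIL_LOG_DIR="${SHARED_DIR}/toil/logs"
  TOIL_OUTPUTS="${SHARED_DIR}/toil/outputs"
  mkdir -p "${TOIL_JOB_STORE}" "${TOIL_WORKDIR}" "${TOIL_TMPDIR}" \
    "${TOIL_LOG_DIR}" "${TOIL_OUTPUTS}"
}
```

Calling setup_toil_dirs once in your screen session leaves the TOIL_* variables defined in that shell, just like the individual commands above.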
Activate env
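You must activate the conda env before submitting, since conda activate doesn't work in a non-interactive shell such as the one that runs the job.
By default, sbatch passes the submitting shell's environment variables through to the job, so the activated env carries over.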
conda activate toil
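Because the job inherits the environment of the submitting shell, it can help to confirm that the runner is actually reachable before calling sbatch. A small sketch follows; check_on_path is a hypothetical helper name and not part of conda or Toil.

```shell
# check_on_path: hypothetical helper, not part of conda or Toil.
# Returns 0 if the named command can be found on PATH, 1 otherwise.
check_on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "$1 not found on PATH; did you run 'conda activate toil'?" >&2
  return 1
}
```

For example, running check_on_path toil-cwl-runner after activating the env should return without printing anything.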
Running Toil
In the --output and --error paths below, %j is replaced by the Slurm job ID.
sbatch --job-name "<name-of-my-workflow>" \
--output "${TOIL_LOG_DIR}/toil.%j.log" \
--error "${TOIL_LOG_DIR}/toil.%j.log" \
--partition "copy-long" \
--no-requeue \
--wrap "\
toil-cwl-runner \
--jobStore \"${TOIL_JOB_STORE}/\${SLURM_JOB_ID}.log\" \
--workDir \"${TOIL_WORKDIR}\" \
--outdir \"${TOIL_OUTPUTS}\" \
--writeLogs \"${TOIL_LOG_DIR}\" \
--batchSystem slurm \
--disableCaching true \
--cleanWorkDir=onSuccess \
\"my-cwl-tool.cwl\" \
\"my-cwl-tool.input.yaml\""
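The quoting inside --wrap decides when each variable is expanded. Unescaped variables such as ${TOIL_JOB_STORE} are expanded by the submitting shell, while the escaped \${SLURM_JOB_ID} is left untouched for the job's own shell, where Slurm sets it. The sketch below reproduces this without sbatch; DEMO_STORE, wrapped, and the job ID 12345 are made-up values for illustration only.

```shell
# DEMO_STORE and wrapped are illustrative names; 12345 is a made-up job ID.
DEMO_STORE="/efs/toil/job-store"

# ${DEMO_STORE} is expanded now; \${SLURM_JOB_ID} stays literal in the string.
wrapped="echo \"${DEMO_STORE}/\${SLURM_JOB_ID}\""
printf '%s\n' "${wrapped}"
# prints: echo "/efs/toil/job-store/${SLURM_JOB_ID}"

# Simulate the job's shell, where SLURM_JOB_ID is set in the environment.
SLURM_JOB_ID=12345 sh -c "${wrapped}"
# prints: /efs/toil/job-store/12345
```

This is why each submission gets its own job store path even though the sbatch command line is identical every time.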