Configuration - PacificBiosciences/pypeFLOW GitHub Wiki
Job submission options (pypeflow>=2.0.0)
You are probably using pypeFLOW via FALCON. You can learn about FALCON pypeflow-configuration in the FALCON wiki.
The most important configuration is for job-distribution, via one of our "Process-Watchers".
Here, we provide some useful job-submission strings for various grid-computing systems.
"blocking" process-watcher
This process-watcher is the easiest to configure, by far. You have full control over how jobs are submitted, via the submit string.
The following variables will be substituted into your string (based on conventions in PacBio's pbsmrtpipe):
${JOB_SCRIPT} -- the shell command to run (aka CMD)
${JOB_NAME} -- job-name selected by pypeflow (not the id generated by qsub, e.g.)
${JOB_STDOUT} -- path to write stdout (aka STDOUT_FILE)
${JOB_STDERR} -- path to write stderr (aka STDERR_FILE)
${NPROC} -- number of processors per job
${MB} -- maximum MegaBytes of RAM per processor (the total is $(expr ${NPROC} \* ${MB}))
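As a quick sanity check of the memory arithmetic: the total RAM a job may use is NPROC times MB. The values below are hypothetical stand-ins; pypeflow substitutes the real ones into your submit string.

```shell
# Hypothetical values -- pypeflow supplies the real NPROC and MB.
NPROC=4      # processors per job
MB=4000      # MegaBytes of RAM per processor
TOTAL_MB=$(expr ${NPROC} \* ${MB})   # total RAM for the whole job
echo "${TOTAL_MB}"   # 16000
```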
local
To force everything to run locally, just "submit" to "bash":
pwatcher_type = blocking
submit = /bin/bash -c "${JOB_SCRIPT}" > "${JOB_STDOUT}" 2> "${JOB_STDERR}"
Or to combine everything into the top stderr/stdout:
pwatcher_type = blocking
job_queue = bash -C ${CMD}
# By dropping STD*_FILE, we see all output on the console.
# That helps debugging in TravisCI/Bamboo.
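To see what the blocking process-watcher does with that submit string, you can run the same command by hand. The job script below is a hypothetical stand-in for one pypeflow would generate; the final line is exactly what the submit string expands to.

```shell
# Stand-in for a pypeflow-generated job script (hypothetical content).
JOB_SCRIPT=$(mktemp)
printf 'echo hello-from-job\n' > "${JOB_SCRIPT}"
chmod +x "${JOB_SCRIPT}"
JOB_STDOUT=job.stdout
JOB_STDERR=job.stderr

# What the "submit" string expands to:
/bin/bash -c "${JOB_SCRIPT}" > "${JOB_STDOUT}" 2> "${JOB_STDERR}"

cat "${JOB_STDOUT}"   # hello-from-job
```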
SGE/qsub
This should be familiar to anyone who uses qsub regularly.
submit = qsub -S /bin/bash -sync y -V -q myqueue \
-N ${JOB_NAME} \
-o "${JOB_STDOUT}" \
-e "${JOB_STDERR}" \
-pe smp ${NPROC} \
-l h_vmem=${MB}M \
"${JOB_SCRIPT}"
The -sync y makes it a blocking call.
(Note: -l h_vmem= and -l mem= are problematic on some systems. YMMV.)
hermit
submit = hermit qsub \
-N ${JOB_NAME} \
-l nprocs=${NPROC}:mem=${MB} \
-v ${env} \
${JOB_SCRIPT}
LSF
submit = bsub -K -q myqueue -J ${JOB_NAME} -o ${JOB_STDOUT} -e ${JOB_STDERR} ${JOB_SCRIPT}
The -K makes it a blocking call.
PBS
Include -W block=T or -W block=true in your qsub submit string to make it a blocking call.
(If you cannot use blocking mode, then you might be able to rely on qdel ${JOB_NUM}, which we will try to fill in from the result of qsub. This is experimental, as we cannot test PBS ourselves.)
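A full PBS submit string might then look like the following. This is only a sketch: the resource flags (-l nodes=...:ppn=..., -l mem=...) are typical of PBS/Torque but vary between clusters, so check your site's qsub documentation.

```
submit = qsub -S /bin/bash -W block=true -V -q myqueue \
    -N ${JOB_NAME} \
    -o "${JOB_STDOUT}" \
    -e "${JOB_STDERR}" \
    -l nodes=1:ppn=${NPROC} \
    -l mem=${MB}mb \
    "${JOB_SCRIPT}"
```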
Slurm/sbatch
Try using srun instead of sbatch.
submit = srun --wait=0 -p myqueue \
-J ${JOB_NAME} \
-o ${JOB_STDOUT} \
-e ${JOB_STDERR} \
--mem-per-cpu=${MB}M \
--cpus-per-task=${NPROC} \
${JOB_SCRIPT}
Other possible flags (and maybe via sbatch):
--time=3-0
--ntasks 1 --exclusive
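If srun is not suitable on your cluster, sbatch --wait also blocks until the job finishes (in sufficiently recent Slurm releases). A sketch, mirroring the srun example above:

```
submit = sbatch --wait -p myqueue \
    -J ${JOB_NAME} \
    -o ${JOB_STDOUT} \
    -e ${JOB_STDERR} \
    --mem-per-cpu=${MB}M \
    --cpus-per-task=${NPROC} \
    ${JOB_SCRIPT}
```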
Other configurables
Running on a local disk
use_tmpdir = true
# Or if you want a specific root directory,
use_tmpdir = /scratch
But note that running in a tmpdir can hinder debugging, since intermediate files land on the local disk rather than in your run directory. It is better to enable this only as an optimization, after everything else is working.