Simple SLURM Job - JoshLoecker/MAPT GitHub Wiki
Below is a very simple SLURM script. Each #SBATCH line in it sets one of the following options:

- #SBATCH
  a. --job-name: Defines a job name of "Simple Job"
  b. -p: Defines the partition as debug. A list of available partitions and their capabilities can be seen here
  c. -N: Defines the number of nodes in this job
  d. -n: Defines the total number of tasks (called threads on this page) in this job
     - -N 1 -n 2 is two threads on one node
     - -N 2 -n 2 is two threads split between two nodes
  e. -t: The time limit for your job, in the format hh:mm:ss
  f. --mail-user: The email address that should receive notifications
  g. --mail-type: Which notification emails you would like to receive
     - Possible types include: BEGIN, END, FAIL, TIME_LIMIT, TIME_LIMIT_##
     - BEGIN: Email at start of job
     - END: Email at end of job
     - FAIL: Email if job fails
     - TIME_LIMIT: Email at time limit
     - TIME_LIMIT_##: Email when the job reaches ## % of its time limit (SLURM accepts 50, 80, and 90)
  h. -o, -e: Where output from the job should reside. Defining the same name for both of these will place output (-o) and errors (-e) in the same file. The file name may contain the following patterns:
     - %%: The character "%".
     - %A: Master job allocation number.
     - %a: Job array ID (index) number.
     - %J: jobid.stepid of the running job (e.g. "128.0").
     - %j: jobid of the running job.
     - %N: Short hostname. This will create a separate IO file per node.
     - %n: Node identifier relative to current job (e.g. "0" is the first node of the running job). This will create a separate IO file per node.
     - %s: stepid of the running job.
     - %t: Task identifier (rank) relative to current job. This will create a separate IO file per task.
     - %u: User name.
     - %x: Job name.
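As an illustration of how several of the options above combine, here is a minimal sketch of a second job header. The job name, email address, and file names are placeholders, and it assumes the debug partition allows a two-node job. The srun fallback is only there so the script can also be tried with plain bash outside of SLURM.

```shell
#!/bin/bash
#SBATCH --job-name="two_node_job"
#SBATCH -p debug
# Two tasks split between two nodes
#SBATCH -N 2
#SBATCH -n 2
#SBATCH -t 00:01:00
# Placeholder address; replace with your own
#SBATCH --mail-user="user@example.com"
# Email only when the job ends or fails
#SBATCH --mail-type=END,FAIL
# %x expands to the job name and %j to the job ID, so job 128 would
# write two_node_job_128.out and two_node_job_128.err
#SBATCH -o "%x_%j.out"
#SBATCH -e "%x_%j.err"

# Inside a job, srun launches the command once per task (-n).
# Outside of SLURM, fall back to running the command once.
if command -v srun >/dev/null 2>&1; then
    launcher="srun"
else
    launcher=""
fi

# Print the name of the node each task runs on
$launcher uname -n
```

Because the output and error patterns differ, this job keeps regular output and errors in separate files, and each submission gets its own pair of files.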
Using the above information with the SLURM file below, the following is occurring.
We are submitting a job with the name "Simple Job" (--job-name) on the debug partition (-p). It will have 1 node (-N) with 1 thread (-n). The time limit for this job is 30 seconds (-t 00:00:30). No user will be notified when this job starts, ends, fails, etc. (--mail-user is empty and --mail-type is NONE). Output (-o) and errors (-e) will be placed in the same file, simple_job_output.
The command we are running is date, which simply shows the current date/time.
Any commands that can be run on the command line can be used in a SLURM file.
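To make that last point concrete, here is a small sketch of a job whose body runs several ordinary commands. The job name, output file name, and the report_job function are made up for this example. The SLURM_* variables are set by SLURM inside a running job; the "n/a" fallbacks only appear when the script is run outside of SLURM.

```shell
#!/bin/bash
#SBATCH --job-name="env_report"
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:00:30
#SBATCH -o "env_report_output"
#SBATCH -e "env_report_output"

# Hypothetical helper: print what SLURM allocated for this job.
report_job() {
    echo "job name: ${SLURM_JOB_NAME:-n/a}"
    echo "job id: ${SLURM_JOB_ID:-n/a}"
    echo "nodes: ${SLURM_JOB_NUM_NODES:-n/a}"
    echo "tasks: ${SLURM_NTASKS:-n/a}"
}

# Ordinary shell commands, exactly as they would be typed at a prompt
report_job
date
uname -n
```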
If you would like to test the simple job script for yourself, copy and paste the code chunk below into a file (with any name) on SciNet, then submit the job with sbatch FILE_NAME:
#!/bin/bash
#SBATCH --job-name="Simple Job"
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:00:30
#SBATCH --mail-user=""
#SBATCH --mail-type=NONE
#SBATCH -o "simple_job_output"
#SBATCH -e "simple_job_output"
date
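The copy, paste, and submit steps can also be done from the command line. The sketch below writes the script above to a file and submits it only if sbatch is available. The file name simple_job.sh is an arbitrary choice, and the squeue call is an optional extra that lists your queued and running jobs.

```shell
#!/bin/bash
# Arbitrary file name; any name works
job_file="simple_job.sh"

# Write the job script; the quoted 'EOF' keeps the contents literal
cat > "$job_file" <<'EOF'
#!/bin/bash
#SBATCH --job-name="Simple Job"
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:00:30
#SBATCH --mail-user=""
#SBATCH --mail-type=NONE
#SBATCH -o "simple_job_output"
#SBATCH -e "simple_job_output"
date
EOF

if command -v sbatch >/dev/null 2>&1; then
    # Submit the job, then list this user's jobs
    sbatch "$job_file"
    squeue -u "$USER"
else
    echo "sbatch not found; run this on a system with SLURM installed"
fi
```

Once the job has finished, the output of date can be read with cat simple_job_output in the directory the job was submitted from.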