Simple SLURM Job - JoshLoecker/MAPT GitHub Wiki

Below is a very simple SLURM script. Its options are as follows:

  1. #SBATCH: Each line beginning with #SBATCH passes one option to SLURM.

    a. --job-name: Defines a job name of "Simple Job"
    b. -p: Defines the partition as debug. A list of available partitions and their capabilities can be seen here
    c. -N: Defines the number of nodes in this job
    d. -n: Defines the total number of tasks (called threads on this page) in this job, across all nodes, not per node.

    1. -N 1 -n 2 is two threads on one node
    2. -N 2 -n 2 is two threads split between two nodes

    e. -t: The time limit for your job, in the format hh:mm:ss
    f. --mail-user: The email address that should receive notifications
    g. --mail-type: Which events should trigger an email.

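    1. Possible types include: BEGIN, END, FAIL, TIME_LIMIT, TIME_LIMIT_## (NONE, used in the script below, disables email)
    2. BEGIN: Email at start of job
    3. END: Email at end of job
    4. FAIL: Email if job fails
    5. TIME_LIMIT: Email when the job reaches its time limit
    6. TIME_LIMIT_##: Email when the job reaches ##% of its time limit. SLURM accepts 50, 80, and 90 for ## (for example, TIME_LIMIT_80)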

    h. -o, -e: Where output from the job should reside. Defining the same name for both of these will place output (-o) and errors (-e) in the same file. File names may contain the following replacement patterns:

    1. %%: The character "%".
    2. %A: master job allocation number.
    3. %a: Job array ID (index) number.
    4. %J: jobid.stepid of the running job. (e.g. "128.0")
    5. %j: jobid of the running job.
    6. %N: short hostname. This will create a separate IO file per node.
    7. %n: Node identifier relative to current job (e.g. "0" is the first node of the running job). This will create a separate IO file per node.
    8. %s: stepid of the running job.
    9. %t: task identifier (rank) relative to current job. This will create a separate IO file per task.
    10. %u: User name.
    11. %x: Job name.
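
As a hedged illustration of the options above, the sketch below combines the %x and %j file name patterns with a comma-separated --mail-type list. The job name pattern_demo and the address user@example.com are placeholders chosen for this example and do not come from the page. The body reads SLURM_JOB_ID and SLURM_JOB_NAME, two environment variables that SLURM sets inside a batch job, and falls back to "none" when the script is run outside SLURM.

```shell
#!/bin/bash
#SBATCH --job-name="pattern_demo"         # placeholder name, no spaces so file names stay simple
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:01:00
#SBATCH --mail-user="user@example.com"    # placeholder address
#SBATCH --mail-type=END,FAIL              # email when the job ends or fails
#SBATCH -o "%x_%j.out"                    # e.g. pattern_demo_<jobid>.out
#SBATCH -e "%x_%j.err"                    # errors go to a separate file

# Print the values SLURM substitutes for %j and %x.
describe_job() {
    echo "Job ${SLURM_JOB_ID:-none} (${SLURM_JOB_NAME:-none})"
}

describe_job
```

Because %j differs for every submission, each run of this sketch would write to its own pair of files instead of overwriting the previous output.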

Using the above information with the below SLURM file, the following is occurring.
We are submitting a job with the name "Simple Job" (--job-name) on the debug partition (-p). It will have 1 node (-N) with 1 task (-n). The time limit for this job is 30 seconds (-t 00:00:30). No user will be notified when this job starts, ends, fails, etc., because --mail-user is empty and --mail-type is NONE. Output (-o) and errors (-e) will be placed in the same file, simple_job_output.
The command we are running is date, which simply shows the current date/time.

Any command that can be run on the command line can be used in a SLURM file.
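
To make that concrete, here is a small hypothetical job body, separate from the Simple Job script on this page. The function name report is an assumption made for this sketch; the commands inside it are ordinary shell commands that behave the same way in a job as they do interactively.

```shell
#!/bin/bash
# Hypothetical job body: these lines could take the place of `date`
# beneath the #SBATCH header of a job script.

report() {
    echo "Host: $(uname -n)"            # name of the node running the job
    echo "Started: $(date)"             # the same command the simple job uses
    echo "Files here: $(ls | wc -l)"    # pipes work as they do interactively
}

report
```

Everything these commands print ends up in the file named by -o, since that is where SLURM sends the job's standard output.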

To test the Simple Job script yourself, copy and paste the following code chunk into a file (with any name) on SciNet, then submit the job with sbatch FILE_NAME, where FILE_NAME is the name of that file.

#!/bin/bash
#SBATCH --job-name="Simple Job"
#SBATCH -p debug
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:00:30
#SBATCH --mail-user=""
#SBATCH --mail-type=NONE
#SBATCH -o "simple_job_output"
#SBATCH -e "simple_job_output"

date
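
Once the script above is saved, submitting it and checking on it might look like the following sketch. The file name simple_job.sh and the helper submit_job are assumptions made for this example. The sketch relies on sbatch --parsable, which prints only the job ID (followed by ";" and a cluster name on some systems), and on squeue -j to list a single job.

```shell
#!/bin/bash
# Assumes the job script from this page was saved as simple_job.sh (hypothetical name).

# Submit a script and print only the numeric job ID.
submit_job() {
    local script="$1"
    local out
    out=$(sbatch --parsable "$script") || return 1
    echo "${out%%;*}"    # drop an optional ";cluster" suffix
}

# Only attempt a submission where SLURM and the script are actually present.
if command -v sbatch >/dev/null 2>&1 && [ -f simple_job.sh ]; then
    job_id=$(submit_job simple_job.sh)
    echo "Submitted job ${job_id}"
    squeue -j "${job_id}"    # show the job while it is pending or running
fi

# After the job finishes, its output can be read with:
#   cat simple_job_output
```

The guard around the submission means the sketch does nothing on a machine without SLURM, which makes it safe to try out before running it on SciNet.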