-
Command used (always leave it like this):
-
Defines the partition used to execute this job (always keep it set to public).
-
Defines the job name - it can be any name you want
-
Defines the output file name. %j will add the JOBID to the output_name.o file
#SBATCH -o output_name.o%j
-
Defines the error file name. %j will add the JOBID to the error_name.e file
#SBATCH -e error_name.e%j
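To illustrate how %j expands, here is a small sketch that mimics the substitution with sed. The job ID 12345 is made up for the example; SLURM fills in the real ID itself when the job starts.

```shell
# Hypothetical job ID; SLURM assigns the real one at submission time.
JOB_ID=12345

# Mimic SLURM's %j substitution for the two file-name patterns above.
OUT_FILE=$(echo "output_name.o%j" | sed "s/%j/${JOB_ID}/")
ERR_FILE=$(echo "error_name.e%j" | sed "s/%j/${JOB_ID}/")

echo "$OUT_FILE"   # output_name.o12345
echo "$ERR_FILE"   # error_name.e12345
```

Because the job ID is unique, each run gets its own pair of files and earlier results are not overwritten.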
-
Defines the Quality of Service (QOS) under which the job will be executed. Options: debug, general, large.
-
debug - 2 hour and 2 node limit with high priority
-
general - 72 hour and 560 CPU cores limit with default partition
-
large - 20 node limit, unlimited hours, exclusive jobs allowed, low priority
-
Makes the job exclusive, so no other jobs can share its compute nodes. This is required for all large QOS submissions.
-
Sets the wall-time limit for the job in hh:mm:ss format.
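As a rough sketch, the QOS, exclusive, and wall-time settings for a large QOS job might be combined as follows. The --qos option matches the simple job file later on this page; --exclusive and -t are the standard SLURM options for these settings and are assumed here rather than taken from this page, and the 48-hour limit is only an example value.

```shell
# QOS to run under: debug, general, or large.
#SBATCH --qos large
# Assumed standard SLURM flag: do not share the compute node (required for large).
#SBATCH --exclusive
# Assumed standard SLURM flag: wall-time limit in hh:mm:ss (example value).
#SBATCH -t 48:00:00
```

For a debug job the same fragment would use --qos debug and a limit of at most 02:00:00.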
-
Defines the total number of CPU cores:
-
Defines the number of compute nodes requested:
-
Defines the number of tasks per node:
#SBATCH --ntasks-per-node 28
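The three resource settings above are linked: the total is the number of nodes multiplied by the tasks per node. The sketch below works this out for a made-up request of 2 nodes and prints the matching directives. It assumes one CPU core per task, which is SLURM's default, and uses the -N and -n options that appear in the simple job file later on this page.

```shell
# Hypothetical request: 2 nodes with 28 tasks on each.
NODES=2
TASKS_PER_NODE=28

# Total tasks (one core each by default) is nodes times tasks per node.
TOTAL_TASKS=$((NODES * TASKS_PER_NODE))

# Print the directives that would go in the job file.
echo "#SBATCH -N ${NODES}"
echo "#SBATCH --ntasks-per-node ${TASKS_PER_NODE}"
echo "#SBATCH -n ${TOTAL_TASKS}"
```

Keeping the three values consistent avoids asking SLURM for a combination it cannot satisfy.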
-
Requests the c6320 compute nodes (r420, r720, and r730 compute nodes can also be requested).
-
Sets up email notification.
-
#SBATCH --mail-type=begin
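A possible notification setup is sketched below. The --mail-user option is the standard SLURM way to set the recipient and is an assumption here, since it does not appear on this page; the address is a placeholder. The --mail-type option accepts a comma-separated list of events.

```shell
# Hypothetical address; replace it with your own.
#SBATCH --mail-user=user@example.com
# Send mail when the job begins, ends, or fails.
#SBATCH --mail-type=begin,end,fail
```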
Commands/process to execute on compute node:
-
- To see all available modules, type this in your terminal (NOT the job file):
- To see all loaded modules, type this in your terminal (NOT the job file):
- To remove a module, type this in your terminal (NOT the job file):
- To remove all modules, type this in your terminal (NOT the job file):
-
Copy any files from home directory to scratch or change directory if needed:
cp /home/$USER/file_name /storage/scratch2/euid123/file_name
cd /storage/scratch2/euid123/
-
Copy any files back to home directory if needed:
cp /storage/scratch2/euid123/file_name /home/$USER/file_name
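The copy-in, run, copy-back pattern above can be sketched end to end as follows. So that the sketch runs anywhere, two temporary directories stand in for the home and scratch locations, and a tr command stands in for the real workload; all of these names are hypothetical. On the cluster you would point the variables at /home/$USER and your own scratch directory instead.

```shell
# Hypothetical stand-ins for the home and scratch directories.
HOME_DIR=$(mktemp -d)
SCRATCH_DIR=$(mktemp -d)
echo "input data" > "$HOME_DIR/file_name"

# Stage the input file to scratch.
cp "$HOME_DIR/file_name" "$SCRATCH_DIR/file_name"

# Work inside scratch; stop if the directory is unavailable.
cd "$SCRATCH_DIR" || exit 1

# Placeholder for the real job command: upper-case the input.
tr 'a-z' 'A-Z' < file_name > result_file

# Copy the result back to the home directory.
cp "$SCRATCH_DIR/result_file" "$HOME_DIR/result_file"
```

Quoting the paths and checking the cd keeps the job from writing results into the wrong directory if scratch is missing.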
You do not need every directive shown above. Your job file can be as simple as:
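#!/bin/bash
#SBATCH -J my_example_job
# The ./out directory must already exist; SLURM will not create it.
#SBATCH -o ./out/example_job.o%j
#SBATCH -e my_example_job.e%j
#SBATCH -p public
#SBATCH --qos general
#SBATCH -N 1
#SBATCH -n 1

# Load the software the job needs, then run it.
module load python
python test.py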
Submit a job:
$ sbatch job_template.job
Show detailed information about a job:
$ scontrol show job $JOB_ID
Kill a job. Users can kill their own jobs, root can kill any job.
$ scancel $JOB_ID
Release a held job:
$ scontrol release $JOB_ID
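The $JOB_ID used in these commands is the number sbatch reports when a job is submitted, normally in a message of the form "Submitted batch job <id>". The sketch below captures it in a variable. The helper name extract_job_id is hypothetical, and the guard only prevents errors on machines without SLURM installed.

```shell
# Hypothetical helper: pull the job ID out of sbatch's usual
# "Submitted batch job <id>" message.
extract_job_id() {
    awk '/^Submitted batch job/ {print $4}'
}

if command -v sbatch >/dev/null 2>&1; then
    JOB_ID=$(sbatch job_template.job | extract_job_id)
    scontrol show job "$JOB_ID"    # inspect the job just submitted
    # scancel "$JOB_ID"            # uncomment to kill it
else
    echo "sbatch not found; run this on a cluster login node"
fi
```

Storing the ID this way avoids copying it by hand between commands.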