Queuing System - calab-ntu/gpu-cluster GitHub Wiki

Mapping Queues to Nodes

  1. Edit /var/spool/TORQUE/server_priv/nodes to assign queue names to the target subset of nodes. For example: eureka32 np=16 queue_name1 queue_name2
  2. Show the available queues: qstat -Qf
  3. Restart the server with systemctl restart pbs (or service pbs restart on older systems). It is safe to restart while jobs remain in existing queues, but the restart may take a few minutes to complete.
  4. References: http://www.nordugrid.org/documents/pbs-config.html and http://docs.adaptivecomputing.com/torque/4-1-4/help.htm#topics/4-serverPolicies/queueConfig.htm
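As a concrete illustration of step 1, a nodes file might look like the following (the hostnames, core counts, and property names are hypothetical):

```
# /var/spool/TORQUE/server_priv/nodes (hypothetical example)
# Format: <hostname> np=<cores> <property1> <property2> ...
eureka32 np=16 queue_name1 queue_name2
eureka33 np=16 queue_name1
```

Each property listed after np= can then be matched by a queue's resources_default.neednodes attribute, restricting that queue to the nodes carrying the property.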

Creating a New Queue

An example of adding a queue named titanxq:

qmgr -c 'create queue titanxq'
qmgr -c 'set queue titanxq queue_type = execution'
qmgr -c 'set queue titanxq started = true'
qmgr -c 'set queue titanxq enabled = true'
qmgr -c 'set queue titanxq resources_default.walltime = 1:00:00'
qmgr -c 'set queue titanxq resources_default.nodes = 1'
qmgr -c 'set queue titanxq resources_default.neednodes = titanxq'
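After running the commands above, the new queue can be inspected and exercised; a sketch (the job script name test.job is hypothetical):

```
# Print the full configuration of the new queue
qmgr -c 'print queue titanxq'

# Submit a job explicitly to the new queue (test.job is a hypothetical script)
qsub -q titanxq test.job
```

Note that resources_default.neednodes = titanxq only takes effect if titanxq is also listed as a property of some nodes in server_priv/nodes, as described in the previous section.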

Deleting an Existing Queue

qmgr -c 'delete queue testq'
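Before deleting a queue that may still hold jobs, it is common practice to first disable and stop it so that no new jobs enter and queued jobs can drain; a sketch:

```
# Stop accepting new submissions into the queue
qmgr -c 'set queue testq enabled = false'

# Stop scheduling jobs from the queue
qmgr -c 'set queue testq started = false'

# Once the queue is empty, delete it
qmgr -c 'delete queue testq'
```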

Reading Submission History

Job submission history is recorded in the server logs under /var/spool/TORQUE/server_logs
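TORQUE server log records are semicolon-delimited (timestamp, event code, source, object type, object name, message). A minimal Python sketch for pulling such records apart when scanning the server logs for submissions; the sample entry below is hypothetical:

```python
# Parse a TORQUE server log line of the form:
#   date time;event_code;source;object_type;object_name;message
def parse_log_line(line):
    fields = line.rstrip("\n").split(";", 5)
    if len(fields) < 6:
        return None  # not a standard six-field record
    timestamp, event, source, obj_type, obj_name, message = fields
    return {
        "timestamp": timestamp,
        "event": event,
        "source": source,
        "type": obj_type,
        "name": obj_name,
        "message": message,
    }

# Hypothetical sample entry for a job being enqueued into titanxq
sample = "03/20/2024 10:15:01;0100;PBS_Server;Job;123.eureka00;enqueuing into titanxq, state 1 hop 1"
record = parse_log_line(sample)
print(record["name"], "->", record["message"])
```

On a live system, the tracejob utility shipped with TORQUE performs a similar search across the server, scheduler, and MOM logs for a given job ID.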

Links