Queuing System - calab-ntu/gpu-cluster GitHub Wiki
Mapping Queues to Nodes
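- Edit /var/spool/TORQUE/server_priv/nodes to tag each node in the target subset with the names of the queues it should serve (TORQUE treats the trailing words on a node line as node properties). For example:
eureka32 np=16 queue_name1 queue_name2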
- Show available queues:
qstat -Qf
- Restart PBS to apply the change:
systemctl restart pbs
(or "service pbs restart" on an old system). It is OK to restart while there are jobs in existing queues. The restart takes a few minutes to complete.
- Ref:
http://www.nordugrid.org/documents/pbs-config.html
http://docs.adaptivecomputing.com/torque/4-1-4/help.htm#topics/4-serverPolicies/queueConfig.htm
Creating a New Queue
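An example that adds a queue named titanxq. The final neednodes setting restricts the queue's jobs to nodes tagged titanxq in the nodes file: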
qmgr -c 'create queue titanxq'
qmgr -c 'set queue titanxq queue_type = execution'
qmgr -c 'set queue titanxq started = true'
qmgr -c 'set queue titanxq enabled = true'
qmgr -c 'set queue titanxq resources_default.walltime = 1:00:00'
qmgr -c 'set queue titanxq resources_default.nodes = 1'
qmgr -c 'set queue titanxq resources_default.neednodes = titanxq'
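The same seven commands can be generated for any queue name. The sketch below is an illustration only: print_queue_setup is a hypothetical helper that prints the qmgr commands instead of running them, so they can be reviewed first. It assumes, as in the titanxq example, that the node property has the same name as the queue.

```shell
#!/bin/sh
# print_queue_setup QUEUE [WALLTIME]  (hypothetical helper, for illustration)
# Prints the qmgr commands that create QUEUE with the same settings as the
# titanxq example. WALLTIME defaults to 1:00:00.
print_queue_setup() {
    queue=$1
    walltime=${2:-1:00:00}
    for directive in \
        "create queue $queue" \
        "set queue $queue queue_type = execution" \
        "set queue $queue started = true" \
        "set queue $queue enabled = true" \
        "set queue $queue resources_default.walltime = $walltime" \
        "set queue $queue resources_default.nodes = 1" \
        "set queue $queue resources_default.neednodes = $queue"
    do
        printf "qmgr -c '%s'\n" "$directive"
    done
}

# Review the output, then pipe it to a shell on the PBS server to apply it:
#   print_queue_setup titanxq
#   print_queue_setup titanxq | sh
# Jobs can then be submitted with "qsub -q titanxq job.sh"
# (job.sh is a placeholder script name).
```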
Deleting an Existing Queue
qmgr -c 'delete queue testq'
Reading Submit History
Job submissions are recorded in the TORQUE server logs under:
/var/spool/TORQUE/server_logs
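To pull one job's records out of these logs, something like the following may help. It is a sketch under an assumption that should be checked against the actual files: that each log record is a semicolon-separated line with the record type ("Job") in the fourth field and the job ID in the fifth. job_history is a hypothetical helper name.

```shell
#!/bin/sh
# job_history LOGDIR JOBID  (hypothetical helper, for illustration)
# Prints every "Job" record for JOBID found in the files under LOGDIR.
# JOBID may be the full ID (e.g. 123.servername) or just the number (123).
# ASSUMPTION: records look like
#   date time;code;source;Job;jobid;message
job_history() {
    logdir=$1
    jobid=$2
    awk -F';' -v id="$jobid" '
        $4 == "Job" && ($5 == id || index($5, id ".") == 1)
    ' "$logdir"/*
}

# Example:
#   job_history /var/spool/TORQUE/server_logs 123
```

Matching on the whole field rather than a plain grep avoids job 123 also matching job 1234.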