Slurm and software licenses
Some of the commercial software that is used on the ARCTS clusters has
limits on how many instances can be run at one time and by whom. Most
of those packages use the FLEXlm license manager, which runs on
flux-license1.it.miserver.umich.edu with some CNAMEs defined. If
there are n licenses, then the (n+1)th person will get an error saying a
license could not be checked out. Since that number is known ahead of
time, Slurm provides a capability to set that limit and to track jobs
that request licenses, so that a job that would need the (n+1)th
license is held until a running job with a license finishes and a
license becomes available again.
We also have software whose concurrent use we limit to n instances, either by contract with the vendor or by agreement with ITS. We will call the former Metered usage software and the latter Unmetered usage software. Software limits are imposed per cluster, so separate limits are maintained for each of GL, A2, and LH. The majority of nodes in LH have opted out of commercial software use, so virtually no licenses are defined there; that is a point of some contention within ARCTS and may require negotiation with COE about how to handle those licenses.
This page attempts to explain how the limits get set for a cluster, how changes get made, and who does what.
Unmetered usage software is the simpler case and is a proper subset of Metered usage software, so we will look at it first. As an example, we will use SAS. ARCTS pays ITS for 20 SAS licenses per year, and we allocate half to GL and half to A2.
We tell users to request SAS licenses, and we tell Slurm how many are available for each cluster. This is done in their submission script with
#SBATCH --licenses=sas@slurmdb:1
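For context, a minimal submission script that requests one SAS license might look like the following sketch; the job name, time limit, module name, and program file are placeholders, not taken from our documentation:

```shell
#!/bin/bash
#SBATCH --job-name=sas-example       # placeholder job name
#SBATCH --time=01:00:00              # placeholder time limit
#SBATCH --licenses=sas@slurmdb:1     # hold one SAS license for this job

# Placeholder commands; the actual module and invocation may differ.
module load sas
sas myprogram.sas
```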
The usage limit for this software is set once in a file, and the limit
does not change. We trust that Slurm will track how many licenses are in
use, how many are waiting, and how many have been used in prior jobs
(perhaps; this is currently speculation). At this time, we do not know
how those initial counts are actually added to Slurm, but we trust the
Systems group to get it right. The limits are currently set in Ansible,
in the file $ANSIBLE_ROOT/group_vars/all/licenses. Once Slurm starts,
the limits are stored in its database and are modified with
sacctmgr -i modify resource name=$name set count=$total
where $name is the license name as registered with Slurm and $total is
the new upper limit. To display a limit, use
$ scontrol show licenses | grep -A1 $name
LicenseName=sas@slurmdb
Total=10 Used=0 Free=10 Remote=yes
To change a limit, the software team should submit a ticket via
TeamDynamix to the Systems group requesting that the upper limit be
modified in the $ANSIBLE_ROOT/group_vars/all/licenses file and that an
sacctmgr command be issued to adjust the limit in the running Slurm.
Once the change has been made, the command above should be used to
verify the change in the running Slurm, and the requester should verify
that the new limit has been committed to the
/etc/ansible/group_vars/all/licenses file in the Git master branch
on flux-admin09.
Examples of Unmetered usage software are SAS, Stata/SE, Stata/MP, AMPL, and Gurobi.
For Metered usage software, the following description applies only in theory. At present, the arithmetic used to calculate adjustments to the license count is incorrect and lets Slurm start jobs for which no licenses are available. We have therefore, for select packages, switched to using the Unmetered software capability to prevent that. The main example is Abaqus, which has a FLEXlm limit of 375 tokens.
[root@glctld ~]# date
Sat Jun 20 08:46:23 EDT 2020
[root@glctld ~]# scontrol show licenses | grep -A 1 abaqus
LicenseName=abaqus@slurmdb
Total=405 Used=124 Free=281 Remote=yes
The theory of how this should work is that, when a new Slurm cluster is
first started, something in one of the Ansible roles looks at the
variables in /etc/ansible/group_vars/all/licenses and, for each one,
issues the sacctmgr command shown above to initialize the upper limit,
the number currently in use, and the number of available licenses for
each package.
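Since we do not know how that initialization actually happens, the following is only a guessed sketch. It assumes, hypothetically, that the licenses file reduces to whitespace-separated name/count pairs; the pairs are inlined in a here-document, and the sacctmgr commands are echoed rather than executed:

```shell
#!/bin/sh
# HYPOTHETICAL sketch of the initialization loop; the real Ansible role and
# the real format of $ANSIBLE_ROOT/group_vars/all/licenses are not known.
# For each license name/count pair, emit the sacctmgr command that would
# set the upper limit in the Slurm database.
while read -r name count; do
    [ -z "$name" ] && continue    # skip blank lines
    echo "sacctmgr -i modify resource name=$name set count=$count"
done <<'EOF'
sas@slurmdb 10
abaqus@slurmdb 375
EOF
```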
Once Slurm is started, each Slurm controller node runs from cron
/etc/cron.d/create_gres (the name is a holdover from Torque), which
contains
#Ansible: create_gres
*/5 * * * * root /opt/slurm/licenses/bin/create_gres
and the create_gres script creates the file /opt/slurm/licenses/license.txt, which has lines that look like
GLOBAL UPDATETIME=1592658302 STATE=idle ARES=abaqus:281 CRES=abaqus:375
That shows that the 'configured reservation' (CRES) for Abaqus licenses is 375 and that, of those, 281 (ARES) are available for use, meaning that 375 - 281 = 94 are in use, according to the licenses checked out on the license server.
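That in-use figure can be recovered from such a line with simple arithmetic on the ARES (available) and CRES (configured) fields. A small illustrative snippet, using the line above as sample input:

```shell
#!/bin/sh
# Parse the ARES/CRES fields from a license.txt line and compute how many
# licenses the license server reports as checked out (configured - available).
line='GLOBAL UPDATETIME=1592658302 STATE=idle ARES=abaqus:281 CRES=abaqus:375'
avail=$(echo "$line" | grep -o 'ARES=[^ ]*' | cut -d: -f2)   # 281
total=$(echo "$line" | grep -o 'CRES=[^ ]*' | cut -d: -f2)   # 375
echo "abaqus in use: $((total - avail))"   # prints "abaqus in use: 94"
```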
Another script runs from /etc/cron.d/license_update (it should probably
be scheduled at a slightly different time than create_gres), which
contains
#Ansible: license_update
*/5 * * * * root /opt/slurm/scripts/license_update.sh
The license_update.sh script attempts to do some magic arithmetic to
adjust the total number of licenses available for jobs to use, the
number that are in use by running jobs, and the total number of licenses
configured. It currently does that incorrectly, as shown above.
To make a change to these, the only option is to use the sacctmgr
command to change the limits. The limits will, however, be adjusted
again as soon as the cron scripts next run, within five minutes.
It would be nice if the script that does the adjustment arithmetic could be made accurate, as that would potentially enable better control of jobs wanting licenses. The main case to account for is a user who runs a package from a job that did not request the license from Slurm: Slurm then underestimates the number of licenses in use and starts a job that does request a license but for which none are currently available.
I believe that there are three cases that the license_update.sh script must handle correctly.
- The number that Slurm thinks are in use exactly matches what the license.txt file says.
- The number that Slurm thinks are in use is greater than the number in the license.txt file.
- The number that Slurm thinks are in use is less than the number in the license.txt file.
That is work that remains to be done. It is not clear who should do that work, nor what the exact specifications of the work are.
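One possible interpretation of those three cases, expressed as shell arithmetic, is sketched below. This is only a guess at what a corrected license_update.sh might compute, not its actual logic; the variable names are ours:

```shell
#!/bin/sh
# HYPOTHETICAL sketch of the adjustment a corrected license_update.sh might
# make.  configured = FLEXlm limit; flex_used = licenses FLEXlm reports as
# checked out; slurm_used = licenses Slurm has allocated to running jobs.
adjust_total() {
    configured=$1; flex_used=$2; slurm_used=$3
    # Licenses in use by jobs that never asked Slurm for them:
    external=$((flex_used - slurm_used))
    if [ "$external" -lt 0 ]; then
        # Slurm thinks more are in use than FLEXlm shows (e.g. the Abaqus
        # explicit solver returned its tokens mid-job); keep the full total
        # so those tokens stay reserved for the job that requested them.
        external=0
    fi
    echo $((configured - external))
}
adjust_total 375 94 94     # case 1: counts match          -> 375
adjust_total 375 94 124    # case 2: Slurm sees more used  -> 375
adjust_total 375 150 94    # case 3: FLEXlm sees more used -> 319
```

The key design point in this reading is that the total should only ever be reduced (for use outside Slurm), never inflated above the FLEXlm limit.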
Abaqus allows users to create and use an 'explicit' solver. To start one, the user must check out Abaqus tokens; however, once the solver is running, the abaqus process runs a Python script that spins off external processes, and the license server stops holding those licenses.
So, for example, the user sylinae has
sylinae 150652 150635 0 00:35 ? 00:00:00 /bin/bash /sw/arc/centos7/abaqus/2018/Commands/abaqus job=ISRAEL_L2_6X4_24PLY_25J_24S_3DPLA_sigIIC55_sigIIC72_GIIC0.607_noIni23_fiberBogdanor_easyFix input=ISRAEL_L2_6X4_24PLY_25J_24S_3DPLA_sigIIC55_sigIIC72_GIIC0.607_noIni23_fiberBogdanor interactive user=/nfs/turbo/sylinae1/SHIYAO/ClassicalPla/easyFix/vusca3d_shiyao.for scratch=/nfs/turbo/sylinae1/SHIYAO/ClassicalPla/easyFix double=both
There are 72 processes running on two nodes. The job requested 30 Abaqus
tokens, but because it has moved into the explicit_dp solver, those
licenses were returned. They may be needed again once the explicit
solver completes and control returns to the abaqus process shown above.
Because a running job requested 30 licenses from Slurm, but those 30 are not shown as in use by FLEXlm, the license total is inflated by that amount.
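That presumably explains the Total=405 in the scontrol output shown earlier: the configured FLEXlm limit plus the tokens held by the running job.

```shell
#!/bin/sh
# Reproduce the inflated total shown in the scontrol output above.
# The FLEXlm limit is 375 tokens, and the running job holds 30 Slurm
# licenses that FLEXlm no longer counts as checked out.
flexlm_limit=375
tokens_held_by_job=30
echo "inflated total: $((flexlm_limit + tokens_held_by_job))"   # 405
```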
The license_update.sh script appears to assume that a job will start the
software, check out the license from FLEXlm, and conclude when the
license is returned. It does not seem to handle the situation in which a
job checks out and returns a license during its run but may still need
those licenses later in the run.