slurm and software licenses - raeker/ARC-Wiki-Test GitHub Wiki

Slurm and software licenses

Some of the commercial software that is used on the ARCTS clusters has limits on how many instances can be run at one time and by whom.  Most of those packages use the FLEXlm license manager, which runs on flux-license1.it.miserver.umich.edu with some cnames defined.  If there are n licenses, then the (n+1)th person will get an error saying a license could not be checked out.  Since that number is known ahead of time, Slurm provides a capability to set that limit and track jobs that request licenses, so that a job that would require the (n+1)th license is held until a running job with a license finishes and a license again becomes available.

We also have some software that we agree, either contractually with the vendor or with ITS, to limit to n concurrent uses.  We will call software whose usage is tracked by a license server Metered usage software, and software limited only by such an agreement Unmetered usage software.  Software limits are imposed per cluster, so separate limits are maintained for each of GL, A2, and LH.  The majority of nodes in LH have opted out of commercial software use, so virtually no licenses are defined there.  That is an issue of some contention within ARCTS and may require some negotiation with COE about how to handle those licenses.

This page attempts to explain how the limits get set for a cluster, how changes get made, and who does what.

Unmetered usage software

This is the simpler case, and its handling is a proper subset of what Metered usage software requires, so we will look at it first.  As an example, we will use SAS.  ARCTS pays ITS for 20 SAS licenses per year, and we allocate half to GL and half to A2.

We tell users to request SAS licenses, and we tell Slurm how many are available for each cluster.  This is done in their submission script with

#SBATCH --licenses=sas@slurmdb:1
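For context, a complete submission script requesting one SAS license might look like the following sketch; the account name, time limit, and module name here are placeholders, not values from any real configuration:

```shell
#!/bin/bash
#SBATCH --job-name=sas-example
#SBATCH --account=example0        # placeholder account
#SBATCH --time=01:00:00
#SBATCH --licenses=sas@slurmdb:1  # hold one SAS license for this job

# Load SAS and run; the module name is an assumption.
module load sas
sas myprogram.sas
```

Slurm will not start the job until one of the ten sas@slurmdb licenses on the cluster is free.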

The usage limit for this software is set once in a file, and the limit does not change.  We trust Slurm to track how many licenses are in use, how many are waiting, and how many have been used in prior jobs (perhaps; this is currently speculation).  At this time, we do not know how those initial counts are actually added to Slurm, but we trust the Systems group to get it right.  The limits are currently set in Ansible, in the file $ANSIBLE_ROOT/group_vars/all/licenses.  Once Slurm starts, the limits are stored in its database and are modified with

sacctmgr -i modify resource name=$name set count=$total

where $name is the license name as registered with Slurm and $total is the new upper limit.  To display a limit, use

$ scontrol show licenses | grep -A1 $name
LicenseName=sas@slurmdb
Total=10 Used=0 Free=10 Remote=yes

To change a limit, the software team should submit a TeamDynamix ticket to the Systems group requesting that the upper limit be modified in the $ANSIBLE_ROOT/group_vars/all/licenses file and that an sacctmgr command be issued to adjust the limit in the running Slurm.  Once the change has been made, the command above should be used to verify the change in the active Slurm, and the requester should verify that the new limit has been entered into the /etc/ansible/group_vars/all/licenses file in the Git master branch on flux-admin09.

Examples of Unmetered usage software are:  SAS, Stata/SE, Stata/MP, AMPL, Gurobi.

Metered usage software

This description applies only in theory.  At the current time, the arithmetic being used to calculate the adjustments to the license count is incorrect and allows Slurm to start jobs for which no licenses are available.  For select packages, we have therefore switched to the Unmetered software mechanism to prevent that.  The main example is Abaqus, which has a limit set by FLEXlm of 375 tokens.

[root@glctld ~]# date
Sat Jun 20 08:46:23 EDT 2020

[root@glctld ~]# scontrol show licenses | grep -A 1 abaqus
LicenseName=abaqus@slurmdb
Total=405 Used=124 Free=281 Remote=yes

The theory of how this should work is that, when a new Slurm cluster is first started, one of the Ansible roles reads the variables in /etc/ansible/group_vars/all/licenses and, for each one, issues the sacctmgr command shown above to initialize the upper limit, the currently used count, and the available licenses for each package.
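As a sketch only: if the licenses file contained simple name:count pairs (the actual format is unknown to me), the initialization step could loop over the file and issue one sacctmgr command per entry.  The build_cmd helper below is hypothetical and only constructs the command string; the server name slurmdb matches the license names shown by scontrol above:

```shell
# Hypothetical helper: build the sacctmgr command used to register one
# license with Slurm.  The "name:count" input format is an assumption.
build_cmd() {
  local name=$1 count=$2
  printf 'sacctmgr -i add resource name=%s server=slurmdb type=license count=%s' \
    "$name" "$count"
}

# A role could then loop over the file and run each command:
#   while IFS=: read -r name count; do
#     eval "$(build_cmd "$name" "$count")"
#   done < /etc/ansible/group_vars/all/licenses
```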

Once Slurm is started, each Slurm controller node runs /etc/cron.d/create_gres from cron (the name is a holdover from Torque), which contains

#Ansible: create_gres
*/5 * * * * root /opt/slurm/licenses/bin/create_gres

and the create_gres script creates the file /opt/slurm/licenses/license.txt, which has lines that look like

GLOBAL UPDATETIME=1592658302 STATE=idle ARES=abaqus:281 CRES=abaqus:375

That shows that the 'configured reservation' (CRES) for Abaqus licenses is 375, of which 281 (ARES) are available for use, meaning that 375 - 281 = 94 are in use, according to the checked-out licenses on the license server.
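The in-use count can be recovered from such a line with a small amount of parsing.  The helper below is only an illustration of that arithmetic, not part of the existing scripts:

```shell
# Extract the ARES (available) and CRES (configured) counts from a GLOBAL
# line in license.txt and print how many licenses are checked out (CRES - ARES).
in_use() {
  echo "$1" | awk '{
    for (i = 1; i <= NF; i++) {
      split($i, kv, "=")
      if (kv[1] == "ARES") { split(kv[2], a, ":"); avail = a[2] }
      if (kv[1] == "CRES") { split(kv[2], c, ":"); conf = c[2] }
    }
    print conf - avail
  }'
}

in_use "GLOBAL UPDATETIME=1592658302 STATE=idle ARES=abaqus:281 CRES=abaqus:375"
# prints 94
```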

Another script runs from /etc/cron.d/license_update, which contains the following (it should probably be scheduled at a slightly different time than create_gres):

#Ansible: license_update
*/5 * * * * root /opt/slurm/scripts/license_update.sh

The license_update.sh script attempts to do some magic arithmetic to adjust the total number of licenses available for jobs to use, the number that are in use by running jobs, and the total number of licenses configured.  It currently does that incorrectly, as shown above.

To make a change to these, the only option is to use the sacctmgr command to change the limits.  The limits will be adjusted again as soon as the cron scripts have run.

It would be nice if the script that does the adjustment arithmetic could be made accurate, as that would potentially enable better control of jobs wanting licenses.  The main case to account for is a user running a package from a job that did not request the license from Slurm; Slurm then underestimates the number of licenses in use and starts a job that does request a license but for which none is currently available.

I believe that there are three cases that the license_update.sh script must handle correctly.

  1. The number that Slurm thinks are in use exactly matches what the license.txt file says.
  2. The number that Slurm thinks are in use is greater than the number in the license.txt file.
  3. The number that Slurm thinks are in use is less than the number in the license.txt file.

That is work that remains to be done.  It is not clear who should do that work, nor what the exact specifications of the work are.
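To make the needed specification concrete, here is one hypothetical adjustment rule covering the three cases.  This is a sketch of a possible behavior, not the current license_update.sh logic, and the policy for case 2 (keep the full total, since the tokens may be reclaimed later, as with Abaqus below) is an assumption:

```shell
# Hypothetical rule: given the configured total (CRES), the number FLEXlm
# reports as checked out, and the number Slurm believes its jobs hold,
# print the Total that Slurm's license count should be set to.
adjust_total() {
  local cres=$1 flexlm_used=$2 slurm_used=$3
  if [ "$flexlm_used" -le "$slurm_used" ]; then
    # Cases 1 and 2: everything FLEXlm reports is covered by Slurm jobs
    # (a job may have temporarily returned its tokens); keep the full pool.
    echo "$cres"
  else
    # Case 3: licenses are checked out by processes Slurm does not know
    # about; shrink the pool by the unaccounted-for amount.
    echo $(( cres - (flexlm_used - slurm_used) ))
  fi
}

adjust_total 375 94 94    # case 1: prints 375
adjust_total 375 94 124   # case 2: prints 375
adjust_total 375 124 94   # case 3: prints 345
```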

Note about Abaqus in particular

Abaqus allows users to create and use an 'explicit' solver.  To start one, the user must check out Abaqus tokens; however, once it is started, the abaqus process runs a Python script, which then spins off external processes, and the license server stops holding those licenses.


So, for example, the user sylinae has

sylinae 150652 150635 0 00:35 ? 00:00:00 /bin/bash /sw/arc/centos7/abaqus/2018/Commands/abaqus job=ISRAEL_L2_6X4_24PLY_25J_24S_3DPLA_sigIIC55_sigIIC72_GIIC0.607_noIni23_fiberBogdanor_easyFix input=ISRAEL_L2_6X4_24PLY_25J_24S_3DPLA_sigIIC55_sigIIC72_GIIC0.607_noIni23_fiberBogdanor interactive user=/nfs/turbo/sylinae1/SHIYAO/ClassicalPla/easyFix/vusca3d_shiyao.for scratch=/nfs/turbo/sylinae1/SHIYAO/ClassicalPla/easyFix double=both

There are 72 processes running on two nodes.  The job requested 30 Abaqus tokens, but because it has gone to the explicit_dp solver, those licenses were returned to Abaqus.  They may be needed again once the explicit solver completes and control returns to the abaqus process shown above.

Because there is a running job that requested 30 licenses from Slurm, but those 30 are not shown as in use by FLEXlm, the license total is inflated by that amount (405 rather than 375 in the example above).

The license_update.sh script appears to be assuming that a job will start the software, check out the license from FLEXlm, and when the license is returned, the job will conclude.  It does not seem to handle the situation in which a job checks out and returns a license during its run but may still need those licenses later in the run.
