Submit jobs to the grid - E1039-Collaboration/e1039-wiki GitHub Wiki
How Grid Works
Page 9 of docDB#5509
Environment Setup
- Log on to a submitter node, e.g. spinquestgpvm01.fnal.gov or gpvm02. The following steps won't work on other computers.
- Source the setup script as follows. It sets up the shell variables and commands needed for job submission. There are many similar scripts for this purpose, but this one is our officially maintained version.
source /exp/seaquest/app/software/script/setup-jobsub-spinquest.sh
- You can add the following line to your ".bashrc" in order to auto-source the setup script.
test ${HOSTNAME:0:13} = 'spinquestgpvm' && source /exp/seaquest/app/software/script/setup-jobsub-spinquest.sh
- To select the SeaQuest jobsub environment instead, source "setup-jobsub-seaquest.sh" in the same directory.
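The ".bashrc" guard above can be written a little more defensively. The following is only a minimal sketch, not the officially maintained setup: the function name is_submitter_node and the variable SETUP_SCRIPT are hypothetical, and the extra readability check is an addition that the original one-liner does not perform.

```shell
# Hypothetical helper: returns 0 if the given host name starts with
# "spinquestgpvm", mirroring the ${HOSTNAME:0:13} test used above.
is_submitter_node() {
    case "$1" in
        spinquestgpvm*) return 0 ;;
        *)              return 1 ;;
    esac
}

# Path taken from this page; the variable name is an assumption.
SETUP_SCRIPT=/exp/seaquest/app/software/script/setup-jobsub-spinquest.sh

# Source the setup script only on a submitter node and only if it is readable,
# so that logging in elsewhere does not print an error.
if is_submitter_node "${HOSTNAME:-}" && [ -r "$SETUP_SCRIPT" ]; then
    source "$SETUP_SCRIPT"
fi
```

On any other host the block does nothing, so it should be safe to keep in a ".bashrc" shared between machines.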
Basic Procedure
You can learn how to use the Grid by working through the following procedure.
- Set up the environment as described above.
- Get a Kerberos ticket
kinit <type your pwd>
- Modify a job-submission script based on your needs.
  - An example is e1039-analysis/SimChainDev/gridsub.sh.
  - Scripts with the same name under e1039-analysis work similarly.
- Test run the job locally
./gridsub.sh test-10k 0 2 10000
- Test-run a small batch of jobs on the grid
./gridsub.sh test-10k 1 2 10000
  - You will be asked to copy and paste an authentication URL to your web browser.
  - You can run jobsub_q --user=$USER (or its alias jobsub_q_mine) to check the status of your jobs.
- Run your jobs
  - Adjust the command-line arguments of gridsub.sh as you need.
  - You can use FIFE monitor to check the job status.
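The staged procedure above (local test first, then a small batch on the grid) can be sketched as two small shell helpers. This is an illustrative sketch under stated assumptions: the function names have_kerberos_ticket and gridsub_command are hypothetical, the mapping of the second gridsub.sh argument (0 for the local run, 1 for the grid run) is inferred from the two example commands above, and any further arguments are passed through unchanged because this page does not document their meaning.

```shell
# Returns 0 if a valid (non-expired) Kerberos ticket is in the credential cache.
# "klist -s" prints nothing and reports the result through its exit status.
have_kerberos_ticket() {
    command -v klist >/dev/null 2>&1 && klist -s
}

# Prints the gridsub.sh command line for one stage of the procedure.
#   $1 = stage ("local" or "grid"), $2 = job name, rest = passed through as-is.
# Assumption: the second gridsub.sh argument is 0 for local and 1 for grid,
# as suggested by the two examples on this page.
gridsub_command() {
    local stage=$1 name=$2 flag
    shift 2
    case "$stage" in
        local) flag=0 ;;
        grid)  flag=1 ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    echo "./gridsub.sh $name $flag $*"
}

# Example (on a submitter node, in the directory that contains gridsub.sh):
#   have_kerberos_ticket || kinit
#   $(gridsub_command local test-10k 2 10000) && $(gridsub_command grid test-10k 2 10000)
```

Chaining the two stages with "&&" means the grid batch is only submitted if the local test exits successfully, which is the point of testing locally first.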
Write Permission
- A normal E1039 user can
  - Write output files only under /pnfs/e1039/scratch/users, and
  - Read all files under /pnfs/e1039 and /pnfs/e906.
- Only granted users can write output files to any directory under /pnfs/e1039.
  - Mainly for data production.
  - The granted users (i.e. those assigned the Production role) can be found at this Fifemon page. If the link does not work, you can manually
    - Visit Fifemon,
    - Search for User Info to open that dashboard, and
    - Set Experiments to spinquest and Role to Production.
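Because a job that writes outside the permitted area will only fail once it runs, it can help to check the output directory before submitting. The following is a minimal sketch, assuming a normal (non-Production) user; the function name is_user_writable_path is hypothetical, and the check is purely textual: it does not resolve symbolic links or ".." components, and it does not query the actual dCache permissions.

```shell
# Hypothetical pre-submission check for a normal E1039 user.
# Returns 0 if the path lies under the scratch area named on this page.
is_user_writable_path() {
    case "$1" in
        /pnfs/e1039/scratch/users/*) return 0 ;;
        *)                           return 1 ;;
    esac
}

# Example: refuse to continue when the output directory is elsewhere.
# OUT_DIR is a placeholder variable, not something gridsub.sh is known to use.
#   is_user_writable_path "$OUT_DIR" || { echo "cannot write to $OUT_DIR" >&2; exit 1; }
```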
Notes
- The back-end system for job submission at Fermilab was changed to jobsub_lite in Feb. 2023. You can find details in DocDB 10460.
Tips
How to exclude bad OSG nodes
Use the "--append_condor_requirements" option of "jobsub_submit" as follows:
cmd="$cmd --append_condor_requirements='(TARGET.GLIDEIN_Site isnt \"UCSD\")'"
This option is in fact already used by default in "e1039-analysis/SimChainDev/gridsub.sh". Valid site names are listed in this Wiki page. Note that the "--blacklist" option has a known defect according to the Fermilab Service Desk as of August 2019.
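To exclude more than one site, the same kind of condition can be repeated and joined with the ClassAd "&&" operator. The helper below is a sketch of that idea, not code taken from gridsub.sh: the function name build_site_exclusion is hypothetical, "SITE_B" is a placeholder rather than a real site name, and it is an assumption that jobsub passes the combined expression through unchanged.

```shell
# Builds a ClassAd expression that rejects every site given as an argument, e.g.
#   (TARGET.GLIDEIN_Site isnt "UCSD") && (TARGET.GLIDEIN_Site isnt "SITE_B")
build_site_exclusion() {
    local expr="" site
    for site in "$@"; do
        # Join successive conditions with a logical AND.
        if [ -n "$expr" ]; then
            expr="$expr && "
        fi
        expr="${expr}(TARGET.GLIDEIN_Site isnt \"$site\")"
    done
    echo "$expr"
}

# Example, following the form of the command shown above:
#   cmd="$cmd --append_condor_requirements='$(build_site_exclusion UCSD SITE_B)'"
```

With a single argument the helper produces the same expression as the hand-written line above, so it can replace that line without changing behavior for the one-site case.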