ARC clusters for python - ai-se/admin GitHub Wiki
This article describes my workflow on the ARC system. Contact [email protected]
Account Preparation
- Request access; see https://arcb.csc.ncsu.edu/~mueller/cluster/arc/
- The Cisco VPN is required when you are off campus. https://oit.ncsu.edu/campus-it/campus-data-network/vpn/
- Download the Anaconda/Miniconda installer script from https://www.anaconda.com/distribution/#download-section OR https://docs.conda.io/en/latest/miniconda.html and copy it to ARC
scp xx.sh [email protected]:/home/UNITY_ID/
- After logging in, install Python (for Python programs)
srun --pty /bin/bash # get 16 cores (1 node) in interactive mode
sh xxxcondaxxx.sh # the installer prompts look like the following
...
Do you accept the license terms? [yes|no]
[no] >>> yes
Miniconda3 will now be installed into this location:
/home/jchen37/miniconda3
- Press ENTER to confirm the location
- Press CTRL-C to abort the installation
- Or specify a different location below
[/home/jchen37/miniconda3] >>> /home/jchen37/python3
- Test Python by running
python3/bin/python3
You should see
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
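The program in the next section writes its results with an f-string, which needs Python 3.6 or newer. As a minimal sketch, the check below could be run with the freshly installed interpreter to confirm this; the helper name `supports_fstrings` is hypothetical and not part of the original setup.

```python
import sys

def supports_fstrings(version_info=sys.version_info):
    """Return True if the interpreter is new enough for f-strings (3.6+)."""
    # Hypothetical helper: compares only the (major, minor) part of the version.
    return tuple(version_info[:2]) >= (3, 6)

print(sys.executable)        # path of the interpreter that is actually running
print(supports_fstrings())   # whether this interpreter can run the program
```

Running it as `python3/bin/python3 check.py` (the file name is only an example) also confirms that the interpreter in use is the one installed under the home folder rather than a system-wide one.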
The python program
You can make your Python program runnable from the command line, for example:
python main.py -alg WORTHY -model 0 -r 10
import sys
import os
path = os.getcwd()
rootpath = path[:path.rfind('FSSE') + 4]  # FSSE is the folder name of your program
sys.path.append(rootpath)
if __name__ == '__main__':
    # Parse sys.argv. You can customize the parameter names, etc.
    # For convenient debugging, you can set default parameters.
    alg = 'WORTHY'
    model_id = 0
    repeat = 1
    for i, v in enumerate(sys.argv):
        if v == '-alg':
            alg = sys.argv[i + 1].upper()
        if v == '-model':
            model_id = int(sys.argv[i + 1])
        if v == '-r':
            repeat = int(sys.argv[i + 1])
    ...  # rest of the program
    # write out the results
    with open(f'{rootpath}/results/{model.name}.{alg}.res', 'a+') as f:
        f.write('##\n')
        ...
    sys.exit(0)
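The hand-written loop over sys.argv above could also be expressed with argparse from the standard library. This is only a sketch of an alternative, not the code the program actually uses; the function name `parse_args` and the attribute names `model_id` and `repeat` are chosen here to mirror the variables in the snippet.

```python
import argparse

def parse_args(argv=None):
    """Parse the same -alg / -model / -r options as the manual loop."""
    parser = argparse.ArgumentParser(description="Run one experiment")
    # Defaults mirror the ones in the original snippet.
    parser.add_argument("-alg", type=str.upper, default="WORTHY")
    parser.add_argument("-model", dest="model_id", type=int, default=0)
    parser.add_argument("-r", dest="repeat", type=int, default=1)
    return parser.parse_args(argv)

# Passing an explicit list makes the example independent of the real command line.
args = parse_args(["-alg", "worthy", "-model", "3", "-r", "10"])
print(args.alg, args.model_id, args.repeat)  # WORTHY 3 10
```

One practical difference is that argparse reports a missing or non-integer value with a usage message, whereas the manual loop would raise an IndexError or ValueError.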
Deployment on ARC
- Copy program from local machine to ARC
scp -r xxx [email protected]:/home/unity_id
- Make sure all required packages are installed on ARC:
/home/unity_id/python3/bin/pip install xxx
- Create a working folder with subfolders for the job logs
mkdir arc
cd arc
mkdir out err
- In the arc folder, create a batch file yyy.batch as
#!/bin/bash
#
#SBATCH --job-name=run_WORT
#SBATCH --ntasks=1
#SBATCH --time=01:30:00
#SBATCH --error=err/%j.err
#SBATCH --output=out/%j.out
/home/unity_id/python3/bin/python3 /path/to/main.py -alg worthy -model $mid -r 10
Note: $mid is not defined in the batch file; it is set outside, as an environment variable exported before sbatch is called.
- Create ignition.sh
for mid in {0..6}; do export mid; sbatch yyy.batch; done
for mid in {0..6}; do export mid; sbatch yyy.batch; done
In this example, the worthy algorithm is run on models 0-6 for 20 repeats per model: each of the two loops submits one task per model, and each task performs 10 repeats.
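The same submission pattern can be sketched in Python for cases where the list of models or rounds is computed rather than fixed. This is an illustration only: the function name `submit_all` and its parameters are hypothetical, and it assumes, as ignition.sh does, that sbatch passes the caller's environment (including mid) on to the job.

```python
import os
import subprocess

def submit_all(batch_file="yyy.batch", model_ids=range(7), rounds=2, dry_run=True):
    """Submit `rounds` jobs per model id, passing the id to the job as $mid."""
    submitted = []
    for _ in range(rounds):
        for mid in model_ids:
            cmd = ["sbatch", batch_file]
            if not dry_run:
                # Copy the current environment and add mid, like `export mid`.
                env = dict(os.environ, mid=str(mid))
                subprocess.run(cmd, env=env, check=True)
            submitted.append((mid, cmd))
    return submitted

jobs = submit_all()   # dry run: nothing is sent to Slurm
print(len(jobs))      # 14 = 7 models x 2 rounds
```

With `dry_run=True` the function only returns what it would submit, which makes it possible to check the plan on a machine without Slurm before calling `submit_all(dry_run=False)` on ARC.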
- Submit the jobs by running
sh ignition.sh
- Monitoring
squeue
- Cancel
scancel ###JOB_ID
- Sample squeue output
 JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
117339    normal run_WORT  jchen37  R  5:25     1 c99
117340    normal run_WORT  jchen37  R  5:25     1 c106
117341    normal run_WORT  jchen37  R  5:25     1 c80
117342    normal run_WORT  jchen37  R  5:25     1 c81
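When many jobs are in the queue, it can help to turn the squeue listing above into records, for example to collect the IDs of running jobs before calling scancel. The sketch below parses the sample output shown in this article; the function name `parse_squeue` is hypothetical, and it assumes the default whitespace-separated column layout.

```python
SAMPLE = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
117339 normal run_WORT jchen37 R 5:25 1 c99
117340 normal run_WORT jchen37 R 5:25 1 c106
117341 normal run_WORT jchen37 R 5:25 1 c80
117342 normal run_WORT jchen37 R 5:25 1 c81
"""

def parse_squeue(text):
    """Turn squeue's table into a list of dicts keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    # Limiting the split keeps a reason that contains spaces in the last column.
    return [dict(zip(header, ln.split(None, len(header) - 1))) for ln in lines[1:]]

jobs = parse_squeue(SAMPLE)
running = [j["JOBID"] for j in jobs if j["ST"] == "R"]
print(running)  # ['117339', '117340', '117341', '117342']
```

On ARC the text would come from the real squeue command instead of the SAMPLE string, for instance captured through the subprocess module.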