Running simulations on the Purdue Anvil - chunshen1987/iEBE-MUSIC GitHub Wiki

This page explains the setup of the code package on the Anvil Cluster at Purdue University.

The Anvil Cluster provides a $PROJECT folder for our research group. The first-time setup only needs to be performed once, by the group administrator, for all users, so the details are not listed here.

Running Simulations

To generate and run a batch of simulations, you need to start from the group $PROJECT directory,

cd $PROJECT/iEBE-MUSIC/
./generate_singularity_jobs.py -w $SCRATCH/[runDirectory] -c Anvil --node_type wholenode -n 32 -n_hydro 1 -n_th 4 -par [parameterFile.py] -singularity $PROJECT/singularity_repos/iebe-music_dev.sif -b [bayesParamFile]

Here [runDirectory] needs to be replaced with a descriptive name for the run, e.g., information about the collision system. The file [parameterFile.py] needs to be replaced by the actual parameter file. You can find example parameter files in the config/ folder. All the model parameters are listed in the config/parameters_dict_master.py file. The training parameters for the Bayesian emulator can be specified using the -b option with a parameter file [bayesParamFile].
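As a rough illustration of what a user parameter file looks like, the sketch below defines Python dictionaries of model parameters. All key names and values here are placeholders, not the real parameter set; the authoritative list of keys lives in config/parameters_dict_master.py and the examples in config/.

```python
# parameters_dict_user.py -- hypothetical sketch only.
# Real keys are defined in config/parameters_dict_master.py;
# every name and value below is illustrative.

# overall workflow controls (illustrative)
control_dict = {
    'initial_state_type': "3DMCGlauber_dynamical",  # placeholder value
    'walltime': "10:00:00",                          # requested walltime
}

# hydrodynamics-stage parameters (illustrative)
music_dict = {
    'Initial_time_tau_0': 0.4,   # hydro starting time (fm/c), placeholder
    'Shear_to_S_ratio': 0.12,    # specific shear viscosity, placeholder
}
```

The script merges such a user dictionary with the master defaults, so only the parameters you want to change need to appear in the file.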

The option -n specifies the number of jobs to run. On Anvil, we recommend setting the number of hydro events to simulate per job, -n_hydro, to 1 so that the shortest walltime can be requested for each batch of jobs. With -n_hydro 1, the option -n effectively sets the total number of events to simulate in the batch; for example, -n 100 -n_hydro 1 simulates 100 events in total.

On Anvil, the available choices for --node_type are wholenode, wide, and shared. Every node contains 128 AMD CPU cores. We recommend using wholenode with the number of OpenMP threads set to -n_th 4.
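The recommended settings above fill a whole node exactly, which the arithmetic below makes explicit (core and thread counts taken from this page; the variable names are just for illustration):

```shell
# One Anvil wholenode provides 128 CPU cores. With -n_th 4 OpenMP
# threads per job, 128 / 4 = 32 jobs run concurrently on one node,
# which is why the example command pairs -n 32 with -n_th 4.
CORES_PER_NODE=128
THREADS_PER_JOB=4
JOBS_PER_NODE=$((CORES_PER_NODE / THREADS_PER_JOB))
echo "$JOBS_PER_NODE jobs fit on one wholenode"
```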

The full help message can be viewed with ./generate_singularity_jobs.py -h:

usage: generate_singularity_jobs.py [-h] [-w] [-c] [--node_type] [-n]
                                    [-n_hydro] [-n_th] [-par] [-singularity]
                                    [-exe] [-b] [-seed]

⚛ Welcome to iEBE-MUSIC package

  -h, --help            show this help message and exit
  -w , --working_folder_name 
                        working folder path (default: playground)
  -c , --cluster_name   name of the cluster (default: local)
  --node_type           name of the queue (work on stampede2) (default: SKX)
  -n , --n_jobs         number of jobs (default: 1)
  -n_hydro , --n_hydro_per_job 
                        number of hydro events per job to run (default: 1)
  -n_th , --n_threads   number of threads used for each job (default: 1)
  -par , --par_dict     user-defined parameter dictionary file (default:
                        parameters_dict_user.py)
  -singularity , --singularity 
                        path of the singularity image (default: iebe-
                        music_latest.sif)
  -exe , --executeScript 
                        job running script (default:
                        Cluster_supports/WSUgrid/run_singularity.sh)
  -b , --bayes_file     parameters from bayesian analysis (default: )
  -seed , --random_seed Random Seed (-1: according to system time) (default: -1)

After running this script, the working directory $SCRATCH/[runDirectory] will be created. You can then submit the jobs with

cd $SCRATCH/[runDirectory]
sbatch submit_MPI_jobs.script

Make sure you are on the login node to submit jobs.
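The generated submit_MPI_jobs.script is a standard SLURM batch script. A hypothetical skeleton is shown below purely for orientation; the actual file generated on Anvil will differ, and every value here is a placeholder:

```shell
#!/bin/bash
# Illustrative SLURM header only -- NOT the actual generated script.
#SBATCH --job-name=iEBE-MUSIC       # placeholder job name
#SBATCH --partition=wholenode       # matches the --node_type chosen above
#SBATCH --nodes=1                   # placeholder node count
#SBATCH --time=04:00:00             # placeholder walltime
# ... the real script then launches the singularity-based simulation jobs ...
```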

While the jobs are running, you can run squeue -u $USER to check their progress.

After the simulations finish, the final results will be copied automatically from $SCRATCH/[runDirectory] to $PROJECT/RESULTS/[runDirectory].