Tutorial - JoshuaSBrown/Py_Parallel_Wrapper GitHub Wiki
Tutorial
Running Serial Version
As a first example, let's run the test program once it has been compiled:
python Parallel_Wrapper.py --run_file test
This command runs test once. By default, a serial run uses 1 processor and 1 run. If you watch the memory usage while it runs, you will see a significant spike.
Parallel Run
To run in parallel, all we need to do is specify the number of processors we have available:
python Parallel_Wrapper.py --run_file test --num_proc 4
This launches 4 instances of the test executable. If you are watching the memory, you will notice that the 4 jobs do not all stop at the same instant; there is a delay between them. This is because the wrapper also monitors memory: to get an idea of how much memory each instance will use, it waits before launching the next one.
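The staggered launch described above can be illustrated with a minimal sketch. It is not the wrapper's actual code: the function name `staggered_launch`, the fixed `delay_s` pause, and the stand-in command are all assumptions made for illustration, and the memory sampling that the wrapper performs during the pause is left out.

```python
import subprocess
import sys
import time


def staggered_launch(num_proc, delay_s, cmd):
    """Start num_proc copies of cmd, pausing delay_s seconds between launches.

    Hypothetical helper: the real wrapper uses the pause to observe memory
    usage, which this sketch does not attempt.
    """
    procs = []
    start_times = []
    for i in range(num_proc):
        if i > 0:
            time.sleep(delay_s)  # wait before launching the next instance
        start_times.append(time.monotonic())
        procs.append(subprocess.Popen(cmd))
    # Wait for every instance to finish and collect the exit codes.
    exit_codes = [p.wait() for p in procs]
    return start_times, exit_codes


# Stand-in for the test executable: a short-lived Python process.
cmd = [sys.executable, "-c", "import time; time.sleep(0.2)"]
start_times, exit_codes = staggered_launch(4, 0.1, cmd)
```

Because each instance starts later than the previous one, instances doing the same amount of work also tend to finish later, which matches the delay visible when watching memory.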
Parallel Run II
Now that we have seen how to run in parallel, we can also handle the case where there are more jobs than processors. For instance, suppose I have 6 jobs to run but only 4 processors. In that case four jobs run at a time; as each job finishes and frees a processor, a new job is submitted until all the jobs are completed.
python Parallel_Wrapper.py --run_file test --num_proc 4 --num_runs 6
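The scheduling behaviour described here (never more than 4 jobs at once, with a new job starting as soon as a slot frees up) can be sketched with Python's standard `concurrent.futures` pool. This is an illustration under assumptions, not how Parallel_Wrapper.py is necessarily implemented: `run_jobs` and `fake_job` are hypothetical names, and the job body is a short sleep standing in for the test executable.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Bookkeeping used only to observe how many jobs run at the same time.
active = 0
peak = 0
lock = threading.Lock()


def fake_job(job_id):
    """Hypothetical stand-in for one run of the executable."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # pretend to do some work
    with lock:
        active -= 1
    return job_id


def run_jobs(num_proc, num_runs, job):
    """Run num_runs jobs with at most num_proc of them active at once."""
    with ThreadPoolExecutor(max_workers=num_proc) as pool:
        # The pool hands the next job id to a worker as soon as it is free.
        return list(pool.map(job, range(num_runs)))


results = run_jobs(num_proc=4, num_runs=6, job=fake_job)
```

Here `results` holds the six job ids in order, and `peak` never exceeds the number of workers.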
Parallel Run III
Running every instance of an executable in the same folder is rarely useful, because the output files they generate will often overwrite each other. Instead, we can give each instance its own folder to run in. This is specified with the --run_dir parameter followed by a string enclosed in quotation marks. The key to providing a folder name is to mark where the job number will be placed within it, which is done with the PAR keyword. For instance, if I provide the following command:
python Parallel_Wrapper.py --run_file test --num_proc 4 --num_runs 6 --run_dir DirPAR
I will end up with a total of six folders, one for each of the 6 unique jobs. The PAR keyword is replaced with the job id as follows:
- Dir0
- Dir1
- Dir2
- Dir3
- Dir4
- Dir5
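The substitution itself amounts to a plain string replacement. The sketch below reproduces the folder names listed above; `expand_par` is a hypothetical helper written for this tutorial, not a function taken from Parallel_Wrapper.py, and it assumes job ids start at 0 as in the list.

```python
def expand_par(template, job_id, keyword="PAR"):
    """Replace every occurrence of the keyword with the job id."""
    return template.replace(keyword, str(job_id))


# Six jobs, numbered from 0, each with its own run directory.
run_dirs = [expand_par("DirPAR", job_id) for job_id in range(6)]
print(run_dirs)
```

Since every occurrence of the keyword is replaced, a folder name should contain the letters PAR only where the job id is meant to appear.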
Parallel Run IV
It is also worth mentioning that many executables take command line arguments. If you want to pass command line options through to an executable, you can do so. Our example test code takes two arguments to specify folders it needs to create.
python Parallel_Wrapper.py --run_file test --num_proc 4 --num_runs 6 --run_dir DirPAR --run_args "-d DirPAR/FOLDER"
This produces the same folder structure as before, but with an extra folder inside each of the unique directories:
- Dir0->FOLDER
- Dir1->FOLDER
- Dir2->FOLDER
- Dir3->FOLDER
- Dir4->FOLDER
- Dir5->FOLDER
Notice that the PAR keyword can also be used in the command line arguments and the correct substitution will be made.
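To make the substitution in the arguments concrete, here is a minimal sketch of how a per-job command line could be assembled from the --run_args string. The helper `build_command` and the "./test" invocation path are assumptions for illustration; the wrapper's own way of calling the executable may differ.

```python
import shlex


def build_command(run_file, run_args, job_id, keyword="PAR"):
    """Build the argument list for one job, replacing the keyword with its id."""
    substituted = run_args.replace(keyword, str(job_id))
    # shlex.split honours quoting, so an argument containing spaces stays whole.
    return [run_file] + shlex.split(substituted)


# One command per job; "./test" is a hypothetical path to the executable.
commands = [build_command("./test", "-d DirPAR/FOLDER", job_id) for job_id in range(6)]
print(commands[3])
```

Each resulting list could be passed to `subprocess.run` as is, so job 3 would receive `-d Dir3/FOLDER`.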