Model Run HowTo

OpenM++ model run overview

It is recommended to start with the single desktop version of openM++.

OpenM++ models can be run on Windows and Linux, on a single desktop computer, on multiple computers over a network, or in an HPC cluster or cloud environment (e.g. Google Cloud, Microsoft Azure, Amazon).

You need the cluster version of openM++ to run a model on multiple computers in your network, in the cloud, or in an HPC cluster environment. OpenM++ uses MPI to run models on multiple computers.

By default an openM++ model runs with one sub-value and in a single thread, which is convenient for debugging or studying your model. The following options are available to run an openM++ model:

"default" run: one sub-value and a single thread
"desktop" run: multiple sub-values and multiple threads
"restart" run: finish a model run after a previous failure (e.g. a power outage)
"task" run: multiple sets of input data (a.k.a. multiple "scenarios" in Modgen), multiple sub-values and threads
"cluster" run: multiple sub-values, threads and model process instances running on a LAN or in the cloud (requires MPI)
"cluster task" run: same as "cluster" plus multiple sets of input data (requires MPI)

Please also check Model Run: How model finds input parameters for more details.

Sub-values: sub-samples, members, replicas

The terms "simulation member", "replica" and "sub-sample" are often used interchangeably in micro-simulation conversations, depending on context. To avoid terminology discussions, openM++ uses "sub-value" as the equivalent of all of the above; some older pages of this wiki may still say "sub-sample" instead.

Default run: simplest

OpenM++ Model run: Default

If no options are specified to run the model, then:

all parameters come from the default input data set
a single thread is used for modeling
only one sub-value is calculated

modelOne.exe

This is the simplest way to debug your model.

Desktop run: model run on single computer

OpenM++ Model run: Desktop

If only a single computer is available, the user can specify:

which set of input data to use (by set name or id)
the number of sub-values to calculate
the number of modeling threads to use

modelOne.exe -OpenM.SetName modelOne -OpenM.SubValues 16 -OpenM.Threads 4

After the model run has completed, the user can repeat it with modified parameter values:

model.exe -Parameter.Ratio 0.7 -OpenM.BaseRunId 7 -OpenM.SubValues 16 -OpenM.Threads 4

Run the model with modified parameter(s)

The command above runs the model with a new value for the parameter Ratio = 0.7 and takes the rest of the parameters from a previous model run (a.k.a. the "base" run). The base run can be identified by run id (7 in the example above), by run digest or by run name. Please see Model Run: How model finds input parameters for more details.
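The command line above can also be assembled from a script. The following is a minimal Python sketch, not part of openM++: the helper name `model_args` is hypothetical, and it only relies on the `-OpenM.<Option> value` and `-Parameter.<Name> value` pattern shown in the commands on this page.

```python
def model_args(exe, openm=None, parameters=None):
    """Build a model command line as a list of strings.

    exe        -- model executable name, e.g. "model.exe"
    openm      -- dict of run options, emitted as -OpenM.<Option> <value>
    parameters -- dict of parameter overrides, emitted as -Parameter.<Name> <value>
    """
    args = [exe]
    for name, value in (parameters or {}).items():
        args += [f"-Parameter.{name}", str(value)]
    for name, value in (openm or {}).items():
        args += [f"-OpenM.{name}", str(value)]
    return args


# Rebuild the example: Ratio = 0.7, all other parameters from base run 7.
cmd = model_args(
    "model.exe",
    openm={"BaseRunId": 7, "SubValues": 16, "Threads": 4},
    parameters={"Ratio": 0.7},
)
print(" ".join(cmd))

# To actually launch the model (assuming the executable is in the
# current directory or on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than as one string, avoids shell quoting problems when a value contains spaces.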

Restart run: finish model run after previous failure

If a previous model run did not complete (e.g. due to a power failure or insufficient disk space), you can restart it by specifying its run id:

modelOne.exe -OpenM.RestartRunId 11

Task run: multiple sets of input data

OpenM++ Model run: Task

A modeling task consists of multiple sets of input data and can be run in batch mode. For example, it makes sense to create a modeling task to run the RiskPaths model from R with 800 sets of input data to study Childlessness by varying:

Age baseline for first union formation
Relative risks of union status on first pregnancy

RiskPaths.exe -OpenM.TaskName Childlessness -OpenM.SubValues 8 -OpenM.Threads 4

A run of such a modeling task will read 800 input sets with set ids [1, 800] and produce 800 model run outputs with run ids [801, 1600], respectively.
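The correspondence between set ids and run ids in this example can be written out explicitly. The sketch below is illustrative only: the helper `expected_run_ids` is hypothetical, and it assumes run ids are handed out sequentially in set order, as in the example. In a real database the run ids are whatever the database assigns, so they should be read from the task run results rather than computed.

```python
def expected_run_ids(set_ids, first_run_id):
    """Map each input set id to the run id it would get if run ids
    were assigned sequentially, starting at first_run_id."""
    return {set_id: first_run_id + i for i, set_id in enumerate(set_ids)}


# The example: 800 input sets with ids 1..800, first run id 801.
runs = expected_run_ids(range(1, 801), first_run_id=801)
print(runs[1], runs[800])  # 801 1600
```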

Dynamic task run: wait for input data

It is possible to append new sets of input data to the task as it runs. That allows you to use an optimization method rather than simply calculating all possible combinations of input parameters. In that case the modeling task is not completed automatically but waits for an external "task can be completed" signal. For example:

#
# pseudo script to run RiskPaths and find an optimal solution for the Childlessness problem
# you can use R or any other tool of your choice
#
# # create Childlessness task
# # run loop until you are satisfied with the results

RiskPaths.exe -OpenM.TaskName Childlessness -OpenM.TaskWait true

# # find your modeling task run id, e.g.: 1234
# # analyze model output tables
# # if results are not optimal
#   # then append a new set of input data to task "Childlessness" and continue the loop
#   # else signal to the RiskPaths model "task can be completed":
#   #   UPDATE task_run_lst SET status = 'p' WHERE task_run_id = 1234;
#
# Done.
#
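The final "task can be completed" step of the pseudo script could be issued from Python as sketched below. This assumes the model database is a SQLite file that your script can open; the file name `RiskPaths.sqlite` in the usage comment is a placeholder, not something this page specifies. The table name, column names and status value are taken from the UPDATE statement in the pseudo script; the helper name `signal_task_can_complete` is hypothetical.

```python
import sqlite3


def signal_task_can_complete(con, task_run_id):
    """Issue the UPDATE from the pseudo script above.

    con         -- an open sqlite3 connection to the model database
    task_run_id -- id of the waiting task run, e.g. 1234
    Returns the number of rows changed (0 means no such task run).
    """
    with con:  # commit on success, roll back on error
        cur = con.execute(
            "UPDATE task_run_lst SET status = 'p' WHERE task_run_id = ?",
            (task_run_id,),
        )
    return cur.rowcount


# Usage (placeholder database path, assumed for illustration only):
#   con = sqlite3.connect("RiskPaths.sqlite")
#   if signal_task_can_complete(con, 1234) == 0:
#       print("task run 1234 not found")
#   con.close()
```

Checking the returned row count is a cheap way to catch a mistyped task run id, since an UPDATE that matches no rows does not raise an error.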

Cluster run: model run on multiple computers

OpenM++ Model run: Cluster

MPI is used to run the model on multiple computers over a network, in the cloud or on an HPC cluster. For example, to run 4 instances of modelOne.exe with 2 threads each and compute 16 sub-values:

mpiexec -n 4 modelOne.exe -OpenM.Threads 2 -OpenM.SubValues 16

Please note that "mpiexec -n 4 ...." as above is suitable for testing only; you should use your cluster tools for a real model run.
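To get a feel for the numbers in the modelOne example (16 sub-values, 4 processes, 2 threads each), the sketch below computes a simple even split of sub-values across processes. The even split is an assumption made for illustration, and the helper `split_sub_values` is hypothetical; openM++ itself decides how sub-values are actually allocated to processes and threads.

```python
def split_sub_values(total, processes):
    """Evenly split `total` sub-values across `processes` model instances.
    Any remainder is spread one-by-one over the first processes."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    base, extra = divmod(total, processes)
    return [base + (1 if i < extra else 0) for i in range(processes)]


# 16 sub-values over 4 processes: 4 sub-values per process.
print(split_sub_values(16, 4))  # [4, 4, 4, 4]
```

Under this assumption each process holds 4 sub-values and has 2 threads, so a process would need at least two passes to compute its share.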

Cluster task: run modeling task on multiple computers

A modeling task with 1000x input data sets can take a long time to run, so it is recommended to use a cluster (multiple computers over a network) or a cloud, such as Google Compute Engine, for that. For example, the RiskPaths task above can be calculated much faster if 200 servers are available to run it:

mpiexec -n 200 RiskPaths.exe -OpenM.TaskName Childlessness -OpenM.SubValues 16 -OpenM.Threads 4

Please note that "mpiexec -n 200 ...." as above is suitable for testing only; you should use your cluster tools for a real model run.
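For test runs like the one above, the MPI launcher prefix can be added to an existing model command line from a script. This is a minimal sketch with a hypothetical helper `mpi_command`; it uses only the `mpiexec -n <count>` form shown on this page and, like that form, is meant for testing rather than for submitting jobs through real cluster tools.

```python
def mpi_command(processes, model_cmd, launcher="mpiexec"):
    """Prefix a model command line (list of strings) with
    '<launcher> -n <processes>'."""
    if processes < 1:
        raise ValueError("processes must be at least 1")
    return [launcher, "-n", str(processes)] + list(model_cmd)


# Rebuild the cluster task example from this section.
task_cmd = mpi_command(
    200,
    ["RiskPaths.exe", "-OpenM.TaskName", "Childlessness",
     "-OpenM.SubValues", "16", "-OpenM.Threads", "4"],
)
print(" ".join(task_cmd))
```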

Dynamic task: you can use the -OpenM.TaskWait true argument, as described above, to change the task dynamically as it runs.