Model Run and Parameters - openmpp/openmpp.github.io GitHub Wiki
Model run (execution of the model) consists of the following steps:
- initializing of model process(es) with model run options
- connecting to database and creating "model run" with run_id and run_name
- finding set of input parameters and preparing it for the run
- reading model input parameters
- simulation of sub-values
- writing output sub-values to output tables in database
- aggregating sub-values using Output Expressions
Results of model run are stored in database under unique integer "run_id" and include all model parameters, options and output result tables. You can always find full set of model input and output by run id.
OpenM++ models can be run on Windows and Linux platforms, on a single desktop computer, on multiple computers over network, in HPC cluster or cloud environment (Google Cloud, Microsoft Azure, Amazon,...). Because openM++ runtime library hides all that complexity from the model, we can safely assume model is a single executable on local machine. Please check Model Run: How to Run the Model for more details.
The terms "simulation member", "replica" and "sub-sample" are often used interchangeably in micro-simulation conversations, depending on context. To avoid terminology discussion openM++ uses "sub-value" as equivalent of all of the above; some older pages of our wiki may still contain "sub-sample" instead.
There are two kinds of model output tables:
- accumulators table: output sub-values (similar to Modgen sub-samples)
- expressions table: model output value calculated as accumulators aggregated across sub-values (e.g. mean or CV or SE)
All output accumulator tables always contain the same number of sub-values, for example model run:
model.exe -OpenM.SubValues 16
will create 16 sub-values for each accumulator in each output accumulator table.
OpenM++ parameters can also contain sub-values. Parameter sub-values are not required; it is the user's choice to run the model and supply sub-values for some parameters.
For example, if user wants to describe statistical uncertainty of parameter SalaryByYearByProvince then csv file with 16 sub-values can be supplied to run the model:
model.exe -OpenM.SubValues 16 -SubFrom.SalaryByYearByProvince csv -OpenM.ParamDir C:\MyCsv\
Note: To simplify the diagram below we omit sub-values from the picture. In real database there are multiple sub-values for parameters and accumulators; each sub-value is identified by sub_id column.
Model searches for input parameter values in the following order:
- use parameter value specified as command line argument
- use parameter value specified inside of ini-file [Parameter] section
- use parameter value from profile_option table
- read parameter.csv or .tsv file from "OpenM.ParamDir" directory
- import parameter value from other model parameter or other model output table
- use parameter value from set of input parameters in database: workset
- use same value as in previous model run: values from "base" run
- use parameter value from default set of input parameters in database: default workset
- some parameters, e.g. number of sub-values, may have default value
In any case all input parameters are copied under new run id before simulation starts. That copy guarantees a full set of input parameters for each model run in database.
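The search order above can be pictured as a "first source that has the value wins" lookup. The following Python sketch is only an illustration of that idea: the source names, the `find_parameter` function and the sample values are hypothetical and are not part of openM++.

```python
from typing import Any, Mapping, Optional, Sequence, Tuple

# Hypothetical source names, listed in the search order described above.
SEARCH_ORDER: Sequence[str] = (
    "command line",
    "ini-file",
    "profile_option",
    "csv or tsv file",
    "import",
    "workset",
    "base run",
    "default workset",
)


def find_parameter(
    name: str, sources: Mapping[str, Mapping[str, Any]]
) -> Optional[Tuple[str, Any]]:
    """Return (source name, value) from the first source that has the parameter."""
    for source_name in SEARCH_ORDER:
        values = sources.get(source_name, {})
        if name in values:
            return source_name, values[name]
    return None  # not found in any source: only a built-in default could apply


# Made-up sample data, for illustration only.
sources = {
    "command line": {"Ratio": 0.7},
    "workset": {"Ratio": 0.5, "Age": [0, 21]},
    "default workset": {"Ratio": 0.1, "Age": [0, 18], "Provinces": ["ON", "QC"]},
}

print(find_parameter("Ratio", sources))      # ('command line', 0.7)
print(find_parameter("Age", sources))        # ('workset', [0, 21])
print(find_parameter("Provinces", sources))  # ('default workset', ['ON', 'QC'])
```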
There are many options which control model run, e.g.: number of sub-values, number of threads, etc. OpenM++ model gets run options in the following order:
- as command line arguments
- from model run options ini-file
- from database profile_option tables
- use default values
Each option has a unique key associated with it, e.g. "Parameter.RandomSeed" is model input parameter "RandomSeed", which is, most likely, the random generator starting seed. You can use this key to specify model parameter on command line, in ini-file or database. For example:
modelOne.exe -Parameter.RandomSeed 123 -ini my.ini
would run modelOne model with random seed = 123 and other options from my.ini file.
Please see OpenM++ Model Run Options to find out more.
Database can contain multiple versions of model input parameter values. User can edit (change values of) input parameter(s) and save them as "working set of model input parameters" (a.k.a. "workset" or scenario).
- each set of parameters has unique "set id" and unique "set name"
- each model must have at least one full set of input parameters populated with default values (default set)
- default input set is the first set of model parameters (first means set with minimal set id)
Most of the model parameters do not change between simulations and only a few vary. It is convenient to select all unchanged parameters from previous model run ("base" run). In order to do that user can:
- specify "base" model run to re-use parameters values
- create input set of parameters as "based on previous model run" and include only updated parameters in that input set
Model will use parameters values from command line, .csv or .tsv files, etc. (as described above) and:
- if input set (workset) specified then select all parameters which do exist in that workset
- if "base" model run specified then select the rest of parameters values from that previous model run
- if there is no "base" run then select model parameters from model default workset
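As a rough illustration of the selection rules above, the full parameter set of a run can be thought of as a layered merge. The sketch below is an assumption-laden model of that behaviour, not openM++ code: `select_parameters` is a hypothetical helper and all parameter values are made up.

```python
from typing import Any, Dict, Mapping, Optional


def select_parameters(
    default_workset: Mapping[str, Any],
    workset: Optional[Mapping[str, Any]] = None,
    base_run: Optional[Mapping[str, Any]] = None,
    overrides: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the full parameter set for a new run (illustrative only)."""
    # Lowest priority: the base run if one is given, otherwise the default workset.
    selected = dict(base_run if base_run is not None else default_workset)
    # Parameters that exist in the workset replace the fallback values.
    selected.update(workset or {})
    # Values from command line, ini-file, csv files etc. win over everything else.
    selected.update(overrides or {})
    return selected


# Made-up data: a default workset, a previous run and a partial workset.
default_workset = {"Ratio": 0.1, "Age": ["[0,18]", "19+"], "Provinces": ["ON", "QC"]}
run_42 = {"Ratio": 0.5, "Age": ["[0,18]", "19+"], "Provinces": ["BC", "NS"]}
age_input_values = {"Age": ["[0,21]", "22+"]}

result = select_parameters(
    default_workset,
    workset=age_input_values,
    base_run=run_42,
    overrides={"Ratio": 0.7},
)
print(result)
# {'Ratio': 0.7, 'Age': ['[0,21]', '22+'], 'Provinces': ['BC', 'NS']}
```

Because every model run stores a full copy of its input parameters, using a previous run as the fallback layer always yields a complete parameter set.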
If user runs the model without any arguments:
modelOne.exe
then input parameters are selected from default set, which is the first input data set of that model.
To run the model with input data other than default user can specify set id or workset name:
modelOne.exe -OpenM.SetId 20
modelOne.exe -OpenM.SetName "My Set of Input Parameters"
assuming workset with set_id = 20 and set with name My Set of Input Parameters exist in model database.
It is often convenient to re-use parameters from previous model run:
model.exe -Parameter.Ratio 0.7 -OpenM.BaseRunId 42
As a result model will be using the same parameters values as for run with run_id = 42, except for parameter Ratio = 0.7.
For more details please see below: How to specify model base run.
It is also possible to specify value of any scalar parameter as command line argument, i.e.:
model.exe -Parameter.Ratio 0.7
There is an example of such technique at Run model from R: simple loop over model parameter page, where we are using NewCaseBased model to study the effect of Mortality Hazard input parameter on Duration of Life output:
for (mortalityValue from 0.014 to 0.109 by step 0.005)
{
# run the model
NewCaseBased.exe -Parameter.MortalityHazard mortalityValue
}
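The pseudocode loop above could be expressed in Python roughly as follows. This is a sketch, not the script from the R page: the `mortality_commands` helper is hypothetical, and it only builds the command lines instead of launching the model. Integer steps are used to avoid floating point drift in the 0.005 increment.

```python
from typing import List


def mortality_commands(exe: str = "NewCaseBased.exe") -> List[List[str]]:
    """Build one command line per MortalityHazard value from 0.014 to 0.109."""
    commands = []
    # Iterate in thousandths (14, 19, ..., 109) to get exact 0.005 steps.
    for thousandths in range(14, 110, 5):
        value = thousandths / 1000
        commands.append([exe, "-Parameter.MortalityHazard", f"{value:.3f}"])
    return commands


commands = mortality_commands()
print(len(commands))  # 20
print(commands[0])    # ['NewCaseBased.exe', '-Parameter.MortalityHazard', '0.014']
print(commands[-1])   # ['NewCaseBased.exe', '-Parameter.MortalityHazard', '0.109']
```

Each command could then be executed with `subprocess.run(command, check=True)` on a machine where the model executable is available.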
If we want to run the model with N sub-values (a.k.a. sub-samples) and want Grade parameter sub-values to be created as [0,...,N-1] then:
model.exe -OpenM.SubValues 10 -SubFrom.Grade iota
as a result sub-values of parameter Grade would be: [0, ..., 9]
Also any scalar parameter can be defined in model ini-file, i.e.:
model.exe -ini my.ini
; inside of my.ini file:
;
[Parameter]
Z_Parameter = B ; string parameter
SomeInt = 1234 ; integer parameter
OrLogical = true ; boolean parameter
Anumber = 9.876e5 ; float parameter
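To see how such a [Parameter] section maps to typed values, the same text can be read with Python's standard `configparser` module. This is only an approximation under stated assumptions: the openM++ ini-file reader is a separate implementation and its exact rules may differ from `configparser`.

```python
import configparser

# Same content as the my.ini example above.
INI_TEXT = """
; inside of my.ini file:
;
[Parameter]
Z_Parameter = B       ; string parameter
SomeInt     = 1234    ; integer parameter
OrLogical   = true    ; boolean parameter
Anumber     = 9.876e5 ; float parameter
"""

# Treat "; ..." after a value as an inline comment.
parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
parser.optionxform = str  # keep parameter names case-sensitive
parser.read_string(INI_TEXT)

section = parser["Parameter"]
print(section["Z_Parameter"])           # B
print(section.getint("SomeInt"))        # 1234
print(section.getboolean("OrLogical"))  # True
print(section.getfloat("Anumber"))      # 987600.0
```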
Another way to supply value of scalar parameter(s) is through profile_option database table. For example:
model.exe -OpenM.SetId 20 -OpenM.Profile MyProfile
SELECT * FROM profile_lst;
profile_name
------------
MyProfile
SELECT * FROM profile_option;
profile_name   option_key             option_value
-------------  ---------------------- ------------
MyProfile      Parameter.RandomSeed   4095
It is also possible to supply some (or even all) model parameters as csv or tsv file(s). For example:
model.exe -OpenM.ParamDir C:\my_csv
If directory C:\my_csv\ exists and contains ParameterName.csv or ParameterName.tsv file then model will use it for parameter values.
Parameter directory can be specified as command-line argument or as ini-file entry.
On the picture above the model is run as:
model.exe -ini my.ini -OpenM.SetId 20
and my.ini file contains:
[OpenM]
ParamDir = C:\my_csv\
As a result model.exe will read values of "Sex" parameter from C:\my_csv\Sex.csv:
sub_id,dim0,param_value
0, F, true
0, M, false
Together with csv files you can also supply parameter value note file(s) to describe scenario data values in each model language.
Parameter value note files must be located in the same csv directory and named as: ParameterName.LANG-CODE.md.
For example, C:\my_csv\Sex.EN.md contains English notes for Sex parameter values:
Sex parameter values in this scenario contain indicators of increased gender-specific hazards.
It is also possible to use TSV (tab separated values) files instead of CSV files, for example Sex.tsv parameter values:
sub_id dim0 param_value
0 F true
0 M false
Or even have enum id's in CSV or TSV files instead of codes, for example C:\my_csv\Sex.id.csv can be:
sub_id,dim0,param_value
0, 0, true
0, 1, false
Model will automatically detect file format by extension:
- ParameterName.csv: CSV file with enum codes in dimension columns
- ParameterName.tsv: TSV file with enum codes in dimension columns
- ParameterName.id.csv: CSV file with enum id's in dimension columns
- ParameterName.id.tsv: TSV file with enum id's in dimension columns
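The extension rules listed above can be sketched as a small lookup. The `detect_format` function below is hypothetical, and matching extensions case-insensitively is an assumption of this sketch rather than documented openM++ behaviour.

```python
from pathlib import PureWindowsPath
from typing import Tuple


def detect_format(path: str) -> Tuple[str, str, bool]:
    """Return (parameter name, delimiter, uses_enum_ids) for a parameter file."""
    file_name = PureWindowsPath(path).name
    lower = file_name.lower()  # assumption: extension match is case-insensitive
    # Longer suffixes come first so that ".id.csv" is not mistaken for ".csv".
    for suffix, delimiter, uses_ids in (
        (".id.csv", ",", True),
        (".id.tsv", "\t", True),
        (".csv", ",", False),
        (".tsv", "\t", False),
    ):
        if lower.endswith(suffix):
            return file_name[: -len(suffix)], delimiter, uses_ids
    raise ValueError(f"not a parameter csv or tsv file: {path}")


print(detect_format(r"C:\my_csv\Sex.csv"))     # ('Sex', ',', False)
print(detect_format(r"C:\my_csv\Sex.id.csv"))  # ('Sex', ',', True)
print(detect_format(r"C:\my_csv\Sex.tsv"))     # ('Sex', '\t', False)
```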
You can use OpenM.IdCsv true model run option if your CSV or TSV files do not have .id.csv or .id.tsv extension:
model.exe -OpenM.SetId 20 -OpenM.IdCsv true
In that case model will assume parameter.csv file contains enum id's instead of enum codes.
Format of parameter.csv is based on RFC 4180 with some simplification:
- space-only lines silently ignored
- end of line can be CRLF or LF
- values are trimmed unless they are " double quoted "
- multi-line string values not supported
If parameter is boolean then the following values are expected (not case sensitive):
- "true" or "t" or "1"
- "false" or "f" or "0"
Important: Header line must include all dimension names, in ascending order, without spaces, e.g.: sub_id,dim0,dim1,param_value
or, in case of TSV file: sub_id dim0 dim1 param_value
where header fields should be separated by TAB.
Parameter.csv file must contain all values, e.g. if parameter has 123456 values then csv must have all 123456 lines + header. Sorting order of lines is not important.
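The format rules above can be illustrated with Python's standard `csv` module. This is a simplified sketch and not the openM++ reader: `parse_bool` and `read_bool_parameter` are hypothetical helpers, only boolean parameters are handled, and, unlike the rule above, double quoted values are trimmed as well.

```python
import csv
import io
from typing import Dict, Tuple

TRUE_VALUES = {"true", "t", "1"}
FALSE_VALUES = {"false", "f", "0"}


def parse_bool(text: str) -> bool:
    """Parse a boolean parameter value, ignoring case."""
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean value: {text!r}")


def read_bool_parameter(text: str) -> Dict[Tuple[int, Tuple[str, ...]], bool]:
    """Read csv text into {(sub_id, dimension items): value}."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Space-only lines are silently ignored.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    header = [name.strip() for name in rows[0]]
    if header[0] != "sub_id" or header[-1] != "param_value":
        raise ValueError(f"unexpected header: {header}")
    values = {}
    for row in rows[1:]:
        cells = [cell.strip() for cell in row]  # simplification: trims quoted values too
        if len(cells) != len(header):
            raise ValueError(f"wrong number of columns: {row}")
        values[(int(cells[0]), tuple(cells[1:-1]))] = parse_bool(cells[-1])
    return values


SEX_CSV = "sub_id,dim0,param_value\r\n0, F, true\n\n0, M, false\n"
print(read_bool_parameter(SEX_CSV))
# {(0, ('F',)): True, (0, ('M',)): False}
```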
If user wants to supply up to 32 sub-values of "Sex" parameter then Sex.csv file looks like:
sub_id,dim0,param_value
0, F, true
0, M, false
1, F, true
1, M, true
.................
31, F, false
31, M, true
Important: Presence of multiple sub-values in csv file (or in database) does not mean model will be using all parameter sub-values. Only explicitly specified parameter(s) receive sub-values.
For example, if user runs the model 9 times:
model.exe -OpenM.SubValues 8
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv -SubValues.Sex default
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv -SubValues.Sex 17
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv -SubValues.Sex [24,31]
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv -SubValues.Sex 1,3,5,7,9,11,13,15
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.Sex csv -SubValues.Sex xAAAA
model.exe -OpenM.SubValues 8 -OpenM.ParamDir C:\my_csv -SubFrom.GeoGroup csv -SubValues.GeoGroup 1,3,5,7,9,11,13,15
- "Sex" parameter expected to be in database and no sub-values used
- "Sex" parameter value is selected as "default" (sub_id=0) from C:\my_csv\Sex.csv, if .csv file exists
- "Sex" parameter value is selected as "default" (sub_id=0) from C:\my_csv\Sex.csv, .csv file must exist
- "Sex" parameter value is selected as sub_id = 17 from C:\my_csv\Sex.csv
- "Sex" parameter using sub-values [0,7] from C:\my_csv\Sex.csv
- "Sex" parameter using sub-values [24,31] from C:\my_csv\Sex.csv
- "Sex" parameter using sub-values 1,3,5,7,9,11,13,15 from C:\my_csv\Sex.csv
- "Sex" parameter using sub-values 1,3,5,7,9,11,13,15 from C:\my_csv\Sex.csv (bit mask)
- all parameters of GeoGroup using sub-values 1,3,5,7,9,11,13,15 from .csv files from C:\my_csv\ directory
"Default" sub-value id can be explicitly defined for input parameter by the person who published input set of parameters (workset). If "default" sub_id is not defined for that parameter then sub_id=0 is assumed. Sub-value id's in the input set of parameters (in workset) can be any integers (they can be negative and do not even have to be sequential). For example, if RatioByProvince parameter has 32 sub-values then typically sub_id's are [0,31], but they can be [-10, -8, -6, -4, -2, 0, 2, 4, ..., 52] and default sub_id can be -10.
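The forms of the SubValues option used in the examples above (default, single id, [first,last] range, comma-separated list and hexadecimal bit mask) can be decoded as in the sketch below. `select_sub_ids` is a hypothetical helper written only from those examples; it does not validate the result against the number of sub-values, and the real option parser may accept other forms.

```python
from typing import List, Optional


def select_sub_ids(option: Optional[str], count: int, default_id: int = 0) -> List[int]:
    """Translate a SubValues-style option value into a list of sub-value ids."""
    if option is None:
        return list(range(count))  # no option given: sub-values [0, count-1]
    text = option.strip()
    if text == "default":
        return [default_id]  # the "default" sub-value, sub_id=0 unless defined otherwise
    if text.startswith("[") and text.endswith("]"):
        first, last = (int(part) for part in text[1:-1].split(","))
        return list(range(first, last + 1))  # inclusive range
    if text[:1] in ("x", "X"):
        mask = int(text[1:], 16)  # hexadecimal bit mask, lowest bit is sub_id 0
        return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
    return [int(part) for part in text.split(",")]  # single id or explicit list


print(select_sub_ids(None, 8))       # [0, 1, 2, 3, 4, 5, 6, 7]
print(select_sub_ids("default", 8))  # [0]
print(select_sub_ids("17", 8))       # [17]
print(select_sub_ids("[24,31]", 8))  # [24, 25, 26, 27, 28, 29, 30, 31]
print(select_sub_ids("xAAAA", 8))    # [1, 3, 5, 7, 9, 11, 13, 15]
```

The bit mask case shows why xAAAA and 1,3,5,7,9,11,13,15 are equivalent: hexadecimal AAAA is binary 1010101010101010, where exactly the odd-numbered bits are set.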
Important: Number of sub-values in csv must be at least as many as user requested.
In example above Sex.csv contains 32 sub-values and user cannot run model with more than 32 sub-values.
If input parameter is specified as "importable" by model developer then value(s) can be imported from run values of upstream model parameter or output table.
For example if model developer of BigModel specified:
import Phi (RedModel.RedPhi) sample_dimension= off;
import Zet (SunModel.SunZet) sample_dimension= off;
And model user is running BigModel as:
BigModel.exe -Import.All true
Then:
- value of BigModel parameter Phi must be imported from last run of RedModel parameter RedPhi
- value of BigModel parameter Zet must be imported from last run of SunModel output table SunZet
There are multiple options to control model import. For example if user runs BigModel 9 times:
BigModel.exe -Import.All true
BigModel.exe -Import.SunModel true
BigModel.exe -ImportRunDigest.SunModel abcdefghef12345678
BigModel.exe -ImportRunId.SunModel 123
BigModel.exe -ImportRunName.SunModel GoodRun
BigModel.exe -ImportDigest.SunModel 87654321fedcba
BigModel.exe -ImportId.SunModel 456
BigModel.exe -ImportExpr.SunZet expr4
BigModel.exe -ImportDatabase.SunModel "Database=../NewSunModel.sqlite;OpenMode=ReadOnly;"
- Import all importable parameters from last successful run of upstream models
- Import all parameters importable from SunModel using values of last successful run of SunModel
- Import all parameters importable from SunModel using values of run where digest = abcdefghef12345678
- Import all parameters importable from SunModel using values of run where id = 123
- Import all parameters importable from SunModel using values of last successful run where run name = GoodRun
- Import all parameters importable from SunModel where model digest is 87654321fedcba using values of last successful run
- Import all parameters importable from SunModel where model id = 456 using values of last successful run
- Import parameter Zet from SunModel output table SunZet expression expr4 using values of last successful run
- Import all parameters importable from SunModel from database ../NewSunModel.sqlite
Import options can be combined with sub-values options if model user wants to select specific sub-values from upstream model parameter.
Default database to search for upstream model:
- if upstream model SunModel exists in current model database then it is imported from current database
- else it must be default upstream model SQLite database: SunModel.sqlite
Most of the model parameters do not change between simulations and only a few parameters vary. In that case it is convenient to select unchanged parameters from previous model run ("base" run).
Base run can be identified by run_id or run digest or run name.
Please note: model run names are not unique and if there are multiple runs in database with the same name then the first run is selected:
SELECT MIN(run_id) FROM run_lst WHERE run_name = 'Default model run';
Input set of model parameters (workset) can be created as "based on existing run" and store only a small number of model parameters; all the rest will be selected from "base" run by run_id.
On the picture above the command line to run the model is:
model.exe -ini my.ini -OpenM.SetId 20
and input set with id 20 is defined as "based on run" with id = 11:
SELECT set_id, set_name, base_run_id FROM workset_lst WHERE set_id = 20;
set_id  set_name             base_run_id
------  -------------------  -----------
20      set_based_on_run_11  11
Because workset with id = 20 does not include "Provinces" input parameter, those values are selected from existing model run by run_id = 11:
SELECT dim0, param_value FROM Provinces WHERE run_id = 11;
dim0 value
---- -----
0 ON
1 QC
Note: sql above is deliberately simplified; actual database table names, column names and queries are a bit more complex.
It is possible to explicitly specify model base run to select input parameters. For example:
model.exe -Parameter.Ratio 0.7 -OpenM.SetName "Age Input Values" -OpenM.BaseRunId 42
Model will use parameter Ratio = 0.7 and select all parameters which do exist in Age Input Values workset:
SELECT dim0, param_value FROM Age WHERE set_name = 'Age Input Values';
dim0 value
---- -----
0 [0,21]
1 22+
.... select all other parameters where parameter exist in 'Age Input Values' ....
And the rest of model parameters are selected from base run:
SELECT dim0, param_value FROM Provinces WHERE run_id = 42;
dim0 value
---- -----
0 BC
1 NS
It is also possible to use run digest or run name to identify "base" model run:
model.exe -Parameter.Ratio 0.7 -OpenM.BaseRunDigest 5dc848891ea57db19d8dc08ec7a30804
model.exe -Parameter.Ratio 0.7 -OpenM.BaseRunName "My base run of the Model"
Please keep in mind that model run name may not be unique, and if database contains multiple model runs with the same name then the first run will be selected.
If we want to run the model with multiple sub-values (a.k.a. sub-samples) and want "RatioByProvince" parameter sub-values selected from database:
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db
Model will select "RatioByProvince" parameter sub-values from default workset or from base run, if there is no RatioByProvince parameter in default workset. Database must contain at least 8 sub-values for "RatioByProvince".
model.exe -OpenM.SubValues 8 -SubFrom.GeoGroup db
For GeoGroup of parameters model will select sub-values from default workset or from base run, if there is no such parameter in default workset. Database must contain at least 8 sub-values for all parameters of GeoGroup.
For example:
SELECT sub_id, dim0, param_value FROM RatioByProvince WHERE run_id = 11;
sub_id dim0 value
------ ---- -----
0 0 1.00
0 1 1.01
1 0 1.02
1 1 1.03
2 0 1.04
2 1 1.05
............
31 0 1.31
31 1 1.32
In that case the first 8 sub-values will be selected, with sub_id between 0 and 7.
There are multiple options to specify which sub-values to select from database, for example:
model.exe -OpenM.SubValues 8
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince [24,31]
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince 1,3,5,7,9,11,13,15
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince xAAAA
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince default
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince 17
model.exe -OpenM.SubValues 8 -SubFrom.GeoGroup db -SubValues.GeoGroup 17
- "RatioByProvince" parameter expected to be in database and no sub-values used
- "RatioByProvince" parameter using sub-values [0,7] from database
- "RatioByProvince" parameter using sub-values [24,31] from database
- "RatioByProvince" parameter using sub-values 1,3,5,7,9,11,13,15 from database
- "RatioByProvince" parameter using sub-values 1,3,5,7,9,11,13,15 from database (bit mask)
- "RatioByProvince" parameter value is selected as "default" (sub_id=0) from database
- "RatioByProvince" parameter value is selected as sub_id = 17 from database
- all parameters of GeoGroup are selected as sub_id = 17 from database
"Default" sub-value id can be explicitly defined for input parameter by the person who published input set of parameters (workset). If "default" sub_id is not defined for that parameter then sub_id=0 is assumed. Sub-value id's in the input set of parameters (in workset) can be any integers (they can be negative and do not even have to be sequential). For example, if RatioByProvince parameter has 32 sub-values then typically sub_id's are [0,31], but they can be [-10, -8, -6, -4, -2, 0, 2, 4, ..., 52] and default sub_id can be -10.
On the other hand, in model run results sub_id is always [0,N-1] for run parameters and output tables. For example:
model.exe -OpenM.SubValues 8 -SubFrom.RatioByProvince db -SubValues.RatioByProvince [24,31]
"RatioByProvince" parameter in model run will have sub_id column values: [0,7].
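The renumbering can be pictured as a simple mapping from run sub_id to the source sub_id it was copied from. The sketch below assumes the selected sub-values keep the order in which they were requested; that ordering is an assumption of this illustration, not something the text above states.

```python
# Source sub-value ids requested as [24,31] (inclusive range).
selected_source_ids = list(range(24, 32))

# In the model run results sub_id always starts from 0.
run_to_source = dict(enumerate(selected_source_ids))

print(sorted(run_to_source))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(run_to_source[0])       # 24
print(run_to_source[7])       # 31
```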