External Names
This topic describes how to specify the names exposed to external software
for parameter and table dimensions, table expressions, and enumerations of classifications.
These names are used in csv files produced by dbcopy and in downloads from the OpenM++ UI.
- Default name
- Explicit name
- Identifying missing explicit names
- Heuristic name
- Name restrictions
- All generated names
Each dimension of a parameter or table, each expression of a table, and each enumerator of a classification has an associated external name. The OpenM++ compiler provides a default name for each external name. The default name can be overridden by an explicit name or a heuristic name.
The default name for a parameter dimension has the form ParameterName.DimN,
where N is {0,1,...,rank-1},
and rank is the number of parameter dimensions.
The default name for a table dimension has the form TableName.DimN,
where N is {0,1,...,rank-1},
and rank is the number of classificatory dimensions in the table,
i.e. the number of dimensions not counting the expression 'dimension'.
Because rank excludes the expression dimension of a table,
the expression dimension is skipped over in the numbering of a table dimension default name.
Note: The default name of a table dimension may differ from what's used to identify a dimension in //LABEL and /*NOTE documentation comments.
For compatibility with Modgen models,
documentation comments count the expression dimension of the table in the numbering scheme.
For more on labels and notes, see symbol labels.
The default name for a table expression has the form TableName.ExprN,
where N is {0,1,...,expressions-1},
and expressions is the number of expressions in the table.
The default name of an enumerator of a classification is the same as the enumerator name in model code.
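The numbering rules above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not the compiler's implementation; the function name `default_names` and its arguments are hypothetical.

```python
def default_names(symbol_name, rank, expressions=0):
    """Return the default external names of a parameter or table.

    rank counts classificatory dimensions only; for a table, the
    expression dimension is excluded from the DimN numbering.
    expressions is 0 for a parameter.
    """
    dims = [f"{symbol_name}.Dim{n}" for n in range(rank)]
    exprs = [f"{symbol_name}.Expr{n}" for n in range(expressions)]
    return dims + exprs


# A table with one classificatory dimension and two expressions.
for name in default_names("T02_TotalPopulationByYear", rank=1, expressions=2):
    print(name)
```

For a parameter, omit `expressions`: `default_names("UnionDurationBaseline", rank=2)` yields only the two `DimN` names.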
Here's an example of the default names for a 1-dimensional table:
table Person T02_TotalPopulationByYear  //EN Life table
{
    //EN Curtate age
    integer_age *		
    {
        unit,        //EN Population start of year
        duration()   //EN Average population in year
    }
};
This table has a single classificatory dimension with default name Dim0
and two expressions with default names Expr0 and Expr1.
These names identify individual cells in the table.
If a run with this table is exported in .csv format,
an extract of the file T02_TotalPopulationByYear.csv might look like this:
| expr_name | Dim0 | expr_value | 
|---|---|---|
| Expr0 | 0 | 5000 | 
| Expr0 | 1 | 5000 | 
| Expr0 | 2 | 5000 | 
| ... | ... | ... |
| Expr0 | 99 | 5000 | 
| Expr0 | 100 | 5000 | 
| Expr1 | 0 | 5000 | 
| Expr1 | 1 | 5000 | 
| Expr1 | 2 | 5000 | 
| ... | ... | ... |
Where Dim0 identifies the cell coordinates and Expr0 and Expr1 identify the expression.
The generated default names Dim0, Expr0, and Expr1 are positional, not descriptive.
That can make downstream use of exported results difficult and error-prone.
Here's an example of the default names of the enumerators of a classification, taken from the RiskPaths model.
The UNION_ORDER classification has the following declaration:
classification UNION_ORDER  //EN Union order
{
    UO_FIRST,    //EN First union 				
    UO_SECOND    //EN Second union 				
};
The default names of the two enumerators are the same as the codes in the declaration: UO_FIRST and UO_SECOND.
An explicit name can be assigned to dimensions and expressions
in model source code using the naming operator =>,
in which case it replaces the default name.
The following example replaces the default names Dim0, Expr0, and Expr1 with more descriptive names:
table Person T02_TotalPopulationByYear //EN Life table
{
    //EN Curtate age
    age => integer_age *
    {
        pop => unit,     //EN Population start of year
        py => duration() //EN Average population in year
    }
};
The table dimension is now named age and the measures are named pop and py.
The .csv file would now look something like:
| expr_name | age | expr_value | 
|---|---|---|
| pop | 0 | 5000 | 
| pop | 1 | 5000 | 
| pop | 2 | 5000 | 
| ... | ... | ... |
| pop | 99 | 5000 | 
| pop | 100 | 5000 | 
| py | 0 | 5000 | 
| py | 1 | 5000 | 
| py | 2 | 5000 | 
| ... | ... | ... |
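With explicit names in place, downstream code can refer to dimensions and expressions by meaning rather than by position. The sketch below assumes the exported file is comma-separated with a header row matching the columns shown above; the helper `values_by_expression` and the inline sample data are hypothetical.

```python
import csv
import io

# Short extract in the long format shown above (assumed comma-separated).
CSV_TEXT = """expr_name,age,expr_value
pop,0,5000
pop,1,5000
py,0,5000
py,1,5000
"""


def values_by_expression(csv_text, dim_name):
    """Group cell values by expression name, then by dimension value."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        cells = result.setdefault(row["expr_name"], {})
        cells[row[dim_name]] = float(row["expr_value"])
    return result


cells = values_by_expression(CSV_TEXT, dim_name="age")
print(cells["py"]["1"])  # average population in year at age 1
```

The same code run against a file using default names would need `dim_name="Dim0"` and the keys `Expr0` and `Expr1`, which say nothing about what the values mean.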
An explicit name can also be specified with a comment-based syntax using the default name. The following lines have the same effect as the preceding example:
//NAME T02_TotalPopulationByYear.Dim0 age
//NAME T02_TotalPopulationByYear.Expr0 pop
//NAME T02_TotalPopulationByYear.Expr1 py
Modgen-specific: The naming operator => is not recognized by Modgen and will produce a syntax error. For cross-compatible model code, use the //NAME syntax.
Explicit names can be specified for dimensions of a parameter. For example, the parameter declaration
double  UnionDurationBaseline[UNION_ORDER][UNION_DURATION];
can incorporate explicit names using the naming operator before a dimension:
double  UnionDurationBaseline
    Order    => [UNION_ORDER]
    Duration => [UNION_DURATION];
or by using the comment-based syntax:
//NAME UnionDurationBaseline.Dim0 Order
//NAME UnionDurationBaseline.Dim1 Duration
Explicit names can be specified for an enumerator of a classification. For the classification declaration
classification UNION_ORDER  //EN Union order
{
    UO_FIRST,    //EN First union 				
    UO_SECOND    //EN Second union 				
};
an explicit name can be specified using the naming operator before the enumerator:
classification UNION_ORDER  //EN Union order
{
    First  => UO_FIRST,		//EN First union 				
    Second => UO_SECOND		//EN Second union 				
};
or by using the comment-based syntax:
//NAME UO_FIRST First
//NAME UO_SECOND Second
If a default name is being used,
a downloaded parameter or table has column names like Dim0 or Dim2,
and table expressions like Expr2 or Expr5,
which are less than helpful for model users.
An issue for model developers is to identify missing explicit names like these,
and, once identified, to insert the missing explicit name in the model code.
The OpenM++ compiler supports a family of options to aid that process.
Each member of the family targets a specific kind of missing explicit name.
When an option is set to on,
the compiler will generate a warning for each missing explicit name of that kind.
The warning includes the model code file and line where the symbol was declared.
In an IDE like Visual Studio,
double-clicking on the warning in the log window
navigates immediately to that model source code location in the IDE editor.
By default these options are off.
Multiple options can be turned on at the same time.
The following example identifies all dimensions and expressions of published tables in RiskPaths
which lack an explicit name.
Inserting the following line in ompp_framework.ompp
options missing_name_warning_published_table = on;
causes the compiler to emit warnings like:
1>../code/Tables.mpp(40): warning : missing explicit name for dimension 0 of published table 'T02_TotalPopulationByYear'
1>../code/Tables.mpp(42): warning : missing explicit name for expression 0 of published table 'T02_TotalPopulationByYear'
1>../code/Tables.mpp(43): warning : missing explicit name for expression 1 of published table 'T02_TotalPopulationByYear'
Double-clicking one of these warnings navigates directly to the model code line of the dimension or expression.
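Outside an IDE, the same warnings can be collected from a saved build log. The following sketch assumes the log lines have exactly the form shown above for tables; the regular expression and the helper `missing_names` are hypothetical and would need adjusting if the compiler's message format differs.

```python
import re
from collections import defaultdict

# Pattern for the table warnings shown above. The leading "1>" build
# prefix is treated as optional.
WARNING_RE = re.compile(
    r"^(?:\d+>)?(?P<file>.+?)\((?P<line>\d+)\): warning : "
    r"missing explicit name for (?P<kind>dimension|expression) (?P<index>\d+) "
    r"of (?:published )?table '(?P<symbol>[^']+)'$"
)


def missing_names(log_lines):
    """Map each table name to a list of (kind, index, file, line) tuples."""
    found = defaultdict(list)
    for text in log_lines:
        m = WARNING_RE.match(text.strip())
        if m:
            found[m["symbol"]].append(
                (m["kind"], int(m["index"]), m["file"], int(m["line"]))
            )
    return dict(found)


LOG = [
    "1>../code/Tables.mpp(40): warning : missing explicit name for dimension 0 of published table 'T02_TotalPopulationByYear'",
    "1>../code/Tables.mpp(42): warning : missing explicit name for expression 0 of published table 'T02_TotalPopulationByYear'",
    "1>../code/Tables.mpp(43): warning : missing explicit name for expression 1 of published table 'T02_TotalPopulationByYear'",
]

for symbol, items in missing_names(LOG).items():
    print(symbol, items)
```

Grouping by symbol gives a per-table checklist of the dimensions and expressions that still need an explicit name.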
The following table lists the available options to emit warnings for missing explicit names, grouped by category. The Scope column shows what produces a warning for the given option.
| Option | Scope | 
|---|---|
| All | |
| missing_name_warning_classification | classification level (enumerator) | 
| missing_name_warning_parameter | dimension | 
| missing_name_warning_table | dimension, expression | 
| Published only | |
| missing_name_warning_published_classification | as above, but only for published symbols | 
| missing_name_warning_published_parameter | as above, but only for published symbols | 
| missing_name_warning_published_table | as above, but only for published symbols | 
A heuristic name is a name which replaces a default name with a name generated by the OpenM++ compiler.
A heuristic name is generated based on contextual information about the dimension or expression,
if no explicit name was provided in model code.
Explicit names are generally preferable to heuristic names.
Heuristic names can provide an immediate improvement in the usability
of downloaded parameters and tables,
replacing default names like Dim2 or Expr5 with something better.
Heuristic names are not generated by default. To generate heuristic names, include the following statement in model source code:
options use_heuristic_short_names = on;
The table T02_TotalPopulationByYear, declared with no explicit names, would then produce the following exported csv:
| expr_name | Curtate_age | expr_value | 
|---|---|---|
| Population_start_of_year | 0 | 5000 | 
| Population_start_of_year | 1 | 5000 | 
| Population_start_of_year | 2 | 5000 | 
| ... | ... | ... |
| Population_start_of_year | 99 | 5000 | 
| Population_start_of_year | 100 | 5000 | 
| Average_population_in_year | 0 | 5000 | 
| Average_population_in_year | 1 | 5000 | 
| Average_population_in_year | 2 | 5000 | 
| ... | ... | ... |
In this example, the OpenM++ compiler generated heuristic names using //EN labels found in the model source code (EN is the default language of this model).
However, the OpenM++ compiler may, particularly if a label exceeds the name length limit, create a heuristic name based on other information, such as the name of the classification underlying the dimension of a table.  To respect name length limits, a heuristic name may be based on a label with an interior portion snipped out and replaced by _X_, or prefixed by X_ so that the name starts with an alphabetic character.
If a heuristic name clashes with the name of a previous dimension or measure, a disambiguating suffix will be appended to the heuristic name.
For example, the parameter k_year declared as
parameters {
    YEAR k_year[REGION][REGION];
};
has a repeated dimension REGION.
The heuristic name for the second repeated dimension of k_year will be disambiguated by appending Dim1:
// Parameter k_year: k_year
//NAME k_year.Dim0 Region
//NAME k_year.Dim1 RegionDim1
The maximum length of heuristic names can be controlled by the following option:
options short_name_max_length = 32;
Heuristic name generation for enumerators of classifications can be disabled by the following option:
options enable_heuristic_names_for_enumerators = off;
By default, this option is on.
If this option is off, the name of a classification enumerator will always be the same as the enumerator model code name.
This option has effect only if the option use_heuristic_short_names is on.
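To make the idea of a label-based heuristic name concrete, here is a rough Python approximation. It is not the compiler's algorithm: it covers only the simple cases described above (label to name, the X_ prefix, truncation, and a disambiguating suffix) and does not attempt the _X_ interior snipping or the fallback to a classification name. The function `heuristic_name` and its parameters are hypothetical.

```python
import re


def heuristic_name(label, used=frozenset(), suffix="", max_length=32):
    """Approximate a heuristic name derived from a label (illustration only)."""
    # Replace each run of characters outside A-Z, a-z, 0-9 with one underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", label).strip("_")
    # Prefix X_ so that the name starts with an alphabetic character.
    if not name or not name[0].isalpha():
        name = "X_" + name
    # Respect the maximum name length by simple truncation.
    name = name[:max_length]
    # On a clash with an earlier name, append the disambiguating suffix.
    if name in used:
        name = name[: max_length - len(suffix)] + suffix
    return name


print(heuristic_name("Curtate age"))
print(heuristic_name("Region", used={"Region"}, suffix="Dim1"))
```

The first call mirrors the Curtate_age column in the csv above; the second mirrors the RegionDim1 name of the repeated dimension of k_year, assuming the label of REGION is "Region".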
Dimension and measure names in exported files facilitate direct use in downstream analysis.
For example, a .csv could be opened in Excel and used as a pivot table,
or imported into R or SAS, with meaningful column names.
A wide variety of applications can be used to do downstream analysis,
each with its own name restrictions.
OpenM++ imposes the following restrictions on explicit names to reduce potential problems in downstream analysis:
A name
- has a maximum length in characters given by the option short_name_max_length (default 32)
- has characters in uppercase A-Z, lowercase a-z, digits 0-9, and the _ character
- is unique within the dimensions or expressions of the parameter or table
If a name does not meet these restrictions, the OpenM++ compiler will emit a warning and 'mangle' the name to meet the restrictions, e.g. by replacing forbidden characters by _, by truncating the name, or by appending a trailing numeric suffix to disambiguate identical names.
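The three restrictions can be checked before compiling, for example on a list of names planned for one table. The sketch below is a hypothetical pre-check written in Python, not part of OpenM++; `check_names` and its problem labels are illustrative.

```python
import re

# Permitted characters: A-Z, a-z, 0-9 and underscore.
VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_names(names, max_length=32):
    """Return (name, problem) pairs for names that break a restriction.

    names holds the dimension names, or the expression names, of a
    single parameter or table, so uniqueness is checked within that list.
    """
    problems = []
    seen = set()
    for name in names:
        if len(name) > max_length:
            problems.append((name, "too long"))
        if not VALID_NAME.fullmatch(name):
            problems.append((name, "forbidden character"))
        if name in seen:
            problems.append((name, "duplicate"))
        seen.add(name)
    return problems


print(check_names(["age", "pop", "py"]))
print(check_names(["age group", "Region", "Region"]))
```

An empty result means the names should pass unchanged; any reported name is one the compiler would be expected to warn about and mangle.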
If default values for a parameter are provided using a .csv file,
any name used in the file must match the corresponding external name in the model.
The same applies to uploads of parameter data in .csv files,
or to parameters supplied programmatically using an external script.
Any name generated or modified by the OpenM++ compiler is written to a file named
GeneratedNames.ompp in the compiler output directory,
which in Windows is MODEL/ompp/src/GeneratedNames.ompp.
GeneratedNames.ompp does not contain explicit names given in model source code using => or //NAME,
unless for some reason the OpenM++ compiler needed to modify them.
The content of GeneratedNames.ompp uses //NAME statements to make it suitable as a starting point to specify explicit names in model source code, for example in a separate source code module named code/ExplicitNames.ompp, or perhaps immediately following the declaration of a classification, parameter or table.
Here is an extract of src/GeneratedNames.ompp from the RiskPaths model:
// Parameter AgeBaselineForm1: Age baseline for first union formation
//NAME AgeBaselineForm1.Dim0 X_2_5_year_age_intervals
// Parameter AgeBaselinePreg1: Age baseline for first pregnancy
//NAME AgeBaselinePreg1.Dim0 X_2_5_year_age_intervals
// Parameter ProbMort: Death probabilities
//NAME ProbMort.Dim0 Simulated_age_range
// Table T01_LifeExpectancy: Life Expectancy
//NAME T01_LifeExpectancy.Expr0 Total_simulated_cases
//NAME T01_LifeExpectancy.Expr1 Total_duration
//NAME T01_LifeExpectancy.Expr2 Life_expectancy
// Table T02_TotalPopulationByYear: Life table
//NAME T02_TotalPopulationByYear.Dim0 Curtate_age
//NAME T02_TotalPopulationByYear.Expr0 Population_start_of_year
//NAME T02_TotalPopulationByYear.Expr1 Average_population_in_year
// Table T04_FertilityRatesByAgeGroup: Fertility rates by age group
//NAME T04_FertilityRatesByAgeGroup.Dim0 Age_interval
//NAME T04_FertilityRatesByAgeGroup.Dim1 Union_Status
//NAME T04_FertilityRatesByAgeGroup.Expr0 Fertility
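Because the file consists of //NAME statements, it is easy to read into a lookup table, for instance to review generated names before copying them into model code. The sketch below assumes each statement has the form `//NAME <default-name> <name>` as in the extract above; `parse_name_statements` is a hypothetical helper.

```python
import re

# One //NAME statement: default name, then the generated or explicit name.
NAME_RE = re.compile(r"^//NAME\s+(\S+)\s+(\S+)\s*$")


def parse_name_statements(text):
    """Map each default name to the name given in its //NAME statement."""
    names = {}
    for line in text.splitlines():
        m = NAME_RE.match(line.strip())
        if m:
            names[m.group(1)] = m.group(2)
    return names


EXTRACT = """\
// Table T02_TotalPopulationByYear: Life table
//NAME T02_TotalPopulationByYear.Dim0 Curtate_age
//NAME T02_TotalPopulationByYear.Expr0 Population_start_of_year
//NAME T02_TotalPopulationByYear.Expr1 Average_population_in_year
"""

for default, name in parse_name_statements(EXTRACT).items():
    print(f"{default} -> {name}")
```

Ordinary comment lines, such as the `// Table ...` header, are skipped because they do not start with //NAME.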