Tutorial II: Parameter Expansions - KIT-CMS/Artus GitHub Wiki
This tutorial is based on the ROOT files generated in the first step.
Many HarryPlotter parameters take lists of values, and the number of elements is not restricted. Two examples: most parameters of the InputRoot module take one element per input object to be read in. Similarly, most plotting settings in the plot modules, e.g. PlotRoot, take one element per object to be plotted. The --help option marks these list-type parameters by listing multiple possible values. The long forms of the parameter names should also end in a plural "s".
HarryPlotter has to ensure that all lists that belong together have the same number of elements. The lists that belong to such an expansion group are usually all list-type parameters of a certain parameter group, e.g. the Input options.
harry.py --help | grep "Input options" -A 75
If the specified parameter values do not match in list length, HarryPlotter expands the shorter lists to match the length of the longest list. These expansions are performed by the function Processor.prepare_list_args, which provides extensive debug output for all steps and prints warnings if something possibly unintended occurs. All expansions can therefore be found by searching for calls of this function.
Example: expansion in InputRoot
In the InputRoot module, all list-type parameters regarding the inputs are expanded:
nicks
directories
files
folders
friend_trees
x_expressions
y_expressions
z_expressions
x_bins
y_bins
z_bins
weights
tree_draw_options
scale_factors
Only one item - no expansion
In the simplest example
harry.py --log-level debug -i gaussians3.root -f gaussians1 -x var0
the following debug information is printed out.
Argument list expansion: InputRoot options
Item 0:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians3.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
As the maximum list length is 1, there is only one item to iterate over (Item 0). The specified values, files -> ['gaussians3.root'], folders -> ['gaussians1'] and x_expressions -> var0, are recognised as expected. The other parameters remain empty or, if a default is available, keep their default value.
Iteration over only one parameter
In most cases, multiple graphs that differ in only one aspect have to be shown in a single plot. In the next example, three different branches are plotted from the same tree in the same file.
harry.py --log-level debug -i gaussians3.root -f gaussians1 -x var0 var1 var2
Then the debug output lists three iterations.
Argument list expansion: InputRoot options
Item 0:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians3.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
Item 1:
nicks -> None
x_expressions -> var1
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians3.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
Item 2:
nicks -> None
x_expressions -> var2
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians3.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
The only difference between the three iterations is the value of x_expressions. All other list elements have been duplicated three times. An iteration over different trees works similarly.
harry.py --log-level debug -i gaussians3.root -f gaussians1 gaussians2 gaussians3 -x var0
Simultaneous iteration over multiple parameters
Now the var0 values from one tree in each file have to be plotted. A loop over the files alone is not sufficient, since the trees in the files are named differently.
for file in gaussians*.root; do echo "$file contains:"; get_root_file_content.py $file; echo; done
gives
gaussians.root contains:
gaussians (TTree)
gaussians1000.root contains:
gaussians (TTree)
gaussians3.root contains:
gaussians1 (TTree)
gaussians2 (TTree)
gaussians3 (TTree)
The plotting command now is
higgsplot.py --log-level debug -i gaussians.root gaussians3.root gaussians1000.root -f gaussians gaussians1 gaussians -x var0
Then the debug output lists again three iterations as the maximum number of elements of all parameters is 3.
Argument list expansion: InputRoot options
Item 0:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians.root']
directories -> None
folders -> ['gaussians']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
Item 1:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians3.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
Item 2:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians1000.root']
directories -> None
folders -> ['gaussians']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
This time, both the files and folders parameters are varied. The parameter values are mapped to the iterations in exactly the order in which they are specified in the program arguments.
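The position-wise pairing can be pictured with a few lines of Python. This is only an illustrative sketch: the lists are copied from the command above, the variable names are chosen for this example, and the code is not taken from HarryPlotter.

```python
# Illustration only: these lists mirror the command-line arguments of the
# example above. This is not HarryPlotter code.
files = ["gaussians.root", "gaussians3.root", "gaussians1000.root"]
folders = ["gaussians", "gaussians1", "gaussians"]
x_expressions = ["var0"] * len(files)  # the single value, repeated for every item

# zip() pairs the n-th element of each list, like "Item n" in the debug output.
items = list(zip(files, folders, x_expressions))

for index, (file_name, folder, x_expression) in enumerate(items):
    print("Item {}: files -> {}, folders -> {}, x_expressions -> {}".format(
        index, file_name, folder, x_expression))
```

Reordering the files on the command line without reordering the folders in the same way would change which tree is read from which file.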
Warnings of possibly unintentional behaviour
The same plot is produced by
higgsplot.py -i gaussians.root gaussians3.root gaussians1000.root -f gaussians gaussians1 -x var0
but the following warning is given
WARNING: Parameters 'nicks', 'x_expressions', 'y_expressions', 'z_expressions', 'x_bins', 'y_bins', 'z_bins', 'scale_factors', 'files', 'directories' require parameter list length of 3. Parameters 'folders'(2) will be replicated to match required length.
HarryPlotter repeats the elements of lists that are shorter than the longest list. If a shorter list contains only one element, the behaviour is trivial: this element is duplicated until the number of elements matches the length of the longest list (see section Iteration over only one parameter). If the shorter list contains more than one element, the behaviour is still well defined: the full list is repeated until the number of elements matches the length of the longest list. Example: the shorter list contains two elements, a and b, and the longest list contains five elements. The shorter list is then expanded to [a, b, a, b, a]. These elements are matched with those of the longest list in the same order. In most cases, when a parameter is configured with more than one element but fewer than the longest list, the resulting behaviour is not intended by the user, and HarryPlotter assumes a misconfiguration. Therefore the warning is printed.
In the example above, the warning says that the parameters files and others have a length of 3 (or can be trivially expanded), while the parameter folders has only two values and therefore gets expanded.
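The repetition rule and the condition for the warning can be sketched in Python as follows. This is a minimal illustration of the behaviour described above, not the actual code of Processor.prepare_list_args. The function name expand_to_longest and the example parameter names are hypothetical, and all lists are assumed to be non-empty.

```python
from itertools import cycle, islice


def expand_to_longest(list_args):
    """Repeat every list cyclically until it is as long as the longest one.

    Returns the expanded lists and the names of the lists that had more than
    one but fewer than the maximum number of elements (the suspicious case).
    """
    max_length = max(len(values) for values in list_args.values())
    expanded = {}
    suspicious = []
    for name, values in list_args.items():
        if 1 < len(values) < max_length:
            suspicious.append(name)
        # cycle() restarts the list at its end, islice() cuts at max_length.
        expanded[name] = list(islice(cycle(values), max_length))
    return expanded, suspicious


expanded, suspicious = expand_to_longest({
    "long": [1, 2, 3, 4, 5],
    "short": ["a", "b"],
    "single": ["x"],
})
print(expanded["short"])  # ['a', 'b', 'a', 'b', 'a']
print(suspicious)         # ['short']
```

Lists with exactly one element are expanded silently. Only lists such as "short" here fall into the case for which a warning is appropriate.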
The following example shows a misconfiguration
higgsplot.py -i gaussians3.root gaussians.root gaussians1000.root -f gaussians1 gaussians -x var0
Again, the same warning is given. The debug option --log-level debug reveals
Item 2:
nicks -> None
x_expressions -> var0
y_expressions -> None
z_expressions -> None
x_bins -> None
y_bins -> None
z_bins -> None
scale_factors -> 1.0
files -> ['gaussians1000.root']
directories -> None
folders -> ['gaussians1']
weights -> 1.0
friend_trees -> None
tree_draw_options ->
that in the last item the first configured folder (gaussians1) is again matched to the iteration with files -> ['gaussians1000.root']. As this file does not contain a tree named gaussians1, the configuration is wrong and an error is thrown later.
ERROR: Could not find ROOT object "gaussians1" in file "gaussians1000.root"! (roottools.py: line 71)
CRITICAL: Error getting ROOT object from file. Exiting. (inputroot.py: line 153)
In other cases, the run of HarryPlotter is still well defined and does not fail. However, a possible misconfiguration might be more difficult to notice without this warning.
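A file/folder pairing like the misconfigured one above can also be checked by hand before plotting. The following sketch assumes the tree names listed by get_root_file_content.py earlier in this tutorial and assumes that files is the longest list. The helper find_missing_trees is hypothetical and not part of HarryPlotter.

```python
from itertools import cycle, islice

# Tree names per file, copied from the get_root_file_content.py listing above.
file_contents = {
    "gaussians.root": ["gaussians"],
    "gaussians1000.root": ["gaussians"],
    "gaussians3.root": ["gaussians1", "gaussians2", "gaussians3"],
}


def find_missing_trees(files, folders):
    """Return (file, folder) pairs whose folder is not a tree in that file."""
    # Repeat the folders cyclically up to the number of files (assumed longest).
    expanded_folders = islice(cycle(folders), len(files))
    return [
        (file_name, folder)
        for file_name, folder in zip(files, expanded_folders)
        if folder not in file_contents[file_name]
    ]


good = find_missing_trees(
    ["gaussians.root", "gaussians3.root", "gaussians1000.root"],
    ["gaussians", "gaussians1"],
)
bad = find_missing_trees(
    ["gaussians3.root", "gaussians.root", "gaussians1000.root"],
    ["gaussians1", "gaussians"],
)
print(good)  # []
print(bad)   # [('gaussians1000.root', 'gaussians1')]
```

The first call corresponds to the command that only triggers the warning, the second to the command that ends with the error.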