Running over TTrees - TC01/Treemaker GitHub Wiki

Running over TTrees

Treemaker originated as a piece of software for turning EDM ntuples into ROOT TTrees. During a 13 TeV analysis, however, we were using someone else's trees and wanted to write a treemaker that processed those trees and produced different, more specialized trees for our analysis. The original Treemaker codebase was not capable of doing this.

Treemaker v1.2 adds the concept of multiple "input types" to solve this problem.

Note that by default, the input type is set to "Ntuple".

The [input] section

Configuration files have gained a new (optional) section for specifying the input type. The default section looks like this:

[input]
input_type = Ntuple
source_tree_name =

At the moment, there are only two valid input types: "Ntuple" and "Tree". If input_type is set to "Ntuple", the source_tree_name parameter is ignored. If input_type is set to "Tree", however, source_tree_name is expected to be the name of a TTree in the ROOT files you ask Treemaker to run over.

Note that if the tree is located in a folder inside the ROOT file, the name must include the full path. For instance, if your tree is called "X" and is in a folder called "Y", the tree name to use is "Y/X".
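
As a rough illustration of how the two settings interact, the sketch below parses an [input] section with Python's standard configparser module and applies the rules described above. The helper name read_input_section and its error handling are assumptions made for this example; this is not Treemaker's actual parsing code.

```python
import configparser

VALID_INPUT_TYPES = ("Ntuple", "Tree")


def read_input_section(config_text):
    """Return (input_type, source_tree_name) from a config string.

    Hypothetical helper: it mirrors the rules described above, not
    Treemaker's own implementation.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)

    # The [input] section is optional; fall back to the documented default.
    if not parser.has_section("input"):
        return "Ntuple", None

    input_type = parser.get("input", "input_type", fallback="Ntuple")
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError("unknown input_type: %r" % input_type)

    if input_type == "Ntuple":
        # source_tree_name is ignored in Ntuple mode.
        return input_type, None

    source_tree = parser.get("input", "source_tree_name", fallback="").strip()
    if not source_tree:
        raise ValueError("input_type = Tree requires source_tree_name")
    return input_type, source_tree


# A tree called "X" inside a folder called "Y" is addressed as "Y/X".
example = """
[input]
input_type = Tree
source_tree_name = Y/X
"""
print(read_input_section(example))
```

Returning None for the tree name in Ntuple mode makes it explicit that the parameter plays no role there, whatever the file happens to contain.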

Running treemaker-config

The treemaker-config tool can automatically create this section with the right parameters. If you want to modify them, however, use the following options:

treemaker-config -i INPUT_TYPE -s SOURCE_TREE $(other options)

The parameters should be self-explanatory, but INPUT_TYPE is either "Tree" or "Ntuple" and SOURCE_TREE is the name of the source tree to process as described above.
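
For concreteness, here is a small Python sketch that assembles such a command line for a tree named "X" inside a folder "Y". The helper name build_config_command is hypothetical, and only the -i and -s options come from the text above; any other options are passed through untouched.

```python
import shlex


def build_config_command(input_type, source_tree=None, other_options=()):
    """Build the treemaker-config argument list (hypothetical helper)."""
    if input_type not in ("Ntuple", "Tree"):
        raise ValueError("INPUT_TYPE must be 'Ntuple' or 'Tree'")
    command = ["treemaker-config", "-i", input_type]
    if source_tree:
        # SOURCE_TREE includes the folder path, e.g. "Y/X".
        command += ["-s", source_tree]
    command += list(other_options)
    return command


command = build_config_command("Tree", "Y/X")
# Print the command as it would be typed in a shell.
print(" ".join(shlex.quote(arg) for arg in command))
```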

Writing Tree Plugins

Plugins intended to run on Trees instead of Ntuples differ in two important aspects:

They declare input_type = "Tree" instead of input_type = "Ntuple" (the input_type variable defaults to "Ntuple" if not present), to tell Treemaker that these plugins should only be allowed to run when Treemaker is running in tree mode.
The structure of the labels dictionary (known as "leaves" in tree mode) is different.
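
The effect of the input_type declaration can be pictured with the sketch below. It uses types.SimpleNamespace objects as stand-ins for plugin modules, and the helper names plugin_input_type and can_run are invented for illustration. It also assumes the matching rule applies in both directions; Treemaker's real plugin loader may perform the check differently.

```python
from types import SimpleNamespace


def plugin_input_type(plugin):
    # A plugin that does not define input_type is treated as an Ntuple plugin.
    return getattr(plugin, "input_type", "Ntuple")


def can_run(plugin, running_mode):
    # Assumption: a plugin runs only when its declared type matches the mode.
    return plugin_input_type(plugin) == running_mode


tree_plugin = SimpleNamespace(input_type="Tree")  # stand-in for a tree plugin
legacy_plugin = SimpleNamespace()                 # no input_type declared

print(can_run(tree_plugin, "Tree"))
print(can_run(legacy_plugin, "Tree"))
```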

Unlike an ntuple's labels dictionary, which is a two-dimensional structure containing a complete map of the ntuple's entries, the leaves dictionary is a one-dimensional structure. Each key in leaves is the name of a branch of the input tree. The value associated with a key is that branch's entry in the tree, which is usually (if not always) an array. Take a look at the following example code:

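	# Requires "import math" at the top of the plugin.
	# 'jetAK8_size' is a scalar leaf: the number of AK8 jets in the event.
	numAK8s = leaves['jetAK8_size']
	tagJet = -1
	# Consider at most the first four AK8 jets; the last one passing the cuts is tagged.
	for i in range(min(numAK8s, 4)):
		if leaves['jetAK8_prunedMass'][i] > 50 and leaves['jetAK8_Pt'][i] > 200 and math.fabs(leaves['jetAK8_Eta'][i]) < 2.1:
			tagJet = i
	if tagJet != -1:
		# Copy the tagged jet's Pt into the output tree variable.
		variables['tagJetPt'][0] = leaves['jetAK8_Pt'][tagJet]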

This block of code (taken from the "zprime_tagjet.py" example plugin) first reads a scalar (non-array) variable from the source tree: the number of AK8 jets. A for loop over at most the first four jets then determines which jet index should be tagged, and the Pt of that jet is copied from the source tree to the new tree.
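
To experiment with this logic outside Treemaker, the selection can be wrapped in a function and fed mock data, as in the sketch below. The plain dictionaries and lists are stand-ins for the objects Treemaker passes to a plugin, the jet values are made up, and the function name select_tag_jet is hypothetical. This version checks the eta of each candidate jet.

```python
import math


def select_tag_jet(leaves, max_jets=4):
    """Return the index of the tagged AK8 jet, or -1 if none passes the cuts."""
    num_jets = leaves['jetAK8_size']  # scalar leaf, not an array
    tag_jet = -1
    for i in range(min(num_jets, max_jets)):
        if (leaves['jetAK8_prunedMass'][i] > 50
                and leaves['jetAK8_Pt'][i] > 200
                and math.fabs(leaves['jetAK8_Eta'][i]) < 2.1):
            tag_jet = i  # the last passing jet wins
    return tag_jet


# Mock stand-ins for the dictionaries a plugin receives (invented values).
leaves = {
    'jetAK8_size': 3,
    'jetAK8_prunedMass': [80.0, 30.0, 95.0],
    'jetAK8_Pt': [450.0, 300.0, 250.0],
    'jetAK8_Eta': [0.4, -1.2, 2.5],
}
variables = {'tagJetPt': [0.0]}

tag_jet = select_tag_jet(leaves)
if tag_jet != -1:
    variables['tagJetPt'][0] = leaves['jetAK8_Pt'][tag_jet]
print(tag_jet, variables['tagJetPt'][0])
```

With these mock values only the first jet passes all three cuts: the second fails the pruned mass requirement and the third fails the eta requirement.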