bpkUtility - ntuhep/bprimeKit GitHub Wiki

bpkUtility

This package mainly has two purposes:

  1. Since the code of the bprimeKit object classes is huge, Python scripts are used to generate the header files, which reduces the risk of human error.
  2. Rewriting the skeleton code for checking the contents of MiniAOD files every time is tedious, so skeleton code is provided in the /bin directory.

Installation

The libraries in this package are stand-alone, depending only on libraries that already exist in CMSSW. In an existing CMSSW working environment, run the commands

cd CMSSW_X_Y_Z/src/
git clone https://github.com/ntuhep/bpkUtility.git bpkFrameWork/bpkUtility

scram b

Code generation scripts

The main control flow of the code generation scripts is found in the scripts directory. The scripts should be copied to the CMSSW/bin directory, so you should be able to call them directly like:

bpk_UpdateHLTList.py --help

Note that you will have to recompile with scram b every time you alter the code in the scripts directory. Alternatively, you can simply execute the code like:

scripts/bpk_UpdateHLTList.py --help

bpk_MakeFormat.py

The script reads its input from .csv files located in bpkUtility/data/ and creates a master format.h file containing the required InfoBranches classes, including the Register() and RegisterTree() member functions for loading a tree.

The .csv file should be in a four-column format with commas as the column delimiters. The first row is important for creating the Python type used to store the table, and should not be arbitrarily changed. Using some rows of EvtInfoBranches.csv as an example:

Datatype    Varname              Size                  Comment
Int_t       RunNo
ULong64_t   EvtNo
Int_t       nTrgBook
Char_t      TrgBook              N_TRIGGER_BOOKINGS
Int_t       HLTPrescaleFactor    512

The table itself is stored as a list of dictionaries with the column headers as keys. So the table above would be equivalent to:

table = [
  {"Datatype":"Int_t","Varname":"RunNo","Size":None,"Comment":None},
  #...
  {"Datatype":"Int_t","Varname":"HLTPrescaleFactor","Size":"512","Comment":None},
]
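As a minimal sketch of how such a table could be loaded, the snippet below uses Python's standard csv module and maps empty cells to None. The helper name load_table is hypothetical and the package's actual loader may work differently.

```python
import csv
import io


def load_table(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row.

    Hypothetical helper for illustration only; empty cells become None.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {key: (value if value else None) for key, value in row.items()}
        for row in reader
    ]


# A few rows in the four-column layout described above.
sample = """Datatype,Varname,Size,Comment
Int_t,RunNo,,
Char_t,TrgBook,N_TRIGGER_BOOKINGS,
Int_t,HLTPrescaleFactor,512,
"""

table = load_table(sample)
```

Because the keys are taken from the first row, renaming a column header changes the dictionary keys that the rest of the generator would look up, which is why the header row should stay fixed.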

This table is then used to generate the code of format.h, which is split up into three parts. The generators for these parts are mainly found in the bpkUtility/python/variableListing.py file.

Variable declaration

This is a simple case of listing all the variables, taking care of the different formats for array and non-array declarations. For details, see MakeVariablePart().
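The array versus non-array distinction can be sketched as follows. This is an illustration only, not the body of MakeVariablePart(); the function name make_declaration and the exact output formatting are assumptions.

```python
def make_declaration(row):
    """Return one C++ member declaration for a table row (illustrative only)."""
    if row["Size"]:
        # Array variable: the Size column becomes the array extent.
        return "{0} {1}[{2}];".format(row["Datatype"], row["Varname"], row["Size"])
    # Scalar variable: no extent.
    return "{0} {1};".format(row["Datatype"], row["Varname"])


rows = [
    {"Datatype": "Int_t", "Varname": "RunNo", "Size": None, "Comment": None},
    {"Datatype": "Char_t", "Varname": "TrgBook", "Size": "N_TRIGGER_BOOKINGS", "Comment": None},
]

declarations = [make_declaration(row) for row in rows]
```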

Register() member creation

The Register() member function links a string associated with the variable in a TTree object to a concrete pointer in the C++ class, for reading via the TTree::SetBranchAddress( const char*, void* ) method.

The code for the Register() member function is generated by the MakeRegisterFunction() function in the variableListing.py file. This function is also fairly straightforward, with different cases written for array and non-array variables. There is also a special case for the vector<> objects used for the sub-jet information, where pointers and objects are declared separately.
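A rough sketch of generating one SetBranchAddress call per table row is shown below. It is not the actual MakeRegisterFunction() code: the function name make_register_line, the tree variable name root, and the "prefix.Varname" branch-name convention are all assumptions made for illustration, and the vector<> special case is left out.

```python
def make_register_line(row, tree="root", prefix="EvtInfo"):
    """Return one C++ SetBranchAddress statement for a table row.

    The tree variable name and the branch-name convention are assumed here.
    """
    branch = "{0}.{1}".format(prefix, row["Varname"])
    if row["Size"]:
        # Array: pass the address of the first element.
        target = "&{0}[0]".format(row["Varname"])
    else:
        # Scalar: pass the address of the variable itself.
        target = "&{0}".format(row["Varname"])
    return '{0}->SetBranchAddress( "{1}", {2} );'.format(tree, branch, target)


rows = [
    {"Datatype": "Int_t", "Varname": "RunNo", "Size": None, "Comment": None},
    {"Datatype": "Int_t", "Varname": "HLTPrescaleFactor", "Size": "512", "Comment": None},
]

register_lines = [make_register_line(row) for row in rows]
```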

RegisterTree() member creation

The RegisterTree() member function is used to let a concrete pointer in the C++ code be tracked by the TTree object for writing via the TTree::Branch() method.

The Python code for creating RegisterTree() is implemented in the MakeRegisterTreeFunction() function in the variableListing.py file. This function has various sub-functions to help with the TTree input.

  • The branch name is generated in the same manner as in the Register() member function.
  • Special atoms can be added to the branch name for type handling. See the MakeTypeToken() function and the TTree documentation for more information.
  • Special atoms can be added to aid with array size optimization. The array size can actually be stored as a variable in the TTree, which is handy. Different branches require different array types; see the MakeLeaf() function for more information.
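The type and array atoms mentioned above correspond to ROOT's leaf-list syntax, for example "RunNo/I" for an Int_t or "HLTPrescaleFactor[512]/I" for a fixed-size array. The sketch below builds such strings from a table row. The type codes are the ones documented for TTree::Branch; the function names make_leaf and make_branch_line, the tree variable name root, and the branch-name prefix are assumptions, and the real MakeTypeToken() and MakeLeaf() may behave differently.

```python
# ROOT leaf-list type codes, as listed in the TTree documentation.
TYPE_TOKENS = {
    "Char_t": "B",
    "Int_t": "I",
    "UInt_t": "i",
    "Float_t": "F",
    "Double_t": "D",
    "Long64_t": "L",
    "ULong64_t": "l",
    "Bool_t": "O",
}


def make_leaf(row):
    """Return the leaf-list string, e.g. 'RunNo/I' or 'X[512]/I'."""
    name = row["Varname"]
    if row["Size"]:
        name = "{0}[{1}]".format(name, row["Size"])
    return "{0}/{1}".format(name, TYPE_TOKENS[row["Datatype"]])


def make_branch_line(row, tree="root", prefix="EvtInfo"):
    """Return one C++ Branch statement (naming convention assumed)."""
    branch = "{0}.{1}".format(prefix, row["Varname"])
    if row["Size"]:
        target = "&{0}[0]".format(row["Varname"])
    else:
        target = "&{0}".format(row["Varname"])
    return '{0}->Branch( "{1}", {2}, "{3}" );'.format(
        tree, branch, target, make_leaf(row))


rows = [
    {"Datatype": "Int_t", "Varname": "RunNo", "Size": None, "Comment": None},
    {"Datatype": "ULong64_t", "Varname": "EvtNo", "Size": None, "Comment": None},
    {"Datatype": "Int_t", "Varname": "HLTPrescaleFactor", "Size": "512", "Comment": None},
]

leaves = [make_leaf(row) for row in rows]
branch_lines = [make_branch_line(row) for row in rows]
```

ROOT's leaf-list syntax also allows the bracketed size to be the name of an integer counter leaf instead of a number, in which case only that many entries are written per event. This is presumably the array size optimization referred to above.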

bpk_UpdateHLTList.py

The status of HLTs in the bprimeKit is handled by the array EvtInfoBranches::TrgBook: for each HLT, a single 8-bit entry stores its status. A converter is therefore needed from the HLT name stored in the MiniAOD format (a string) to an array index. This converter is basically the purpose of the TriggerBooking.h file, which is just a list of HLT strings and numbers. Since the HLT menu is updated continuously during data collection, the HLT list also needs to be updated, without changing the order of the existing HLTs in the list.

FWLite C++ code in src/GetHLTNames returns all the HLT names in an event as a vector of strings. This code is then exposed to Python using boost::python in the plugins directory. Python is used to compare the existing HLT name list stored in data/HLTList.asc with the returned list, to create a new TriggerBooking.h file, and to update the data/HLTList.asc file.
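The order-preserving update can be sketched as below: names already in the list keep their index, and previously unseen names are appended at the end. The function name merge_hlt_names and the trigger names are made up for illustration, and the sketch does not reproduce the actual layout of HLTList.asc or TriggerBooking.h.

```python
def merge_hlt_names(existing, found):
    """Append unseen HLT names to the existing list without reordering it."""
    merged = list(existing)
    known = set(existing)
    for name in found:
        if name not in known:
            merged.append(name)
            known.add(name)
    return merged


# Made-up trigger names, for illustration only.
existing = ["HLT_PathA_v1", "HLT_PathB_v1"]
found = ["HLT_PathB_v1", "HLT_PathC_v1", "HLT_PathA_v1"]

merged = merge_hlt_names(existing, found)

# The position in the merged list is the index into TrgBook.
index_of = {name: index for index, name in enumerate(merged)}
```

Appending rather than sorting keeps every existing index stable, so files written with an older list can still be read with the newer one.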