VBF Variables - HiggsToolsRepo/VBFs_MC GitHub Wiki

Tagging VBF topology

  • Centrality (the name has to be checked here)
  • Leading and sub-leading jet variables: transverse momentum, pseudorapidity, rapidity and phi of each jet; from these we can define delta phi and delta eta
  • delta R, which can be built as a combination of these last two variables
  • Lepton variables: eta, phi, pt, charge
  • The related combinations: delta eta, delta phi
  • The Met info: Met, phi(Met)

More complex variables:

  • Create categories by number of jets (N jets); for that we need to have this variable in the tree
  • The minimum distance (delta R) between a jet (leading or sub-leading) and one of the selected leptons. Here I was thinking of something like:
// smallest delta R over the four jet-lepton pairings
dijet_minDRJetLep_ = std::min( std::min(deltaR( jet_1, lep_1 ),
                                        deltaR( jet_2, lep_1 )),
                               std::min(deltaR( jet_1, lep_2 ),
                                        deltaR( jet_2, lep_2 ))
                             );

This is pretty similar to the centrality, but this variable was used by ATLAS for the VBF topology.
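For reference, here is a minimal self-contained sketch of the minimum jet-lepton delta R. The `Particle` struct and the `deltaPhi`, `deltaR` and `minDRJetLep` helpers are hypothetical stand-ins written for this page, not the functions of any particular framework; the only point is to make explicit that the phi difference has to be wrapped before it is combined with delta eta.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical minimal object: only the angular coordinates are needed here.
struct Particle {
    double eta;
    double phi;
};

// Wrap an azimuthal difference into [-pi, pi].
double deltaPhi(double phi1, double phi2) {
    const double kPi = std::acos(-1.0);
    double dphi = std::fmod(phi1 - phi2, 2.0 * kPi);
    if (dphi > kPi) dphi -= 2.0 * kPi;
    if (dphi < -kPi) dphi += 2.0 * kPi;
    return dphi;
}

// delta R = sqrt(delta eta^2 + delta phi^2)
double deltaR(const Particle& a, const Particle& b) {
    return std::hypot(a.eta - b.eta, deltaPhi(a.phi, b.phi));
}

// Smallest delta R over the four (jet, lepton) pairings.
double minDRJetLep(const Particle& jet1, const Particle& jet2,
                   const Particle& lep1, const Particle& lep2) {
    return std::min({deltaR(jet1, lep1), deltaR(jet2, lep1),
                     deltaR(jet1, lep2), deltaR(jet2, lep2)});
}
```

Using the initializer-list form of `std::min` avoids the nested calls and extends naturally if more jets or leptons are considered.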

  • Zeppenfeld variable, defined as Zep = | (eta_1 + eta_2)/2 - eta(lepton system) |, where eta_1 and eta_2 are the pseudorapidities of the leading and sub-leading jets and eta(lepton system) is eta(p4(lep1) + p4(lep2))
  • We can discuss how to define new variables that also use the missing transverse energy. For that I thought of this variable: |phi(lep1+lep2) + phi(Met)|; the phi of the Met is something we can have
  • There is also another variable, used in Higgs to gamma gamma, that we can adapt for our purposes: |phi(jet1+jet2, lep1+lep2)|, the angle in phi between the dijet and dilepton systems. We can put this angle in a cos(); it may be easier to understand and also to plot
  • Invariant mass of the dilepton system and of the dijet system
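The system-level variables in this list (Zeppenfeld variable, dijet-dilepton angle, invariant masses) all reduce to four-vector sums, so they can be sketched together. What follows is only an illustration under stated assumptions: `P4`, `fromPtEtaPhiM`, `ptOf`, `phiOf`, `etaOf`, `massOf`, `zeppenfeld` and `cosDeltaPhiJJLL` are hypothetical names invented here, and the dijet-dilepton variable is read as the cosine of the phi difference between the two systems.

```cpp
#include <cmath>

// Hypothetical minimal four-vector (px, py, pz, E).
struct P4 {
    double px, py, pz, e;
};

// Build a four-vector from the usual collider coordinates.
P4 fromPtEtaPhiM(double pt, double eta, double phi, double m) {
    const double px = pt * std::cos(phi);
    const double py = pt * std::sin(phi);
    const double pz = pt * std::sinh(eta);
    return {px, py, pz, std::sqrt(px * px + py * py + pz * pz + m * m)};
}

P4 operator+(const P4& a, const P4& b) {
    return {a.px + b.px, a.py + b.py, a.pz + b.pz, a.e + b.e};
}

double ptOf(const P4& p) { return std::hypot(p.px, p.py); }
double phiOf(const P4& p) { return std::atan2(p.py, p.px); }

// Pseudorapidity; not defined when the transverse momentum is zero.
double etaOf(const P4& p) { return std::asinh(p.pz / ptOf(p)); }

// Invariant mass; small negative m^2 from rounding is clamped to zero.
double massOf(const P4& p) {
    const double m2 = p.e * p.e - (p.px * p.px + p.py * p.py + p.pz * p.pz);
    return m2 > 0.0 ? std::sqrt(m2) : 0.0;
}

// Zep = | (eta_1 + eta_2)/2 - eta(lep1 + lep2) |
double zeppenfeld(const P4& jet1, const P4& jet2,
                  const P4& lep1, const P4& lep2) {
    return std::fabs(0.5 * (etaOf(jet1) + etaOf(jet2)) - etaOf(lep1 + lep2));
}

// cos of the phi difference between the dijet and dilepton systems
// (assumed reading of the dijet-dilepton angle variable).
double cosDeltaPhiJJLL(const P4& jet1, const P4& jet2,
                       const P4& lep1, const P4& lep2) {
    return std::cos(phiOf(jet1 + jet2) - phiOf(lep1 + lep2));
}
```

The dijet and dilepton invariant masses are then simply `massOf(jet1 + jet2)` and `massOf(lep1 + lep2)`. Taking the cosine has the side benefit that no explicit phi wrapping is needed.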

Note: all of these variables can be fed into a machine learning (ML) technique to discriminate QCD from signal. I'm currently taking care of providing these tools. For that, we need easy access to the data (in something other than the stdhep, hep or hepmc formats). If we want to use what CMS and ATLAS use, such as TMVA, I need to have the data in a ROOT file, or in a format from which I can create those ROOT files. There are other alternatives for ML, but they need the data in a specific format such as CSV. If we can produce a text file in which each line contains the values of all the listed variables, that will be sufficient to train an ML algorithm.