The ABCD Method Code - jniedzie/SVJanalysis_wiki GitHub Wiki

Setup

To run the ABCD method code, first set up the following 2 files:

  • runABCD.py:

    • set the location/name of the input histograms
    • set the name of the variables to be used for the calculations
    • set the name of the output folder
    • toggle whether the ABCD method is run on each signal individually or on multiple signals
  • config.py:

    • toggle the rebin
    • set the search region for the optimization
    • define/toggle the constraints of the optimization
    • (Beta and potentially deprecated) set the location of the raw data files. This is only needed when the distribution of MET_pt is wanted but MET_pt is not one of the 2 variables selected in the histograms; the raw data files are then used to create those 1D histograms. (This functionality could be removed and done by hand or in a standalone program.)

Running the code

Launch runABCD.py with the following parameters:

  • "-m": "both", "hist" or "abcd". Respectively: run the full ABCD method, only create the histograms, or only do the optimization (assumes the histograms have already been created).
  • "-cut1", "-cut2": (Optional) Use these cut values instead of optimizing. As of now it is not possible to optimize on only one variable.

For example:

python runABCD.py -m both -cut1 0.5 -cut2 500
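The options above could be parsed along the following lines. This is a minimal sketch using `argparse`, not the actual contents of runABCD.py: the function names `build_parser` and `parse_args`, the destination name `mode`, and the choice to reject a single cut on its own are assumptions made for illustration (the last one follows from the note that optimizing on only one variable is not possible).

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Run the ABCD method")
    parser.add_argument("-m", dest="mode", required=True,
                        choices=["both", "hist", "abcd"],
                        help="run everything, only the histograms, or only the optimization")
    # Both cuts are optional; when omitted, the cuts are optimized instead.
    parser.add_argument("-cut1", type=float, default=None, help="cut on the first variable")
    parser.add_argument("-cut2", type=float, default=None, help="cut on the second variable")
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption: giving exactly one cut is treated as an error.
    if (args.cut1 is None) != (args.cut2 is None):
        parser.error("-cut1 and -cut2 must be given together")
    return args


args = parse_args(["-m", "both", "-cut1", "0.5", "-cut2", "500"])
print(args.mode, args.cut1, args.cut2)
```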

Steps of the code

The ABCD method is completed by running 2 programs one after the other:

  1. Run makeHistogramForABCD.py to create histograms that are equivalent to applying the ABCD method on every possible pair of cuts
  2. Run applyABCDMethod.py to get the result of the ABCD method for a given pair of cuts. If cuts are provided, the program uses them; otherwise it finds the optimal cuts.

Warning: By default, applyABCDMethod.py assumes that the histograms are stored in the folder where makeHistogramForABCD.py creates them.
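For readers new to the method, the estimate at its core can be sketched as follows. The sketch assumes that A is the signal region, that B and C each invert one of the two cuts, and that D inverts both; this page does not spell out the region layout, so treat that labeling as an assumption. The function name `predict_bkg_A` and the Poisson error propagation for unweighted counts are also illustrative choices, not taken from the repository.

```python
import math


def predict_bkg_A(n_B, n_C, n_D):
    """Return (prediction, uncertainty) for the background yield in region A.

    Assumes the two variables are uncorrelated for the background,
    so that N_A / N_B = N_C / N_D.
    """
    if min(n_B, n_C, n_D) <= 0:
        raise ValueError("control-region yields must be positive")
    prediction = n_B * n_C / n_D
    # Poisson counts: relative uncertainties 1/sqrt(N) add in quadrature.
    relative_uncertainty = math.sqrt(1.0 / n_B + 1.0 / n_C + 1.0 / n_D)
    return prediction, prediction * relative_uncertainty


prediction, uncertainty = predict_bkg_A(n_B=100, n_C=50, n_D=200)
print(prediction, uncertainty)
```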

Description of the files

applyABCDMethod.py

Apply the ABCD method using either user-provided cuts or optimal cuts. The optimal cuts are found by maximizing the significance in the signal region, subject to constraints.

The optimization is implemented in significanceUtilities.py.

The optimization constraints are defined in the config file.
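A constrained grid search of this kind might look like the sketch below. It assumes the 2D histograms are available as NumPy arrays indexed by cut bin. The two constraints shown (a minimum background yield in A and a maximum signal contamination) and their default values are hypothetical examples; the real constraints are whatever config.py defines.

```python
import numpy as np


def grid_search(significance, bkg_A, signal_contamination,
                min_bkg=1.0, max_contamination=0.2):
    """Return (i, j, best) for the allowed cut bin with the highest significance.

    Returns None when no cut pair satisfies the (hypothetical) constraints.
    """
    allowed = (bkg_A >= min_bkg) & (signal_contamination <= max_contamination)
    if not allowed.any():
        return None
    # Excluded bins are set to -inf so they can never win the argmax.
    masked = np.where(allowed, significance, -np.inf)
    i, j = np.unravel_index(np.argmax(masked), masked.shape)
    return int(i), int(j), float(masked[i, j])


significance = np.array([[1.0, 5.0], [3.0, 4.0]])
bkg_A = np.array([[10.0, 0.5], [10.0, 10.0]])
contamination = np.array([[0.1, 0.1], [0.1, 0.5]])
print(grid_search(significance, bkg_A, contamination))
```

In this toy input the two bins with the highest significance are rejected, one for too little background and one for too much contamination, so the search settles on the best remaining bin.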

Outputs:

  • ABCD plot
  • Validation in D+B regions
  • Distribution of the x and y axis
  • Distribution of the x and y axis with a cut on dnnScore
  • Distribution of the x and y axis in each region
  • Shape of the prediction
  • Root file for the limit code

config.py

Config file containing various parameters, most importantly concerning:

  • the rebinning of the histogram
  • the optimization of the significance
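As an illustration of what rebinning a 2D histogram involves, here is a minimal NumPy sketch that merges neighbouring bins by integer factors. It works on plain arrays rather than ROOT histograms, and the function name `rebin_2d` and its parameters are hypothetical, not options read from config.py.

```python
import numpy as np


def rebin_2d(hist, factor_x, factor_y):
    """Merge groups of factor_x by factor_y neighbouring bins by summing their contents."""
    nx, ny = hist.shape
    if nx % factor_x or ny % factor_y:
        raise ValueError("bin counts must be divisible by the rebin factors")
    # Split each axis into (new bins, merged bins) and sum over the merged ones.
    return hist.reshape(nx // factor_x, factor_x,
                        ny // factor_y, factor_y).sum(axis=(1, 3))


hist = np.arange(16.0).reshape(4, 4)
print(rebin_2d(hist, 2, 2))
```

Coarser bins make the later scan over cut pairs cheaper, at the price of a coarser set of candidate cuts.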

makeHistogramForABCD.py

Resample the histograms and calculate the following 2D histograms:

  • significance_A

  • predicted_bkg_A

  • error_prediction

  • error_relative_uncertainty

  • bkg_A

  • bkg_B

  • bkg_C

  • bkg_D

  • sgn_A

  • sgn_B

  • sgn_C

  • sgn_D

  • signal_contamination_A

  • signal_contamination_B

  • signal_contamination_C

  • signal_contamination_D

  • background

  • signal
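Each of the per-region histograms listed above holds one value per candidate pair of cuts. One way to obtain the region yields for all cut pairs at once is with cumulative sums, sketched below with NumPy arrays instead of ROOT histograms. The region labeling (A above both cuts, D below both, B and C mixed) and the function name `region_yields` are assumptions for illustration; the repository may use a different convention.

```python
import numpy as np


def region_yields(hist):
    """Return yields (A, B, C, D) for every cut pair placed at bin edges.

    hist[i, j] is the event count in x bin i and y bin j. For a cut at the
    lower edge of bin (ci, cj), the assumed regions are:
      A: x >= cut, y >= cut      B: x >= cut, y < cut
      C: x <  cut, y >= cut      D: x <  cut, y <  cut
    """
    nx, ny = hist.shape
    # cum[i, j] = sum of hist[:i, :j], with a leading row and column of zeros.
    cum = np.zeros((nx + 1, ny + 1))
    cum[1:, 1:] = hist.cumsum(axis=0).cumsum(axis=1)
    total = cum[-1, -1]
    D = cum[:nx, :ny]
    B = cum[nx, :ny][None, :] - D   # everything with y < cut, minus D
    C = cum[:nx, ny][:, None] - D   # everything with x < cut, minus D
    A = total - B - C - D
    return A, B, C, D


hist = np.arange(12.0).reshape(3, 4)
A, B, C, D = region_yields(hist)
print(A[1, 2], B[1, 2], C[1, 2], D[1, 2])
```

Applied to the background histogram and to a signal histogram, arrays of this kind would be enough to fill per-region yield maps, and B * C / D per cut pair (where D is non-zero) would give a predicted background map under the same labeling assumption.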

For each signal, those histograms are stored in a dedicated ROOT file in a folder named after the signal.

Since we want all signals to use the same background, the resampled background histogram is created and stored in a ROOT file when the first signal is processed. Each subsequent signal loads this ROOT file instead of resampling the background again. The resampled background file is recreated each time runABCD.py is run.
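The reuse of the resampled background can be pictured as a simple file cache. The sketch below swaps in NumPy `.npy` files for ROOT files to stay self-contained, and the names `load_or_make_background` and `make_background` are hypothetical. It only illustrates the "create once, then load" pattern, not the actual code.

```python
import os
import tempfile

import numpy as np


def load_or_make_background(cache_path, make_background):
    """Return the resampled background, computing it only if no cached file exists."""
    if os.path.exists(cache_path):
        return np.load(cache_path)
    background = make_background()
    np.save(cache_path, background)  # cache_path is expected to end in ".npy"
    return background


calls = []


def make_background():
    # Stand-in for the expensive resampling step.
    calls.append(1)
    return np.ones((2, 2))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "background.npy")
    for signal in ["signal_1", "signal_2", "signal_3"]:
        background = load_or_make_background(path, make_background)

print(len(calls))  # number of times the background was actually resampled
```

To mirror the behaviour described above, the cached file would be deleted at the start of every run so that the background is always rebuilt once per run.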

runABCD.py

Runs the ABCD method iteratively on each signal contained in the file.

Takes 3 arguments, 2 of which are optional.

The 1st argument is required and sets what is going to be run:

  • hist: create the histograms by running makeHistogramForABCD.py.
  • abcd: assume that the histograms already exist and run applyABCDMethod.py.
  • both: run hist and abcd one after the other.

The 2nd and 3rd arguments are the cuts of the first and second variables, respectively.

Warning: By default, applyABCDMethod.py assumes that the histograms are stored in the folder where makeHistogramForABCD.py creates them.

The output directory is where all the results are stored. Inside it there are folders named after the histogram names of the variables, and inside each of those there is one folder per signal. Those folders contain the plots and the ROOT files created by makeHistogramForABCD.py.

significanceUtilities.py

Implements the calculation of the significance and the grid-search maximization of the significance.
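This page does not state which significance definition is used. As a hedged illustration, two common choices are sketched below: the simple s/sqrt(b) estimate and the Asimov formula sqrt(2((s+b) ln(1+s/b) - s)). Both function names are hypothetical and neither is confirmed to match significanceUtilities.py.

```python
import math


def simple_significance(s, b):
    """s / sqrt(b), a common approximation when s is much smaller than b."""
    if b <= 0:
        raise ValueError("background yield must be positive")
    return s / math.sqrt(b)


def asimov_significance(s, b):
    """Asimov significance, which stays reliable when s is not small compared to b."""
    if b <= 0 or s < 0:
        raise ValueError("need b > 0 and s >= 0")
    # max() guards against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 2.0 * ((s + b) * math.log(1.0 + s / b) - s)))


print(simple_significance(10.0, 100.0), asimov_significance(10.0, 100.0))
```

For small s/b the two definitions nearly agree; they diverge as the signal becomes comparable to the background, which can shift where a grid search places the optimal cuts.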