# How To: Uncertainties (for VBF H(inv) analysis) - alpakpinar/bucoffea Wiki

In general, uncertainties are computed within the analysis processors as follows: the event weights and/or the event selection are updated to represent the uncertainty source, and the impact on the shape of the fitting variable is computed.

This section is a guide to computing several important sources of uncertainty using `bucoffea`. For the time being, the instructions are limited to the VBF H(inv) analysis.

## Jet Energy Uncertainties

Jet energy uncertainties cover the uncertainties on the jet energy corrections and on the smearing factors we apply to the jets. In practice, they are computed by varying the jet and MET momenta up and down for each uncertainty source, and computing the impact on the shape of the fitting variable.
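As a minimal sketch of what "impact on the shape" means here, the snippet below takes the per-bin ratio of a varied histogram to the nominal one. The function name `variation_ratio` and the bin contents are made up for illustration; this is not the code used in `bucoffea`.

```python
import numpy as np

def variation_ratio(nominal, varied):
    """Per-bin ratio varied / nominal. Bins with an empty nominal are set to 1."""
    nominal = np.asarray(nominal, dtype=float)
    varied = np.asarray(varied, dtype=float)
    ratio = np.ones_like(nominal)
    # Only divide where the nominal bin is populated; other bins keep the value 1
    np.divide(varied, nominal, out=ratio, where=nominal > 0)
    return ratio

# Toy bin contents of the fitting variable (illustrative numbers, not analysis results)
nominal = [100.0, 50.0, 20.0, 0.0]
jes_up = [110.0, 52.0, 19.0, 0.0]
jes_down = [92.0, 49.0, 21.0, 0.0]

ratio_up = variation_ratio(nominal, jes_up)
ratio_down = variation_ratio(nominal, jes_down)
```

Setting empty bins to 1 is a choice made for this sketch so that an unpopulated bin carries no variation.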

### Computing Jet Energy Uncertainties

The implementation for jet energy uncertainties is found in the `15Apr21_ReReco_UL_JES` branch. If you're doing this computation, please check out this branch and pull the latest code.

Within this branch, the `vbfhinvProcessor` object is updated so that instead of running over all regions only once (the nominal case), it runs once for each variation. The list of variations is specified here. The code looks like this:

``````
# jerUp and jerDown -> Up/down variations of jet energy resolution (JER)
# jesTotalUp and jesTotalDown -> Up/down variations of jet energy scale (JES)
self._variations = [
    '',  # This is the nominal case
    '_jerUp',
    '_jerDown',
    '_jesTotalUp',
    '_jesTotalDown',
]
``````

Note that this design makes the list of variations easy to change (e.g. if we want to split the JES uncertainties).
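To illustrate how such a list could be extended, the sketch below builds the suffixes from a list of source names. The helper `build_variations` is hypothetical, and the split source names (`jesAbsolute`, `jesRelativeBal`) are only placeholders; use whatever sources are actually available in your inputs.

```python
def build_variations(sources):
    """Return the nominal suffix followed by Up/Down suffixes for each source."""
    variations = ['']  # The empty string is the nominal case
    for source in sources:
        variations.append(f'_{source}Up')
        variations.append(f'_{source}Down')
    return variations

# Reproduces the list of variations shown above
default_variations = build_variations(['jer', 'jesTotal'])

# Hypothetical split of the JES uncertainty (source names are placeholders)
split_variations = build_variations(['jer', 'jesAbsolute', 'jesRelativeBal'])
```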

When the VBF H(inv) processor is executed, it defines a separate region for each variation and runs over all of them. For example, for the signal region, the processor will now run over:

• The nominal signal region
• The signal region with JER up/down variations
• The signal region with JES up/down variations

As a result, each histogram will have all the variations saved, and they can be used to compute the variations with respect to the nominal case (next section). Once the variations are specified as above, the user can simply run the `vbfhinv` processor as usual, as explained here.
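The idea of one region per variation can be sketched as follows. This is not the processor's actual code: the function `define_regions`, the region name `sr_vbf` and the cut names are all assumptions made for illustration. The sketch appends the variation suffix only to cuts that depend on jet or MET momenta.

```python
def define_regions(base_regions, variations, varied_cuts):
    """Create one region per (region, variation) pair.

    Cuts listed in `varied_cuts` get the variation suffix appended,
    all other cuts are shared with the nominal case.
    """
    regions = {}
    for name, cuts in base_regions.items():
        for var in variations:
            regions[f'{name}{var}'] = [
                f'{cut}{var}' if cut in varied_cuts else cut
                for cut in cuts
            ]
    return regions

# Hypothetical region and cut names
base_regions = {'sr_vbf': ['veto_ele', 'mjj', 'recoil']}
variations = ['', '_jesTotalUp', '_jesTotalDown']

regions = define_regions(base_regions, variations, varied_cuts={'mjj', 'recoil'})
```

With these inputs, `regions` holds the nominal `sr_vbf` region plus `sr_vbf_jesTotalUp` and `sr_vbf_jesTotalDown`, each with its own jet-dependent cuts.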

### Plotting the Jet Energy Uncertainties

Once the processor has been run and the outputs are merged into one accumulator (see here), a plotting script can be run to plot the uncertainties for a given dataset. This script is called `plot_jes_variations.py` and is located in `plot/studies/vbf_uncertainties/jet_energy`. It does two things:

• Plots the variations and the ratio of variations to nominal case
• Saves the ratios in an output ROOT file (to be used later in the fit)

With the merged accumulator ready, the user can simply execute:

``````
# Go to the directory
cd plot/studies/vbf_uncertainties/jet_energy
./plot_jes_variations.py /path/to/the/merged/accumulator
``````

## Other Experimental Uncertainties

Most other experimental uncertainties (including variations in prefire SF, pileup SF etc.) are built into the VBF H(inv) processor class. All the user needs to do is configure which uncertainties to run in the configuration files, using the fields here. The syntax for each uncertainty looks like this:

``````
uncertainties:
  prefire_sf: True
  btag_sf: True
  ...
``````

Each flag specifies one of the uncertainty sources. Setting a flag to `True` will run that uncertainty source and save the results of the up and down variations into the `cnn_score_unc` variable, which holds the uncertainties on the score distribution. See here for an example implementation in the `vbfhinvProcessor` class.
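The way such flags could drive the list of variations is sketched below, with a plain dictionary standing in for the parsed configuration. The `pileup_sf` key, the `_up`/`_down` labels and both helper functions are assumptions made for illustration; the real field names are the ones in the configuration files.

```python
# Stand-in for the parsed configuration file (the 'pileup_sf' key is hypothetical)
config = {
    'uncertainties': {
        'prefire_sf': True,
        'btag_sf': True,
        'pileup_sf': False,
    }
}

def enabled_uncertainties(cfg):
    """Names of the uncertainty sources whose flag is set to True."""
    return [name for name, flag in cfg.get('uncertainties', {}).items() if flag]

def variation_labels(cfg):
    """Up and down labels for each enabled source (label format is an assumption)."""
    labels = []
    for name in enabled_uncertainties(cfg):
        labels += [f'{name}_up', f'{name}_down']
    return labels

labels = variation_labels(config)
```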

### Plotting the Uncertainties

Once the merged accumulator with the saved uncertainties is produced, these uncertainties can be plotted via the `plot_uncertainties.py` script, located here. The usage of this script is simple: it has a `dataset` variable and an `uncertainties` variable, which specify the dataset and the list of uncertainties to plot and save into a ROOT file for later use in the fit. The implementation can be found here. This script can be executed by pointing it to the merged accumulator input:

``````
./plot_uncertainties.py /path/to/the/merged/input/accumulator
``````

This will create plots and ROOT files per uncertainty source (as specified in `uncertainties`), under the `output` directory. Note that by default, this script will plot and save the uncertainties as a function of the CNN score, i.e. it will use the `cnn_score_unc` histogram. If you want to plot the variations as a function of `mjj` instead, you can specify a `-v` (or `--variable`) argument to the script:

``````
# Plot and save the uncertainties as a function of mjj
./plot_uncertainties.py /path/to/the/merged/input/accumulator -v mjj
``````

Note that the only supported options for the `-v` argument are `mjj` and `cnn_score` at the moment.
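A command line interface with this behaviour could look like the sketch below, using `argparse` from the standard library. It is not the script's actual parser: the function `parse_cli` and the positional argument name `inpath` are invented here, and `mjj_unc` is a guessed histogram name (only `cnn_score_unc` is named in this guide).

```python
import argparse

# Assumed mapping from the -v option to the histogram to read.
# 'cnn_score_unc' is named in this guide, 'mjj_unc' is a guess.
HISTOGRAMS = {'cnn_score': 'cnn_score_unc', 'mjj': 'mjj_unc'}

def parse_cli(argv=None):
    """Parse the input path and the variable to plot the uncertainties against."""
    parser = argparse.ArgumentParser(description='Plot uncertainties (sketch).')
    parser.add_argument('inpath', help='Path to the merged accumulator.')
    parser.add_argument(
        '-v', '--variable',
        default='cnn_score',
        choices=sorted(HISTOGRAMS),
        help='Variable to plot the uncertainties as a function of.',
    )
    return parser.parse_args(argv)

# Equivalent to: ./plot_uncertainties.py /path/to/the/merged/input/accumulator -v mjj
args = parse_cli(['/path/to/the/merged/input/accumulator', '-v', 'mjj'])
distribution = HISTOGRAMS[args.variable]
```

Restricting `-v` with `choices` makes the parser reject unsupported variables up front instead of failing later on a missing histogram.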

## Theory Uncertainties

Theory uncertainties, such as the scale and PDF uncertainties, are also computed within the `vbfhinvProcessor` class (see here). The set of uncertainties and the ROOT histograms containing each variation are stored in the `vbfhinv.yaml` file, starting from here. The variations of the `Z(vv)/W(lv)` and `gamma/Z(vv)` ratios are computed as a function of generator-level boson pt. These variations are then applied to the event weights, and the resulting variations in the `mjj`, `cnn_score` and `dnn_score` distributions are saved.
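The mechanics of applying a boson-pt-dependent variation to the event weights can be sketched with NumPy as below. The function `lookup_variation`, the bin edges and all numbers are invented for illustration; in the processor the variations come from the ROOT histograms referenced in the configuration.

```python
import numpy as np

def lookup_variation(boson_pt, pt_edges, ratio_per_bin):
    """Per-event variation factor, read from a histogram binned in boson pt."""
    idx = np.digitize(boson_pt, pt_edges) - 1
    # Events below/above the histogram range use the first/last bin
    idx = np.clip(idx, 0, len(ratio_per_bin) - 1)
    return np.asarray(ratio_per_bin)[idx]

# Toy variation histogram (illustrative values only)
pt_edges = np.array([200.0, 400.0, 600.0, 1000.0])
ratio_up = np.array([1.02, 1.05, 1.10])

# Toy events: generator-level boson pt, mjj and nominal weights
boson_pt = np.array([250.0, 450.0, 800.0, 1500.0])
mjj = np.array([300.0, 900.0, 1600.0, 2500.0])
nominal_weight = np.array([1.0, 0.9, 1.1, 1.0])

# Apply the variation to the weights, then fill the varied distribution
varied_weight = nominal_weight * lookup_variation(boson_pt, pt_edges, ratio_up)

mjj_edges = np.array([200.0, 1000.0, 2000.0, 3500.0])
nominal_hist, _ = np.histogram(mjj, bins=mjj_edges, weights=nominal_weight)
varied_hist, _ = np.histogram(mjj, bins=mjj_edges, weights=varied_weight)
```

Clipping out-of-range events to the edge bins is a choice made for this sketch; the processor may treat them differently.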

### Saving the Theory Uncertainties into a ROOT file

Given the merged accumulator, obtained by running the VBF H(inv) processor and merging the output `.coffea` files, the theory uncertainties on the V+jets transfer factors can be easily saved into a ROOT file. This ROOT file will later be used in the fit framework to save the uncertainty shapes to the `combine` workspace. To be precise, four uncertainties on the transfer factors are computed here:

• Renormalization scale (`mu_R`)
• Factorization scale (`mu_F`)
• PDF uncertainty
• NLO EWK correction uncertainty

These uncertainties on V+jets transfer factors can be saved by using the `make_wz_uncertainties.py` script, located under `plot/studies/theory_uncertainties`. Just point this script to the location of the merged `.coffea` files, together with a few additional command line arguments, specifying the variable to save the uncertainties as a function of (default is `cnn_score`), and the years to run on (default is to run both 2017 and 2018).

The script can be used as follows:

``````
# Run 2017 only
./make_wz_uncertainties.py /path/to/merged/acc -v cnn_score -y 2017

# Default is to run for 2017 and 2018 (will fail if you're missing data!)
./make_wz_uncertainties.py /path/to/merged/acc -v cnn_score
``````

At the end, this script should output three ROOT files, with the uncertainties saved in the `vbf_z_w_gjets_theory_unc_ratio_unc.root` file (which will be used in the fit). You can read the next sub-section for plotting the uncertainties on these ratios.

### Plotting the Theory Uncertainties

Once the ROOT file with the uncertainties is produced from the previous step, plotting them is easy! In the same directory, you can use the `plot_uncertainties.py` script, and point it to the `vbf_z_w_gjets_theory_unc_ratio_unc.root` file produced earlier. Similar to the other script, this takes a `-y` (or `--years`) argument which will specify the years to run on.

The script can be executed as follows:

``````
./plot_uncertainties.py /path/to/root/file
``````

The script should output the plots of uncertainties as PDF files in the same directory as the ROOT file, under the `plots` sub-directory.