Gap analysis - eclipse-efbt/efbt GitHub Wiki

Gap Analysis

The gap analysis process checks that each variable used in each BIRD Reference Output Layer (ROL) cube has a corresponding variable available in the BIRD Enriched Input Layer (EIL) cubes. For the EBA FINREP regulation there will be one ROL cube per report.

The ROL cube is derived from the EBA Annotated Templates by applying the EBA-to-SDD mappings, so we know that it truly reflects regulatory requirements.

The gap analysis process answers three questions:

  1. For each cube_structure_item in each ROL cube, does a corresponding item exist in the input layer, or is there a gap?
  2. For each cube_structure_item in each ROL cube, which financial products can it be linked to?
  3. For each cube_structure_item in each ROL cube and its relevant financial products, where is the corresponding item in the input-layer cubes?

The analysis is contextual. It does not simply ask whether a variable exists somewhere in the input layer. It asks whether the variable exists in the correct product context. For example, CRRYNG_AMNT for other loans may be found in INSTRMNT_RL, while CRRYNG_AMNT for non-negotiable securities may be found in a different part of the input model related to securities.
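
A minimal Python sketch of the difference between an unscoped lookup and a contextual one. The table contents and the per-product search scopes are toy data invented for illustration; only the INSTRMNT_RL location of CRRYNG_AMNT for other loans comes from the text, and `SCRTS_TBL_PLCHLDR` is a placeholder rather than a BIRD cube name.

```python
# Toy input model: table name -> variables it holds (illustrative, not BIRD content).
TABLE_VARIABLES = {
    "INSTRMNT_RL": {"CRRYNG_AMNT"},
    "SCRTS_TBL_PLCHLDR": {"CRRYNG_AMNT"},  # placeholder for a securities-related cube
}

# Toy search scope: product context -> tables to search (illustrative).
CONTEXT_TABLES = {
    "Other loans": ["INSTRMNT_RL"],
    "Non Negotiable bonds": ["SCRTS_TBL_PLCHLDR"],
}


def exists_anywhere(variable):
    """Unscoped check: is the variable present in any table at all?"""
    return any(variable in variables for variables in TABLE_VARIABLES.values())


def locate_in_context(variable, product):
    """Contextual check: tables in the product's scope that hold the variable.

    An empty list means a gap for that product context.
    """
    return [
        table
        for table in CONTEXT_TABLES.get(product, [])
        if variable in TABLE_VARIABLES.get(table, set())
    ]
```

The unscoped check returns the same answer for every product, whereas `locate_in_context` can report a location for one product and a gap for another.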

Determining the product context

To understand which financial products a variable is required for, the gap analysis examines the combination associated with the relevant data point.

Figure: F 04.01 excerpt showing 11086_REF and 11088_REF.

Example: 11086_REF

For 11086_REF, the combination shows how the metric and product context are identified.

Figure: Combination for 11086_REF.

The MTRCS variable is a special case in this combination. It represents the metric for which we need a corresponding input-layer variable. In this example, the metric is CRRYNG_AMNT.

The TYP_INSTRMNT variable in the same combination has the member Debt securities (210). This tells us that we need to search for CRRYNG_AMNT in the input tables associated with debt securities.
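
As a rough illustration, the split between the metric and the product context could be sketched in Python as follows. The dictionary representation of a combination is an assumption made for this example, and only the two items named in the text are shown; a real combination may carry further items.

```python
def split_combination(combination):
    """Separate the MTRCS metric from the remaining (variable, member) conditions."""
    metric = combination["MTRCS"]
    conditions = {
        variable: member
        for variable, member in combination.items()
        if variable != "MTRCS"
    }
    return metric, conditions


# Simplified combination for 11086_REF (hypothetical representation).
combination_11086 = {
    "MTRCS": "CRRYNG_AMNT",
    "TYP_INSTRMNT": "TYP_INSTRMNT_210",  # Debt securities (210)
}

metric, conditions = split_combination(combination_11086)
```

Here `metric` is the variable to look for in the input layer, and `conditions` is the product context that directs where to look.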

Configuration files

The gap analysis uses two configuration files to turn a product condition into a concrete search path through the input model.

The first configuration file maps product conditions to a product category and slice. For example, the debt-securities condition is mapped as follows:

| Condition | Main category | Slice name |
| --- | --- | --- |
| TYP_INSTRMNT=TYP_INSTRMNT_210 | Debt securities | Debt securities |

Figure: Product-mapping configuration excerpt.

The second configuration file maps each product or slice to the relevant input tables:

| Product / slice | Main table | Filter | Related tables |
| --- | --- | --- | --- |
| Debt securities | LNG_BLNC_SHT_RCGNSD_SCRTY_PSTN_PRDNTL_PRTFL_ACCNTNG_CLSSFCTN_ASSGNMNT | SCRTY_PSTN | SCRTY_ENTTY_RL_ASSGNMNT, ENTTY_RL, SCRTY_EXCHNG_TRDBL_DRVTV, LNG_BLNC_SHT_RCGNSD_SCRTY_PSTN_PRDNTL_PRTFL_ASSGNMNT, LNG_SHRT_BLNC_SHT_RCGNSD_SCRTY_PSTN, BLNC_SHT_RCGNSD_NN_BLNC_SHT_RCGNSD_SCRTY_PSTN, SCRTY_PSTN, SCRTY_EXCHNG_TRDBL_DRVTV, PRTY |

Figure: Table-mapping configuration excerpt.

Although this example uses a single condition (TYP_INSTRMNT=TYP_INSTRMNT_210), the same mechanism can support multiple filters when needed.
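
One way to picture multi-filter matching is the Python sketch below. Representing each mapping row as a list of `VARIABLE=MEMBER` strings is an assumption of this example (the actual file layout for multiple filters is not described here), and `HYPTHTCL_VRBL` in the usage note is an invented name.

```python
def parse_condition(text):
    """Turn 'VARIABLE=MEMBER' into a (variable, member) pair."""
    variable, member = text.split("=", 1)
    return variable.strip(), member.strip()


# One row per mapping: (list of condition strings, main category, slice name).
PRODUCT_MAPPING = [
    (["TYP_INSTRMNT=TYP_INSTRMNT_210"], "Debt securities", "Debt securities"),
]


def matching_slices(combination, mapping=PRODUCT_MAPPING):
    """Return the slice names whose conditions are all satisfied by the combination."""
    result = []
    for condition_texts, _category, slice_name in mapping:
        pairs = [parse_condition(text) for text in condition_texts]
        if all(combination.get(variable) == member for variable, member in pairs):
            result.append(slice_name)
    return result
```

A row with two filters, for instance `["TYP_INSTRMNT=TYP_INSTRMNT_210", "HYPTHTCL_VRBL=HYPTHTCL_MMBR"]`, would match only combinations that satisfy both.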

The first version of these configuration files was provided by the banks involved in the BIRD working group on FINREP transformations.

Different data points can use different contexts

Data point 11088_REF follows the same general process, but it points to a different product context and therefore to a different set of input tables.

The key point is that the analysis is not asking only, “Does CRRYNG_AMNT exist in the input tables?” It is asking, “Where does CRRYNG_AMNT exist for this specific product context?” The search is directed by the configuration files.

Cases with more than one table set

A small number of conditions can lead to more than one set of input tables. These cases are handled by adding multiple entries to the first configuration file.

Derivatives

For combinations where TYP_INSTRMNT=TYP_INSTRMNT_310 (derivatives), the BIRD input layer stores derivatives in two areas:

  • OTC derivatives are stored in input cubes linked to INSTRMNT.
  • ETC derivatives are stored in input cubes linked to EXCHNG_TRDBL_DRVTV_PSTN.

The first configuration file therefore has two entries:

| Condition | Main category | Slice name |
| --- | --- | --- |
| TYP_INSTRMNT=TYP_INSTRMNT_310 | Derivatives | Derivatives ETC |
| TYP_INSTRMNT=TYP_INSTRMNT_310 | Derivatives | Derivatives OTC |

Both entries are described in the second configuration file:

| Product / slice | Main table | Filter | Related tables |
| --- | --- | --- | --- |
| Derivatives ETC | EXCHNG_TRDBL_DRVTV_PSTN | EXCHNG_TRDBL_DRVTV_PSTN | PRTY, SCRTY_EXCHNG_TRDBL_DRVTV, EXCHNG_TRDBL_DRVTV_PSTN_RL, ENTTY_RL, EXCHNG_TRDBL_DRVTV_PSTN, SCRTY_ENTTY_RL_ASSGNMNT |
| Derivatives OTC | INSTRMNT | OTC_DRVTV_INSTRMNT | INSTRMNT_RL, PRTY, INSTRMNT_ENTTY_RL_ASSGNMNT |
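
The way one condition fans out into two search paths can be sketched in Python as below. The in-memory structures (`TableSet`, `PRODUCT_MAPPING`, `TABLE_MAPPING`) and the function name are assumptions made for this illustration, not the actual file format or EFBT code; the related-table names follow the two rows above as read here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSet:
    main_table: str
    filter_name: str
    related_tables: tuple


# First configuration file: one condition may appear on several rows.
PRODUCT_MAPPING = [
    ("TYP_INSTRMNT=TYP_INSTRMNT_310", "Derivatives", "Derivatives ETC"),
    ("TYP_INSTRMNT=TYP_INSTRMNT_310", "Derivatives", "Derivatives OTC"),
]

# Second configuration file: slice name -> tables to search.
TABLE_MAPPING = {
    "Derivatives ETC": TableSet(
        "EXCHNG_TRDBL_DRVTV_PSTN",
        "EXCHNG_TRDBL_DRVTV_PSTN",
        (
            "PRTY",
            "SCRTY_EXCHNG_TRDBL_DRVTV",
            "EXCHNG_TRDBL_DRVTV_PSTN_RL",
            "ENTTY_RL",
            "EXCHNG_TRDBL_DRVTV_PSTN",
            "SCRTY_ENTTY_RL_ASSGNMNT",
        ),
    ),
    "Derivatives OTC": TableSet(
        "INSTRMNT",
        "OTC_DRVTV_INSTRMNT",
        ("INSTRMNT_RL", "PRTY", "INSTRMNT_ENTTY_RL_ASSGNMNT"),
    ),
}


def search_paths(condition):
    """Return {slice name: TableSet} for every slice mapped to the condition."""
    return {
        slice_name: TABLE_MAPPING[slice_name]
        for mapped_condition, _category, slice_name in PRODUCT_MAPPING
        if mapped_condition == condition
    }
```

Calling `search_paths("TYP_INSTRMNT=TYP_INSTRMNT_310")` yields two separate table sets, so the variable is searched for once per slice.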

Non-negotiable debt securities

A more subtle FINREP case is that non-negotiable debt securities should be treated as term loans. In BIRD, term loans are recorded as other loans. The first configuration file therefore contains two entries for TYP_INSTRMNT=TYP_INSTRMNT_114:

| Condition | Main category | Slice name |
| --- | --- | --- |
| TYP_INSTRMNT=TYP_INSTRMNT_114 | Other loans | Other loans |
| TYP_INSTRMNT=TYP_INSTRMNT_114 | Other loans | Non Negotiable bonds |

The second configuration file then describes both products:

| Product / slice | Main table | Filter | Related tables |
| --- | --- | --- | --- |
| Other loans | INSTRMNT | OTHR_LN | INSTRMNT_RL, PRTY, INSTRMNT_ENTTY_RL_ASSGNMNT, CLLTRL |
| Non Negotiable bonds | LNG_BLNC_SHT_RCGNSD_SCRTY_PSTN_PRDNTL_PRTFL_ACCNTNG_CLSSFCTN_ASSGNMNT | SCRTY_PSTN | SCRTY_ENTTY_RL_ASSGNMNT, ENTTY_RL, SCRTY_EXCHNG_TRDBL_DRVTV, LNG_BLNC_SHT_RCGNSD_SCRTY_PSTN_PRDNTL_PRTFL_ASSGNMNT, LNG_SHRT_BLNC_SHT_RCGNSD_SCRTY_PSTN, BLNC_SHT_RCGNSD_NN_BLNC_SHT_RCGNSD_SCRTY_PSTN, SCRTY_PSTN, SCRTY_EXCHNG_TRDBL_DRVTV, PRTY |

This preserves the distinction between the “other loans” and “non-negotiable bonds” search paths instead of flattening them into a single route through the model.

Cube links and implementation

For a data point such as 15289, the gap analysis produces cube_links per product. In an implementation, the tables referenced by the relevant cube_links are joined together to create the transformation path.

Because the analysis is performed product by product, cube_links must also be stored per product. Links from different products should not be merged, because merging them loses information. In the published BIRD content these links are currently merged, which is why BEP 001 has been raised.
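
The information loss can be illustrated with a small Python sketch. The tuple layout of a link is an assumption of this example, and `SCRTS_CB_PLCHLDR` is a placeholder for the security-related cube, which is not named here.

```python
from collections import defaultdict

# Hypothetical link records: (product, ROL item, input cube).
CUBE_LINKS = [
    ("Other loans", "CRRYNG_AMNT", "INSTRMNT_RL"),
    ("Non Negotiable bonds", "CRRYNG_AMNT", "SCRTS_CB_PLCHLDR"),
]


def links_per_product(links):
    """Keep the product as part of the key, so each path stays distinct."""
    grouped = defaultdict(list)
    for product, rol_item, input_cube in links:
        grouped[product].append((rol_item, input_cube))
    return dict(grouped)


def merged_links(links):
    """Drop the product: the result no longer says which cube serves which product."""
    merged = defaultdict(set)
    for _product, rol_item, input_cube in links:
        merged[rol_item].add(input_cube)
    return dict(merged)
```

After merging, CRRYNG_AMNT points at both cubes with no indication of which one applies to other loans and which to non-negotiable bonds; the per-product grouping keeps that association.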

When the analysis identifies a genuine gap, meaning no corresponding input-layer item can be found in the required context, the gap can be raised with the BIRD data modelling team.

Why the analysis matters

A completed gap analysis is important because it shows that every true requirement from the EBA Annotated Templates can be traced to a location in the input data model. This provides confidence that the input data model has the detail and completeness required to support the reporting requirements.

It also shows that the input model is being designed and grown from real requirements in a methodical way.

Relationship to EIL-to-ROL transformations

Because of the design of the BIRD transformation rules, the output of the gap analysis provides more than lineage: the links also describe the EIL-to-ROL transformations.

This works because:

  • EIL and ROL use the same dictionary.
  • Generation rules select data and reshape it. They do not derive or calculate variables; derivations are handled by derivation rules.
  • The result of the gap analysis is stored as cube_links.

Additional example

The following entry illustrates the same configuration pattern for an equity-instruments security context:

| Product / slice | Main table | Filter | Related tables |
| --- | --- | --- | --- |
| Equity instruments security | SCRTY_EXCHNG_TRDBL_DRVTV | EQTY_FND_SCRTY | PRTY, LNG_SCRTY_PSTN_PRDNTL_PRTFL_ASSGNMNT, SCRTY_PSTN, SCRTY_ENTTY_RL_ASSGNMNT |

Visualising the cube links in EFBT

As mentioned, in EFBT cube links are normally interpreted in the context of a financial product. For example, the CRRYNG_AMNT item needed for Other loans in FINREP F 05.01 is linked to CRRYNG_AMNT in the INSTRMNT_RL enriched input cube. The same CRRYNG_AMNT item needed for non-negotiable bonds is linked to a different security-related enriched input cube.

Figure: Cube links for the Other loans product context.

Figure: Cube links for the non-negotiable bonds product context.

The wider view below shows that a FINREP F 05.01 Reference Output Layer may need information from multiple cubes. A working BIRD implementation must join those cubes to build a flat structure for the report.

Figure: A report template may require data from multiple input/enriched input cubes.

EFBT performs those joins. In the example below, separate joins are created for Other loans and non-negotiable debt securities. The purple box represents report cell 152589 in FINREP F 05.01. The cyan boxes are product-specific Reference Output Layer instances for the report. The green tables are Input Layer or Enriched Input Layer cubes.

Figure: EFBT product-specific joins for FINREP F 05.01.
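
To give a feel for what a product-specific join produces, here is a toy Python sketch. It is not EFBT's implementation: the join key `instrument_id`, the row layout, and the amounts are all invented for illustration, and only the table names INSTRMNT and INSTRMNT_RL and the variable CRRYNG_AMNT come from the text.

```python
def join_on(left_rows, right_rows, key):
    """Inner join of two lists of dict rows on a shared key."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    return [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[key], [])
    ]


def build_flat_rows(main_rows, related_rows, key, keep):
    """Join the main table to one related table and keep only the linked columns."""
    joined = join_on(main_rows, related_rows, key)
    return [{name: row[name] for name in keep} for row in joined]


# Toy rows for the Other loans path (hypothetical key and values).
instrmnt_rows = [{"instrument_id": 1}, {"instrument_id": 2}]
instrmnt_rl_rows = [{"instrument_id": 1, "CRRYNG_AMNT": 100}]

other_loans_rows = build_flat_rows(
    instrmnt_rows, instrmnt_rl_rows, "instrument_id", ["instrument_id", "CRRYNG_AMNT"]
)
```

A separate call with the security-related tables would build the flat rows for non-negotiable debt securities, so each product keeps its own join.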

The column lineage below corresponds directly to the cube links.

Figure: Column lineage corresponding to cube links.

The lines between the report datapoint and the Reference Output Layer represent the filters and aggregations derived from the combination for that datapoint.
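
A minimal Python sketch of such a filter-and-aggregate step follows. Using a sum as the aggregation, the dictionary form of the combination, and the sample rows are assumptions made for illustration only.

```python
def datapoint_value(rol_rows, combination):
    """Filter ROL rows by the combination's conditions and sum the metric."""
    metric = combination["MTRCS"]
    filters = {
        variable: member
        for variable, member in combination.items()
        if variable != "MTRCS"
    }
    return sum(
        row[metric]
        for row in rol_rows
        if all(row.get(variable) == member for variable, member in filters.items())
    )


# Toy ROL rows (invented values).
rol_rows = [
    {"TYP_INSTRMNT": "TYP_INSTRMNT_210", "CRRYNG_AMNT": 100},
    {"TYP_INSTRMNT": "TYP_INSTRMNT_210", "CRRYNG_AMNT": 50},
    {"TYP_INSTRMNT": "TYP_INSTRMNT_114", "CRRYNG_AMNT": 70},
]

value = datapoint_value(
    rol_rows, {"MTRCS": "CRRYNG_AMNT", "TYP_INSTRMNT": "TYP_INSTRMNT_210"}
)
```

Only the two rows matching TYP_INSTRMNT_210 contribute to `value`; the third row is excluded by the filter.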