Metrics Calculation

1. Supported Metrics and Scores

The CDH Value Dashboard Application supports several categories of metrics, each designed to measure different aspects of customer engagement, conversion, and machine learning model performance.

1.1 Engagement Metrics

Engagement metrics measure user interaction with actions, products or services:

  • Click-Through Rate (CTR): Measures the ratio of users who click on content to the total number of users who view it.

    • Formula:  $CTR = \frac{Positives}{(Positives + Negatives)}$

    • Where "Positives" typically are interactions with outcome "Clicked" and "Negatives" usually are interactions with outcomes "Impression" or "Pending" (without positive response). Those types are configurable and may be different from customer to customer.

  • Lift: Measures the increase in a desired outcome in the NBA group compared to a random-model control group.

    • Formula: percentage difference in CTR between the NBA group and the control group (see the sketch after this list)
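
As a worked sketch of the two formulas above: CTR is positives over the total, and lift is taken here as the relative CTR difference between the NBA group and the random control group. The relative-difference form is an assumption; the exact lift definition is configurable and may differ per deployment.

```python
# Sketch only: assumes lift = relative CTR difference between groups.

def ctr(positives: int, negatives: int) -> float:
    """CTR = Positives / (Positives + Negatives)."""
    return positives / (positives + negatives)

def lift(ctr_nba: float, ctr_control: float) -> float:
    """Relative uplift of the NBA group's CTR over the control group's CTR."""
    return (ctr_nba - ctr_control) / ctr_control

print(lift(ctr(60, 940), ctr(40, 960)))  # 0.06 vs 0.04 -> 0.5, i.e. a 50% uplift
```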

1.2 Conversion Metrics

Conversion metrics measure the effectiveness of converting user interactions into desired outcomes:

  • Conversion Rate: The percentage of users who take a desired action, such as making a purchase or signing up for a service.

    • Formula: $ConversionRate = \frac{Positives}{(Positives + Negatives)}$

    • Where "Positives" are interactions with outcome "Conversion" and "Negatives" are interactions with outcome "NoConversion" (in case of external atribution model) or "Impression" without positive response (usually when Pega's conversion modeling framework applied). Those types are configurable and may be different from customer to customer.

  • Revenue: Aggregated revenue data from conversion events.

    • Formula: $\sum (Revenue)$ from conversion events

1.3 Customer Lifetime Value (CLV) Metrics

These metrics analyze customer value over time (a worked sketch in code follows the list):

  • Frequency: Number of repeat purchases a customer has made (total purchases minus one)

  • Recency: Time period between a customer's first and last purchase (0 for single-purchase customers)

  • Monetary Value: Average value of a customer's repeat purchases (0 for single-purchase customers)

  • Tenure: Duration between a customer's first purchase and the end of the study period

  • Lifetime Value Segment: Calculated customer lifetime segment name
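
A minimal sketch of how these quantities can be derived with Polars, assuming a non-contractual model and the illustrative column names SubjectID, PurchasedDate and OneTimeCost (see the parameter table further down). The application's actual aggregation code may differ.

```python
import polars as pl
from datetime import date

# Hypothetical one-time purchase data; values are made up for illustration.
tx = pl.DataFrame({
    "SubjectID": ["A", "A", "A", "B"],
    "PurchasedDate": [date(2024, 1, 5), date(2024, 2, 10), date(2024, 3, 1), date(2024, 1, 20)],
    "OneTimeCost": [100.0, 40.0, 60.0, 25.0],
})

study_end = date(2024, 4, 1)  # assumed end of the study period

rfm = (
    tx.sort("PurchasedDate")
    .group_by("SubjectID")
    .agg(
        n=pl.len(),
        first=pl.col("PurchasedDate").first(),
        last=pl.col("PurchasedDate").last(),
        total=pl.col("OneTimeCost").sum(),
        first_amount=pl.col("OneTimeCost").first(),
    )
    .with_columns(
        frequency=pl.col("n") - 1,                                   # repeat purchases
        recency=(pl.col("last") - pl.col("first")).dt.total_days(),  # 0 for one-time buyers
        tenure=(pl.lit(study_end) - pl.col("first")).dt.total_days(),  # first purchase -> study end
        monetary_value=pl.when(pl.col("n") > 1)                      # avg value of repeat purchases
        .then((pl.col("total") - pl.col("first_amount")) / (pl.col("n") - 1))
        .otherwise(0.0),
    )
)
```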

🔄 Overview of the Process

This system processes customer transaction data to calculate Customer Lifetime Value (CLV) metrics. The way the metrics are calculated depends on whether the business model is contractual (like a subscription service) or non-contractual (like a typical online store where customers come and go without formal agreements).

The key inputs used to calculate these metrics include:

  • One-time revenue from purchases

  • Recurring billing amounts

  • The number of recurring billing periods

  • The date of each transaction

Metric Breakdown

1. Frequency

This measures how often a customer makes a purchase or engages in a transaction. It’s counted by looking at the number of purchases or distinct purchase IDs associated with each customer. The idea is to get a sense of how active each customer has been.

2. Monetary Value

This represents the total revenue brought in by a customer.

  • In a non-contractual model, it's simply the sum of one-time purchase amounts. Each purchase adds to the total value for that customer.

  • In a contractual model, it includes both one-time purchase amounts and recurring charges. The recurring revenue is calculated by multiplying the recurring cost per period by the number of periods the customer has been charged. This gives a more complete picture of a customer’s value over time.

3. Recency

Recency is the period between a customer's first and last purchase (per the definition above). This function doesn't compute the number of days directly; it keeps track of the earliest and latest purchase dates for each customer, from which recency is derived.

4. Tenure

Tenure measures how long a customer has been in the observation window — from their first recorded purchase to the end of the study period. The function identifies the earliest and most recent purchase dates for each customer, so that both tenure and recency can be derived.

Contractual vs. Non-Contractual Models

Non-Contractual:

  • Only one-time purchases are considered.

  • Useful for retail, e-commerce, or other “come and go” models.

Contractual:

  • Includes both one-time and recurring revenue.

  • Suitable for subscription-based services where ongoing payments are expected.
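
A sketch of how this model switch could look in code, assuming the parameter names from the configuration table below; the branch is selected by [metrics.clv.model].

```python
import polars as pl

def revenue_expr(model: str) -> pl.Expr:
    """Illustrative only: per-transaction revenue under either CLV model."""
    one_time = pl.col("OneTimeCost")
    if model == "contractual":
        # one-time revenue plus cost-per-period times number of billed periods
        return one_time + pl.col("RecurringCost") * pl.col("RecurringPeriod")
    return one_time  # non-contractual: one-time purchases only

# Usage sketch: tx.with_columns(revenue_expr("contractual").alias("Revenue"))
```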

📦 Parameters' Role (defined in metrics configuration)

| Parameter | Description | Use |
| --- | --- | --- |
| OneTimeCost | Revenue from one-time purchases | Used for monetary_value calculation |
| RecurringPeriod | Number of periods billed (e.g. months) | Multiplied by RecurringCost in the contractual model |
| RecurringCost | Cost per billing period | Used in contractual CLV to estimate ongoing value |
| [metrics.clv.model] | 'contractual' or 'non-contractual' | Determines whether recurring costs are considered |
| PurchasedDate | When the purchase occurred | Used for recency and tenure metrics |
| SubjectID | Unique customer identifier | Used to group transactions for aggregation |

1.4 Machine Learning and Recommender System Scores

These metrics evaluate the performance of machine learning models:

  • Area Under the ROC Curve (AUC): Measures the model's ability to distinguish between positive and negative classes.

    • Formula: Calculated using either: 

      • T-digest approach: Uses percentiles to compute TPR/FPR and derive ROC AUC

      • Weighted average approach: Calculates ROC AUC for smallest groups and aggregates as weighted average

  • Average Precision Score: Summarizes the precision-recall curve into a single metric.

    • Formula: Calculated using the same two approaches as ROC AUC

  • Personalization: Measures how tailored recommendations are to individual users.

    • Formula: Based on cosine similarity between users' recommendation sets (see the sketch after this list)

    • A high score indicates good personalization (users receive different recommendations)

    • A low score indicates poor personalization (users receive similar recommendations)

  • Novelty: Measures how new or unexpected the recommended items are to users.

    • Formula: Based on information theory principles (see the sketch after this list)

    • Calculated as the self-information of recommended items divided by the product of user count and recommendation list length
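
One common way to compute both scores, matching the descriptions above: binary user-item vectors and pairwise cosine similarity for personalization, and $-\log_2$ of item popularity for self-information. This is a sketch of the idea, not necessarily the dashboard's exact implementation.

```python
import math
import numpy as np

def personalization(rec_matrix: np.ndarray) -> float:
    """1 minus the mean pairwise cosine similarity between users'
    binary recommendation vectors (rows = users, columns = items).
    High -> users receive different recommendations."""
    norms = np.linalg.norm(rec_matrix, axis=1, keepdims=True)
    unit = rec_matrix / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    upper = np.triu_indices(sim.shape[0], k=1)  # each user pair once, no self-pairs
    return 1.0 - sim[upper].mean()

def novelty(recs: list[list[str]], popularity: dict[str, float]) -> float:
    """Total self-information, -log2(popularity), of all recommended items,
    divided by (user count x recommendation list length)."""
    self_info = sum(-math.log2(popularity[item]) for user in recs for item in user)
    return self_info / (len(recs) * len(recs[0]))
```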

1.5 Descriptive Metrics

These metrics provide statistical descriptions of the dataset:

  • Count: Number of non-null elements in a column

  • Sum: Sum of values in a column

  • Mean: Average value in a column

  • Median: 50th percentile value using t-digest algorithm

  • p75: 75th percentile using t-digest algorithm

  • p90: 90th percentile using t-digest algorithm

  • p95: 95th percentile using t-digest algorithm

  • Std: Standard deviation (Delta Degrees of Freedom = 1)

  • Var: Variance (Delta Degrees of Freedom = 1)

  • Skew: Bowley's Skewness (Quartile Coefficient of Skewness) 

    • Formula: $Skew = \frac{(Q_3 + Q_1 - 2Q_2)}{(Q_3 - Q_1)}$

    • Where Q1, Q2, Q3 are the 25th, 50th, and 75th percentiles respectively
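
As a quick worked example of the formula:

```python
def bowley_skewness(q1: float, q2: float, q3: float) -> float:
    """Quartile coefficient of skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

# Quartiles 10, 20, 40: (40 + 10 - 2*20) / (40 - 10) = 10 / 30 ≈ 0.33 (right-skewed)
print(bowley_skewness(10, 20, 40))
```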

1.6 Experiment Metrics

These metrics are used for A/B testing analysis (a SciPy-based sketch follows the list):

  • z_score: Z-test (normal approximation) statistics

  • z_p_val: Z-test p-value

  • g_stat: G-test statistics

  • g_p_val: G-test p-value

  • chi2_stat: Chi-square test of homogeneity statistics

  • chi2_p_val: Chi-square test p-value

  • odds_ratio_stat: Sample odds-ratio estimate from the 2×2 contingency table

    • Formula: $\frac{(table[0,0] * table[1,1])}{(table[0,1] * table[1,0])}$

  • odds_ratio_ci_low/high: Confidence interval of the odds ratio (95% confidence level)
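
A sketch of several of these tests using SciPy. The table values are made up, and the calls shown (scipy.stats.chi2_contingency, scipy.stats.contingency.odds_ratio) are standard SciPy equivalents rather than necessarily the application's own functions.

```python
import numpy as np
from scipy.stats import chi2_contingency
from scipy.stats.contingency import odds_ratio

# Hypothetical 2x2 table: rows = test/control, columns = positive/negative outcomes.
table = np.array([[120, 880],
                  [ 90, 910]])

chi2_stat, chi2_p_val, _, _ = chi2_contingency(table, correction=False)
g_stat, g_p_val, _, _ = chi2_contingency(table, lambda_="log-likelihood")  # G-test

res = odds_ratio(table, kind="sample")   # sample estimate: (120 * 910) / (880 * 90)
ci = res.confidence_interval(confidence_level=0.95)
print(chi2_stat, chi2_p_val, g_stat, g_p_val, res.statistic, ci.low, ci.high)
```

The z-test on proportions can be run analogously, e.g. with statsmodels' proportions_ztest.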

2. Score Calculation Methods

2.1 Engagement Score Calculation

The engagement metrics are calculated using the following process (sketched in code after the list):

  1. Filter interaction history data to include only relevant outcomes (Clicked, Impression, Pending)

  2. Create a binary outcome column (1 for positive outcomes, 0 for negative)

  3. Group data by specified dimensions (e.g., Day, Month, Channel, CustomerType)

  4. Calculate aggregates: 

    • Count: Total number of interactions

    • Positives: Sum of binary outcomes (clicked events)

  5. Calculate Negatives as Count - Positives

  6. Calculate CTR as Positives / Count
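
The steps above could be expressed in Polars roughly as follows; the interaction-history frame and its column names (Outcome, Day, Channel) are illustrative assumptions.

```python
import polars as pl

# Illustrative interaction-history rows; outcome labels are configurable.
ih = pl.DataFrame({
    "Day": ["2024-06-01", "2024-06-01", "2024-06-01", "2024-06-02"],
    "Channel": ["Web", "Web", "Mobile", "Web"],
    "Outcome": ["Clicked", "Impression", "Pending", "Clicked"],
})

POSITIVE, NEGATIVE = ["Clicked"], ["Impression", "Pending"]

ctr = (
    ih.filter(pl.col("Outcome").is_in(POSITIVE + NEGATIVE))                 # 1. keep relevant outcomes
    .with_columns(pl.col("Outcome").is_in(POSITIVE).cast(pl.UInt32).alias("Positive"))  # 2. binary outcome
    .group_by(["Day", "Channel"])                                           # 3. group by dimensions
    .agg(Count=pl.len(), Positives=pl.col("Positive").sum())                # 4. aggregates
    .with_columns(
        Negatives=pl.col("Count") - pl.col("Positives"),                    # 5. Negatives = Count - Positives
        CTR=pl.col("Positives") / pl.col("Count"),                          # 6. CTR = Positives / Count
    )
)
```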

2.2 Conversion Score Calculation

The conversion metrics are calculated using a similar process (sketched in code after the list):

  1. Filter interaction history data to include only relevant outcomes (Conversion, NoConversion)

  2. Create a binary outcome column (1 for conversions, 0 for non-conversions)

  3. Group data by specified dimensions (e.g., Day, Month, Channel, ModelType)

  4. Calculate aggregates: 

    • Count: Total number of interactions

    • Positives: Sum of binary outcomes (conversion events)

    • Revenue: Sum of revenue values

  5. Calculate Negatives as Count - Positives

  6. Calculate Conversion Rate as Positives / Count
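
The conversion pipeline mirrors the engagement sketch above; only the outcome labels and the extra Revenue aggregate differ (again with illustrative data and column names).

```python
import polars as pl

ih = pl.DataFrame({
    "Day": ["2024-06-01", "2024-06-01", "2024-06-02"],
    "Channel": ["Web", "Web", "Web"],
    "Outcome": ["Conversion", "NoConversion", "Conversion"],
    "Revenue": [120.0, 0.0, 80.0],
})
POSITIVE, NEGATIVE = ["Conversion"], ["NoConversion"]

conversion = (
    ih.filter(pl.col("Outcome").is_in(POSITIVE + NEGATIVE))
    .with_columns(pl.col("Outcome").is_in(POSITIVE).cast(pl.UInt32).alias("Positive"))
    .group_by(["Day", "Channel"])
    .agg(Count=pl.len(), Positives=pl.col("Positive").sum(), Revenue=pl.col("Revenue").sum())
    .with_columns(
        Negatives=pl.col("Count") - pl.col("Positives"),
        ConversionRate=pl.col("Positives") / pl.col("Count"),
    )
)
```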

2.3 Machine Learning Score Calculation

ML metrics are calculated using two possible approaches, selected by the use_t_digest flag (the weighted-average approach is sketched after the list):

  1. T-digest approach (when use_t_digest = true):

    • Uses t-digest data structure to evaluate percentiles

    • Computes TPR/FPR from percentiles

    • Derives ROC AUC and average precision from these values

  2. Weighted average approach (when use_t_digest = false):

    • Calculates ROC AUC and average precision for smallest possible groups

    • Aggregates as weighted average
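
A sketch of the weighted-average approach, using scikit-learn's roc_auc_score for the per-group scores; the column names (Outcome, Propensity) are assumptions.

```python
import polars as pl
from sklearn.metrics import roc_auc_score

def weighted_roc_auc(df: pl.DataFrame, group_cols: list[str]) -> float:
    """Score each smallest group separately, then weight by group size."""
    weighted, total = 0.0, 0
    for _, group in df.group_by(group_cols):
        y = group["Outcome"].to_numpy()       # binary labels
        p = group["Propensity"].to_numpy()    # model scores
        if y.min() == y.max():                # AUC undefined for one-class groups
            continue
        weighted += roc_auc_score(y, p) * len(group)
        total += len(group)
    return weighted / total if total else float("nan")
```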

For personalization and novelty scores:

  • Personalization is calculated using cosine similarity between user recommendation sets

  • Novelty is calculated using information theory principles

3. Data Aggregation Process

3.1 Initial Data Aggregation

The application processes data from two main sources:

  1. Interaction History (IH) Data:

    • Loaded from Parquet files or other supported formats

    • Filtered based on global filters defined in the configuration

    • Transformed with additional columns as needed

    • Grouped by specified dimensions for metric calculation

  2. Product Holdings Data:

    • Loaded from JSON files or other supported formats

    • Processed to extract purchase dates and other relevant information

    • Used for CLV metric calculations

3.2 Aggregation Process

The data aggregation process follows these steps (sketched in code after the list):

  1. Data Loading:

    • Files are loaded based on patterns specified in the configuration

    • Data is filtered using global filters to improve performance

    • Default values are applied for missing fields

  2. Data Transformation:

    • Additional columns are created as specified in the configuration

    • Date fields are parsed and formatted consistently

    • Binary outcome columns are created for metric calculations

  3. Grouping and Aggregation:

    • Data is grouped by dimensions specified in the metric configuration

    • Aggregation functions are applied to calculate counts, sums, and other statistics

    • Results are stored in memory or cached to DuckDB for efficient retrieval

  4. Metric Calculation:

    • Final metrics are calculated from the aggregated data when the report is displayed.

    • Data is grouped once more (e.g., from Day level to Month level), and the exact values for CTR, conversion rate, revenue, ROC AUC, variance, percentiles, etc. are calculated at that point.

    • Results are formatted for display in the dashboard
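
A condensed sketch of this load -> filter -> aggregate -> cache flow with Polars and DuckDB; the file pattern, filter, and table name are illustrative, not the application's actual configuration.

```python
import duckdb
import polars as pl

daily = (
    pl.scan_parquet("data/ih/*.parquet")                    # load by file pattern
    .filter(pl.col("Channel").is_in(["Web", "Mobile"]))     # global filter, pushed down to the scan
    .group_by(["Day", "Channel"])
    .agg(Count=pl.len(), Positives=pl.col("Positive").sum())
    .collect()
)

con = duckdb.connect("cache.duckdb")
# DuckDB can scan the in-memory Polars frame directly by its variable name.
con.execute("CREATE OR REPLACE TABLE ih_daily AS SELECT * FROM daily")

# Later, the dashboard can re-aggregate from Day to Month level in SQL:
monthly = con.execute("""
    SELECT strftime(Day, '%Y-%m') AS Month, Channel,
           SUM(Positives) / SUM(Count) AS CTR
    FROM ih_daily GROUP BY ALL
""").pl()
```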

3.3 T-digest Usage

For certain metrics, the application uses t-digest data structures to efficiently calculate percentiles and other statistics from large datasets. This approach allows for:

  • Memory-efficient storage of distribution information

  • Accurate approximation of percentiles

  • Efficient merging of digest structures for hierarchical aggregation

The t-digest approach is particularly useful for calculating ROC AUC and average precision scores from large datasets, as it avoids the need to store all individual predictions in memory.
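
A sketch of that hierarchical merging using the tdigest PyPI package (an assumption; the application may use a different t-digest implementation):

```python
import numpy as np
from tdigest import TDigest  # assumed library choice, illustrative only

rng = np.random.default_rng(42)

# Build one digest per day from raw values, without keeping the values themselves.
daily_digests = []
for day in range(7):
    digest = TDigest()
    digest.batch_update(rng.normal(loc=100, scale=15, size=10_000))
    daily_digests.append(digest)

# Merge the daily digests into a weekly one -- the hierarchical rollup
# that makes Day -> Month re-aggregation cheap.
weekly = daily_digests[0]
for digest in daily_digests[1:]:
    weekly = weekly + digest

print(weekly.percentile(50), weekly.percentile(95))  # approximate p50 / p95
```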