On Esteemer

Objective:
Outline a modularized functional model of HOW Esteemer will rank acceptable candidates.

Why:
Currently only gap sizes are being used to rank candidates, and adding additional moderators has proven difficult with the current codebase.

Solution:
A modularized, restructured technical model of Esteemer that focuses on separation of processes and unit testability should therefore benefit the team as we continue further development.

The Candidate Dataframe Model

Why use the candidate_df dataframe model?

Advantages: By collating and leveraging one dataframe throughout Esteemer (as opposed to an RDF graph), we can accomplish several goals:

  1. Increased Modularity
  • With this strategy, the pre-processing steps that Esteemer relies on can be separated out rather than having to live inside esteemer.py
  • This brings all the benefits of modular code, such as easier unit testing and greater concision and clarity for developers
  2. Unit Testability
  • Using a dataframe allows human-generated test inputs to esteemer.py's constituent elements to be written, and more readily understood, as tables rather than as nodes in a graph
  • Using one persistent dataframe allows the sub-processes that make up Esteemer to be unit tested by comparing dataframe input/output expectations (see the sketch after this list)
  • Regression testing of Esteemer's functional units is easily done, with or without automation
  3. Elimination of dependence on (S,P,O) notation
  4. Leveraging pandas to its full extent
  • Pandas is a powerful tool for dataframe calculations and manipulations, with syntax that is easier for developers to understand than graph operations
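For example, a sub-process could be tested end-to-end with nothing but hand-written tables. The sketch below assumes a hypothetical `add_history_component(candidate_df, history_df, current_month)` sub-process and placeholder expected values; it only illustrates the dataframe-in/dataframe-out testing pattern.

```python
# Sketch of a dataframe-based unit test. `add_history_component` and its import
# path are hypothetical; the expected values are placeholders for whatever the
# team's business rules dictate.
import pandas as pd
import pandas.testing as pdt

from esteemer import add_history_component  # hypothetical import path

def test_add_history_component():
    candidate_df = pd.DataFrame({
        "Candidate": ["http://example.org/candidate/1"],
        "Message Template": ["no_longer_top_performer"],
        "Measure": ["PONV05"],
    })
    history_df = pd.DataFrame({
        "Month": [pd.Timestamp("2023-11-01")],
        "Template Name": ["Achieved Top 10% Peer Benchmark"],
        "Measure": ["PONV05"],
    })

    result = add_history_component(
        candidate_df, history_df, current_month=pd.Timestamp("2023-12-01")
    )

    expected = candidate_df.assign(**{
        "Measure Recency": [1],          # placeholder expectation
        "Message Recency": [0],          # placeholder expectation
        "Message Received Count": [0],   # placeholder expectation
    })
    pdt.assert_frame_equal(result, expected)
```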

Disadvantages:

  • A rewrite of the extant codebase is likely, especially for functions like esteemer.score() and parts of get_selected_message() (which prepares the selected message for pictoralist use)

The candidate_df Dataframe

The generic candidate_df dataframe has the following shape:
Generic format

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count | Template Preference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 | Gap 1 | Trend 1 | Msr Rec 1 | Msg Rec 1 | Msg Rec # 1 | Pref 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n | Gap n | Trend n | Msr Rec n | Msg Rec n | Msg Rec # n | Pref n |
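As a point of reference, the frame could be declared up front with exactly these columns (a minimal sketch; the column names simply mirror the table above):

```python
import pandas as pd

CANDIDATE_COLUMNS = [
    "Candidate", "Message Template", "Measure", "Comparator",
    "Gap Size", "Trend", "Measure Recency", "Message Recency",
    "Message Received Count", "Template Preference",
]

# Starts empty; Steps 1-4 below each fill in their portion of the columns,
# one row per acceptable candidate.
candidate_df = pd.DataFrame(columns=CANDIDATE_COLUMNS)
```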

Below is a synthetic example candidate_df dataframe that corresponds to the vignette expectations for persona 'Alice' (for the measure PONV05 only):
- Feedback generation month is 2023-12-01
- History contains the following:

    "History": {
      "2023-11-01": {
        "message_template_name":"Achieved Top 10% Peer Benchmark",
        "message_generated_datetime": "2023-11-01T1850.426262",
        "measure":"PONV05",
        "message_instance_id":"persona-test-alice-november"
      },
    },
| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count | Template Preference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [Some IRI] | no_longer_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 | 10 |
| [Some IRI] | not_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 | -15 |
| [Some IRI] | performance_dropped | PONV05 | None | N/A | -6 | 0 | 0 | 0 | -22 |
| [Some IRI] | getting_worse | PONV05 | None | N/A | -6 | 0 | 0 | 0 | -22 |

Getting to the Candidate Dataframe

Step 1: Candidate Processing

Leveraging mostly extant code with minimal adaptations, we can use the RDF graph to create the initial portion of candidate_df, containing all acceptable candidates along with their comparators, message template names, and measures via URI references. After this initial processing, candidate_df will have the expected shape:

Generic format

| Candidate | Message Template | Measure | Comparator |
| --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 |
| ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n |

Example: Alice PONV05

| Candidate | Message Template | Measure | Comparator |
| --- | --- | --- | --- |
| [Some IRI] | no_longer_top_performer | PONV05 | 90th percentile benchmark |
| [Some IRI] | not_top_performer | PONV05 | 90th percentile benchmark |
| [Some IRI] | performance_dropped | PONV05 | None |
| [Some IRI] | getting_worse | PONV05 | None |
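A minimal sketch of this step using rdflib is below. The namespace and property IRIs are placeholders, not the actual terms used in the spek graph; the point is only the shape of the loop that turns candidate URIs into dataframe rows.

```python
# Sketch only: SLOWMO and its properties are placeholder IRIs, not the real
# ontology terms used in the spek graph.
import pandas as pd
from rdflib import Graph, Namespace, RDF

SLOWMO = Namespace("http://example.org/slowmo#")  # placeholder namespace

def build_initial_candidate_df(graph: Graph) -> pd.DataFrame:
    rows = []
    for candidate in graph.subjects(RDF.type, SLOWMO.Candidate):
        rows.append({
            "Candidate": str(candidate),
            "Message Template": str(graph.value(candidate, SLOWMO.messageTemplate)),
            "Measure": str(graph.value(candidate, SLOWMO.measure)),
            "Comparator": str(graph.value(candidate, SLOWMO.comparator)),
        })
    return pd.DataFrame(
        rows, columns=["Candidate", "Message Template", "Measure", "Comparator"]
    )
```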

Step 2: Data Component Processing

Here again we can rework the current code that accesses data in the RDF graph to build out the data component of the dataframe, for instance using bitstomach annotations from process_spek.

After data processing has been done, ALL acceptable candidates will have an entry in candidate_df, and the business rules on selection criteria can be run across the final dataframe based on the message template name (which correlates to rules the team may make about particular causal pathways). More on this in step 5.

Generic format

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend |
| --- | --- | --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 | Gap 1 | Trend 1 |
| ... | ... | ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n | Gap n | Trend n |

Example: Alice PONV05

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend |
| --- | --- | --- | --- | --- | --- |
| [Some IRI] | no_longer_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 |
| [Some IRI] | not_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 |
| [Some IRI] | performance_dropped | PONV05 | None | N/A | -6 |
| [Some IRI] | getting_worse | PONV05 | None | N/A | -6 |
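One way this could look, assuming the gap and trend values have already been pulled out of the bitstomach annotations into two small frames (the frame names and granularity are assumptions, not the current code's structure):

```python
# Sketch: gap_df holds one "Gap Size" per (Measure, Comparator) pair, and
# trend_df holds one "Trend" per Measure, mirroring the example above where
# a trend exists even when a candidate has no comparator.
import pandas as pd

def add_data_component(candidate_df: pd.DataFrame,
                       gap_df: pd.DataFrame,
                       trend_df: pd.DataFrame) -> pd.DataFrame:
    out = candidate_df.merge(gap_df, on=["Measure", "Comparator"], how="left")
    out = out.merge(trend_df, on="Measure", how="left")
    return out
```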

Step 3: History Component Processing

Here we can make use of most of the extant code for process_history developed to resolve issues 83 and 195, alongside the extant candidate_df above.

Output from process_history is a dataframe, which looks like:
history_df

| Month | Template Name | Message datetime | Measure | Message Instance ID |
| --- | --- | --- | --- | --- |
| datetime object | 'str' | 'str' | 'str' | 'str' |

For Alice, this would look like:

| Month | Template Name | Message datetime | Measure | Message Instance ID |
| --- | --- | --- | --- | --- |
| 2023-11-01 | 'Achieved Top 10% Peer Benchmark' | '2023-11-01T1850.426262' | 'PONV05' | 'persona-test-alice-november' |

Using pandas together with the datetime of the month for which feedback is being generated (current_month), candidate_df, and history_df, we can generate the measure recency, message recency, and past message count for each candidate in candidate_df. This functionality is partially developed in the framework of process_history_component; a sketch follows.
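The exact recency definitions (per measure vs. per template, and how "no prior message" should be encoded) are still a team decision, so the logic below is illustrative only.

```python
# Illustrative only: recency is counted here as whole months since the most
# recent matching history entry, and the received count is per template.
import pandas as pd

def months_between(later: pd.Timestamp, earlier: pd.Timestamp) -> int:
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def add_history_component(candidate_df, history_df, current_month):
    measure_recency, message_recency, received_count = [], [], []
    for _, cand in candidate_df.iterrows():
        same_measure = history_df[history_df["Measure"] == cand["Measure"]]
        same_template = history_df[history_df["Template Name"] == cand["Message Template"]]
        measure_recency.append(
            months_between(current_month, same_measure["Month"].max())
            if not same_measure.empty else 0
        )
        message_recency.append(
            months_between(current_month, same_template["Month"].max())
            if not same_template.empty else 0
        )
        received_count.append(len(same_template))
    return candidate_df.assign(**{
        "Measure Recency": measure_recency,
        "Message Recency": message_recency,
        "Message Received Count": received_count,
    })
```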

After Step 3, candidate_df should have the shape below:

Generic format

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 | Gap 1 | Trend 1 | Msr Rec 1 | Msg Rec 1 | Msg Rec # 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n | Gap n | Trend n | Msr Rec n | Msg Rec n | Msg Rec # n |

Example: Alice PONV05

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [Some IRI] | no_longer_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 |
| [Some IRI] | not_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 |
| [Some IRI] | performance_dropped | PONV05 | None | N/A | -6 | 0 | 0 | 0 |
| [Some IRI] | getting_worse | PONV05 | None | N/A | -6 | 0 | 0 | 0 |

Step 4: Preference Processing

As this functionality has not yet been built, below is a pseudo-technical breakdown of how to accomplish preference processing. Ideally, the recipient's preference for each kind of feedback is added to candidate_df.

With this strategy, the processing logic used to turn causal-pathway-specific preferences into template-specific preferences can be modularized and reused on the MPM, turning causal-pathway-specific weights into template-specific weights. This makes it easier to apply the MPM across candidate_df during the ranking step.

Causal-Pathway to Template Specificity Function

The objective is to take values that are determined by the team based on causal-pathways and their motivational models, and transform them into values that apply to their corresponding child message templates.

We can leverage the JSON-formatted dict below to match causal pathways with their child templates, or use another kind of formatting/strategy:

pathway_template_relations = {
    "goal_approach": {
        "children": ["Approach Goal"]
    },
    "goal_gain": {
        "children": ["Reached Goal"]
    },
    "goal_loss": {
        "children": ["Drop Below Goal"]
    },
    "improving": {
        "children": ["Performance Improving", "Congrats Improved Performance"]
    },
    "social_approach": {
        "children": ["Approach Top 10 Peer Benchmark", "Approach Top 25 Peer Benchmark", "Approach Peer Average"]
    },
    "social_better": {
        "children": ["Top Performer", "In Top 25%"]
    },
    "social_gain": {
        "children": ["Achieved top 10% peer benchmark", "Achieved top 25% peer benchmark", "Achieved peer average"]
    },
    "social_loss": {
        "children": ["No Longer Top Performer", "Drop Below Peer Average"]
    },
    "social_worse": {
        "children": ["Not Top Performer", "Opportunity To Improve Top 10"]
    },
    "worsening": {
        "children": ["Getting Worse", "Performance Dropped"]
    }
}
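As one possible shape for that function, the sketch below walks the relations dict and expands any pathway-keyed values (individual preferences or MPM weights) into template-keyed rows. The function name and the example preference numbers are illustrative, and template names would still need to be normalized (e.g. to snake_case) before merging onto candidate_df.

```python
# Sketch: expand pathway-level values (preferences or MPM weights) to the
# pathway's child message templates. Names and numbers are illustrative.
import pandas as pd

def expand_to_templates(pathway_values: dict, relations: dict) -> pd.DataFrame:
    rows = []
    for pathway, value in pathway_values.items():
        for template in relations.get(pathway, {}).get("children", []):
            record = {"Causal Pathway": pathway, "Message Template": template}
            record.update(value if isinstance(value, dict) else {"Value": value})
            rows.append(record)
    return pd.DataFrame(rows)

# e.g. pathway-level preferences for one recipient -> template-level rows
preferences_df = expand_to_templates(
    {"social_loss": 10, "social_worse": -15, "worsening": -22},
    pathway_template_relations,
).rename(columns={"Value": "Template Preference"})
```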

However we accomplish it, the result should be an expanded MPM broken down by message template, and preferences broken down by template. This can be done programmatically to reduce the amount of manual maintenance, or by hand with the results used as raw inputs to the preference processing function. Regardless of origin, these dataframes can then add the preference values for an individual to candidate_df, resulting in:

Generic format

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count | Template Preference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 | Gap 1 | Trend 1 | Msr Rec 1 | Msg Rec 1 | Msg Rec # 1 | Pref 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n | Gap n | Trend n | Msr Rec n | Msg Rec n | Msg Rec # n | Pref n |

Example: Alice PONV05

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count | Template Preference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [Some IRI] | no_longer_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 | 10 |
| [Some IRI] | not_top_performer | PONV05 | 90th percentile benchmark | 5 | -6 | 0 | 0 | 0 | -15 |
| [Some IRI] | performance_dropped | PONV05 | None | N/A | -6 | 0 | 0 | 0 | -22 |
| [Some IRI] | getting_worse | PONV05 | None | N/A | -6 | 0 | 0 | 0 | -22 |

Step 5: Ranking

Finally, the big synthesis step. Let's review the ranking algorithm's mathematical backbone:

  • Performance trend slope, $\Delta_{\text{performance}}$
  • Performance gap size, $G_{\text{performance}}$
  • Achievement or loss recency, $t_{\text{event}}$
  • Feedback history, $t_{\text{measure}}$, $t_{\text{message}}$, and $N_{\text{received}}$
  • Individual feedback preferences, $F_{\text{pref}}$

The overall algorithm can be represented as:

$$F_{\text{pref}} \Biggl[ C_{\text{data}} \biggl( \Bigl( X_s | \Delta_{\text{performance}} | \Bigr) + \Bigl( X_{gs} | G_{\text{performance}} | \Bigr) \biggr) \ + \ C_{\text{history}} \Bigl( \bigl(X_e \cdot t_{\text{event}}\bigr) + \bigl(X_{msr} \cdot t_{\text{measure}}\bigr)+ \bigl(X_{msg} \cdot t_{\text{message}}\bigr) + \bigl(X_N \cdot N_{\text{received}}\bigr) \Bigr) \Biggr]$$
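For a feel for the numbers, plugging Alice's no_longer_top_performer row into the formula with purely illustrative coefficients ($C_{\text{data}} = C_{\text{history}} = 1$, $X_s = 0.8$, $X_{gs} = 0.5$) and with every history moderator at zero gives:

$$10 \Bigl[ 1 \bigl( 0.8 \cdot |-6| + 0.5 \cdot |5| \bigr) + 1 \cdot 0 \Bigr] = 10 \, (4.8 + 2.5) = 73$$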

There is an important design discussion to have looking at the above and the below: does achievement/loss recency really factor into the ranking algorithm?

  • Are achievement/loss events moderators, or are they preconditions to the achievement and loss causal pathways?
    • I would posit that they are the latter, and that doing calculations about the time since the last A/L event is not insightful. They are more of a binary, either present or absent for a certain month.
    • When an A/L event has happened in the current month, we absolutely care, but do historic A/L events really act as moderators to the rank of a feedback intervention?
  • With the correct weight setup for message/measure recency, we will already know if an individual recently had feedback about their past achievement or loss, and the ranks of repeated messages about such will inherently be ranked lower than other feedback messages.
    • Is there a case where a monthly feedback delivery should be about an A/L event that did not happen in the current month? I don't believe so.

That said, we can now talk about a proposed mechanism for implementing the ranking algorithm.

Steps 1-4 have set us up with some crucial ingredients, which contain all of the variables needed to do some esteeming. They are:

candidate_df

| Candidate | Message Template | Measure | Comparator | Gap Size | Trend | Measure Recency | Message Recency | Message Received Count | Template Preference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Candidate 1 IRI | Candidate Message Template 1 | Measure 1 | Comparator 1 | Gap 1 | Trend 1 | Msr Rec 1 | Msg Rec 1 | Msg Rec # 1 | Pref 1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Candidate n IRI | Candidate Message Template n | Measure n | Comparator n | Gap n | Trend n | Msr Rec n | Msg Rec n | Msg Rec # n | Pref n |

mpm_templates_df

| Message Template | X_gap | X_trend | X_achievement | X_loss | X_msr_rec | X_msg_rec | X_msg_count |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Approach Goal | 0.5 | 0.8 | -0.5 | 0 | -0.1 | -0.1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| Performance Dropped | 0.5 | 0 | 0 | 0 | -0.1 | -0.1 | -0.5 |
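To make the usage example below runnable, mpm_templates_df could be stood up directly in pandas (the two rows shown are just the illustrative values from the table above):

```python
import pandas as pd

mpm_templates_df = pd.DataFrame([
    {"Message Template": "Approach Goal", "X_gap": 0.5, "X_trend": 0.8,
     "X_achievement": -0.5, "X_loss": 0, "X_msr_rec": -0.1,
     "X_msg_rec": -0.1, "X_msg_count": 0},
    {"Message Template": "Performance Dropped", "X_gap": 0.5, "X_trend": 0,
     "X_achievement": 0, "X_loss": 0, "X_msr_rec": -0.1,
     "X_msg_rec": -0.1, "X_msg_count": -0.5},
])
```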

Ranking Code

Combining the two inputs is now straightforward; Esteemer itself, or esteemer.score(), can simply contain the following pseudocode with an appropriate implementation:

## Tentative scoring algorithm implementation
# Likely needs more thought regarding the merged_df format and indexing on templates.
# Note: the C_data / C_history constants and the achievement/loss recency term from
# the formula above are omitted here, pending the design discussion in Step 5.
import pandas as pd

def score_candidates(candidate_df, mpm_templates_df):
    # Merge the per-candidate moderators with the template-level MPM weights
    merged_df = pd.merge(candidate_df, mpm_templates_df, how='left', on='Message Template')

    # Calculate the score using our formula
    rank = (
        merged_df['Template Preference'] * (
            (merged_df['X_gap'] * abs(merged_df['Gap Size']))
            + (merged_df['X_trend'] * abs(merged_df['Trend']))
            + (merged_df['X_msr_rec'] * merged_df['Measure Recency'])
            + (merged_df['X_msg_rec'] * merged_df['Message Recency'])
            + (merged_df['X_msg_count'] * merged_df['Message Received Count'])
        )
    )

    # Update the 'Score' column in candidate_df
    candidate_df['Score'] = rank

# Usage:
score_candidates(candidate_df, mpm_templates_df)
print(candidate_df[['Candidate', 'Score']])
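Once scores are in place, selecting or ordering candidates is a one-liner; assuming higher scores rank higher:

```python
# Full ranking, best first, then the single selected candidate
ranked = candidate_df.sort_values("Score", ascending=False)
selected = ranked.iloc[0]
```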