Getting Caseload & Indicator Data - UN-OCHA/hpc-api GitHub Wiki

Caseloads & Indicators are used throughout plan frameworks in HPC.tools.

Caseloads are metrics relating to populations, such as:
- Total population
- People affected
- People in need
- People targeted by plan
- etc...

Indicators are more general, and can be used to track many other things. For examples of the sort of things that are tracked, visit the Indicator Registry.

They can both be disaggregated in various ways (e.g. by location & demographic), and attached to various points of a plan logframe, namely:

- To the plan itself (as overall metrics)
- To a specific Sector/Cluster of a plan (e.g. WASH, Health, Nutrition etc...)
- To a specific objective/activity (e.g. a Strategic or Specific objective, or a Sector Objective or Activity)

Caveats

The API endpoints presented here are currently the v2 endpoints, which are optimized for data-entry applications like RPM, rather than for consumption by external API users. As a result, they do not currently provide this data in an efficient manner.

This is something that we're in the process of addressing in the design and development of the next version of the HPC API, which you can read more about under the section Upcoming Changes to the API.

This document will be updated once it's possible to retrieve this data using the new API.

If you would like to be notified once this is possible, please see the section Keeping Up-To-Date below.

Our Main API Endpoint

We'll mostly be using a single API endpoint to get the data that we need:

https://api.hpc.tools/v2/public/plan/:planId

This API endpoint does not require any authorization for getting the published plan information.

It requires the ID of the plan we want to get the data for. For details of how to list plans and their IDs, take a look at the guide Getting All Plans for a Given Year.

In addition, we can provide the following query parameters, depending on the data we wish to get:

Query Parameter	Default Value	Details
`content`	`basic`	Set to `basic` or `entities`. When `entities` is included, data relating to clusters/sectors as well as objectives/activities and plan-level attachments will also be included.
`disaggregation`	`true`	Whether or not disaggregated data should be included. When included, this greatly increases the size of the response.
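
If you want to call the endpoint from a script rather than from Postman, the request can be sketched as below. This is a minimal sketch, not an official client: the helper names buildPlanUrl and fetchPlan are hypothetical, and it assumes a runtime with a global fetch (for example a recent Node.js or a browser). Only the base URL and the two query parameters come from this guide.

```javascript
// Base URL as given in this guide. The helper names below are hypothetical.
const BASE_URL = 'https://api.hpc.tools/v2/public/plan/';

/**
 * Build the request URL for a plan, spelling out both query parameters
 * explicitly (the defaults mirror the table above).
 */
function buildPlanUrl(planId, { content = 'basic', disaggregation = true } = {}) {
  const url = new URL(String(planId), BASE_URL);
  url.searchParams.set('content', content);
  url.searchParams.set('disaggregation', String(disaggregation));
  return url.toString();
}

/**
 * Fetch and parse a plan. `fetchImpl` is injectable so the function can be
 * exercised without network access; by default it uses the global fetch.
 */
async function fetchPlan(planId, options = {}, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(buildPlanUrl(planId, options));
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.json();
}

console.log(buildPlanUrl(1032, { content: 'entities', disaggregation: false }));
// https://api.hpc.tools/v2/public/plan/1032?content=entities&disaggregation=false

// Example call (performs a real HTTP request, so it is left commented out):
// fetchPlan(1032, { content: 'entities', disaggregation: false })
//   .then((plan) => console.log(plan.data.id));
```

Passing disaggregation explicitly, even when it matches the default, makes it harder to download the much larger disaggregated response by accident.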

Getting Plan-Level Caseload Data (Excluding Disaggregated)

When getting the caseload information attached directly to the plan, we need to set the content query parameter to entities to get all the necessary data. However, as we're not interested in the disaggregations at this point, we should tell the API to exclude them by setting disaggregation=false.

Taking the Nigeria 2021 plan as an example, the full GET request would be for the URL:

https://api.hpc.tools/v2/public/plan/1032?content=entities&disaggregation=false

The response looks something like this:

{
  "data": {
    "id": 1032,
    "attachments": [
      {
        "id": 20686,
        "type": "caseLoad", // may also be other types like "cost"
        "attachmentVersion": { // This contains the core data for the caseload attachment
          "hasDisaggregatedData": true,
          "value": {
            "description": "2021 Plan Caseload",
            "metrics": {
              "values": {
                "totals": [
                  {
                    "name": {
                      "en": "Population"
                    },
                    "type": "totalPopulation",
                    "value": 13100000
                  },
                  {
                    "name": {
                      "en": "Affected"
                    },
                    "type": "affected",
                    "value": null
                  },
                  {
                    "name": {
                      "en": "In Need"
                    },
                    "type": "inNeed",
                    "value": 8700000
                  },
                  {
                    "name": {
                      "en": "Targeted"
                    },
                    "type": "target",
                    "value": 6400000
                  },
                  // ...
                ]
              },
              // ...
            },
            // ...
          },
          // ...
        },
        // ...
      },
      // ...
    ]
  }
}

There may be many types of attachments associated with a plan, so you need to filter only the caseload attachments.

There may be more than one caseload associated with a plan (for example in 2020, most plans had a COVID and non-COVID caseload). So depending on your needs, you may want to filter further (e.g. on the description or ID of the caseload), to get the figures you're interested in.

The example JavaScript code below extracts and prints out the key caseload figures, and can be run directly in Postman:

const response = pm.response.json();

for (const attachment of response.data.attachments) {
    if (attachment.type === "caseLoad") {
        const d = attachment.attachmentVersion;
        const description = d.value.description;
        const totals = d.value.metrics.values.totals;
        const pop = totals.find(t => t.type === 'totalPopulation');
        const affected = totals.find(t => t.type === 'affected');
        const inNeed = totals.find(t => t.type === 'inNeed');
        const target = totals.find(t => t.type === 'target');

        console.log(`Caseload: ${description}`);
        console.log(`Total Population: ${pop?.value}`);
        console.log(`Affected: ${affected?.value}`);
        console.log(`In Need: ${inNeed?.value}`);
        console.log(`Target: ${target?.value}`);
    }
}

This outputs:

Caseload: 2021 Plan Caseload
Total Population: 13100000
Affected: null
In Need: 8700000
Target: 6400000
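
When a plan has several caseloads, the filtering by ID or description mentioned above can be sketched as follows. The helper names selectCaseloads and totalsByType are hypothetical, and the sample data is a trimmed copy of the response shown earlier plus one made-up "cost" attachment, so treat this as an illustration rather than a reference implementation.

```javascript
/**
 * Return the caseload attachments, optionally narrowed by attachment ID
 * and/or by a case-insensitive substring of the description.
 */
function selectCaseloads(attachments, { id, descriptionIncludes } = {}) {
  return attachments.filter((a) => {
    if (a.type !== 'caseLoad') return false;
    if (id !== undefined && a.id !== id) return false;
    if (descriptionIncludes !== undefined) {
      const description = a.attachmentVersion?.value?.description ?? '';
      if (!description.toLowerCase().includes(descriptionIncludes.toLowerCase())) {
        return false;
      }
    }
    return true;
  });
}

/** Turn the `totals` array of an attachment into an object keyed by metric type. */
function totalsByType(attachment) {
  const totals = attachment.attachmentVersion?.value?.metrics?.values?.totals ?? [];
  return Object.fromEntries(totals.map((t) => [t.type, t.value]));
}

// Trimmed sample based on the response above; the "cost" entry is made up.
const attachments = [
  {
    id: 20686,
    type: 'caseLoad',
    attachmentVersion: {
      value: {
        description: '2021 Plan Caseload',
        metrics: {
          values: {
            totals: [
              { type: 'totalPopulation', value: 13100000 },
              { type: 'affected', value: null },
              { type: 'inNeed', value: 8700000 },
              { type: 'target', value: 6400000 },
            ],
          },
        },
      },
    },
  },
  { id: 1, type: 'cost', attachmentVersion: { value: {} } },
];

const [caseload] = selectCaseloads(attachments, { descriptionIncludes: '2021' });
const figures = totalsByType(caseload);
console.log(`In Need: ${figures.inNeed}, Target: ${figures.target}`);
// In Need: 8700000, Target: 6400000
```

Matching on the attachment ID is more robust than matching on the description, since descriptions are free text and may be empty or inconsistent between plans.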

Getting Plan-Level Caseload Data (Including Disaggregated)

The API request to get all plan caseload data, including disaggregated data, is similar to the above request, but instead of sending disaggregation=false we send disaggregation=true, e.g.:

https://api.hpc.tools/v2/public/plan/1032?content=entities&disaggregation=true

This response is usually significantly larger than one without disaggregated data, and is often multiple MB. Because disaggregation defaults to true, set disaggregation=false explicitly unless you actually require disaggregated data.

Requests of this form will include a property disaggregated alongside the totals that we use for the high-level numbers.

To interpret this data, please see Interpreting disaggregated Caseload / Indicator Data.

Getting Cluster / Sector Caseload Data

For getting cluster or sector caseloads, we use the same request as for plan caseloads, and inspect a different part of the response.

The exact terminology used (i.e. Sector vs Cluster) differs from plan-to-plan, so to remain general, the API refers to these as "Governing Entities".

Extracting caseloads associated with governing entities works the same way as for caseloads attached directly to the plan, except that the attachments are grouped under each governing entity.

The part of the response we care about looks something like this:

{
  "data": {
    "id": 1032,
    "governingEntities": [
      {
        "id": 5990,
        "governingEntityVersion": {
          "name": "Camp Coordination and Camp Management",
          // ...
        },
        "attachments": [
          {
            "id": 21962,
            "type": "caseLoad", // may also be other types like "cost"
            "attachmentVersion": { // This contains the core data for the caseload attachment
              "hasDisaggregatedData": true,
              "value": {
                "description": "2021 Plan Caseload",
                "metrics": {
                  "values": {
                    "totals": [
                      {
                        "name": {
                          "en": "In Need"
                        },
                        "type": "inNeed",
                        "value": 1921903
                      },
                      {
                        "name": {
                          "en": "Targeted"
                        },
                        "type": "target",
                        "value": 1438157
                      },
                      // ...
                    ]
                  },
                  // ...
                },
                // ...
              },
              // ...
            },
            // ...
          },
          // ...
        ],
        // ...
      },
      // ...
    ]
  }
}

As at the plan level, multiple caseloads can be attached to each governing entity, so when more than one is returned you will likely need to check that you are using the exact caseload you are interested in.

The example JavaScript code below extracts and prints out the key caseload figures for each cluster, and can be run directly in Postman:

const response = pm.response.json();

for (const ge of response.data.governingEntities) {

    const geName = ge.governingEntityVersion.name;

    for (const attachment of ge.attachments) {
        if (attachment.type === "caseLoad") {
            const d = attachment.attachmentVersion;
            const description = d.value.description;
            const totals = d.value.metrics.values.totals;
            const inNeed = totals.find(t => t.type === 'inNeed');
            const target = totals.find(t => t.type === 'target');

            console.log(`Caseload for ${geName}: ${description}`);
            console.log(`  In Need: ${inNeed?.value}`);
            console.log(`  Target: ${target?.value}`);
        }
    }
}

Which prints out the following:

Caseload for Camp Coordination and Camp Management: overall
  In Need: 1921903
  Target: 1438157
Caseload for Early Recovery and Livelihoods: Overall
  In Need: 2134573
  Target: 260232
Caseload for Education: overall
  In Need: 1262980
  Target: 1026541
Caseload for Emergency Shelter and NFI: overall
  In Need: 2272708
  Target: 1402448
Caseload for Food Security: undefined
  In Need: 5138357
  Target: 4270888
Caseload for Health: overall
  In Need: 5813335
  Target: 5262494
Caseload for Nutrition: overall
  In Need: 1531865
  Target: 1292540
Caseload for Protection: 
  In Need: 4076551
  Target: 2471984
Caseload for Water and Sanitation: Overall
  In Need: 2881346
  Target: 2523340
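
If you would rather collect these figures than print them (for example to export them), the same traversal can be sketched as a function that returns one record per caseload. The function name governingEntityCaseloads and the record field names are hypothetical; the sample data is a trimmed copy of the single governing entity shown in the response above.

```javascript
/**
 * Flatten governing-entity caseloads into one record per caseload attachment.
 * Metrics that are missing from an attachment are reported as null.
 */
function governingEntityCaseloads(plan) {
  const rows = [];
  for (const ge of plan.governingEntities ?? []) {
    const entity = ge.governingEntityVersion?.name ?? null;
    for (const attachment of ge.attachments ?? []) {
      if (attachment.type !== 'caseLoad') continue;
      const value = attachment.attachmentVersion?.value;
      const totals = value?.metrics?.values?.totals ?? [];
      const metric = (type) => totals.find((t) => t.type === type)?.value ?? null;
      rows.push({
        entity,
        attachmentId: attachment.id,
        description: value?.description ?? null,
        inNeed: metric('inNeed'),
        target: metric('target'),
      });
    }
  }
  return rows;
}

// Trimmed sample based on the response above (equivalent to `response.data`).
const plan = {
  id: 1032,
  governingEntities: [
    {
      id: 5990,
      governingEntityVersion: { name: 'Camp Coordination and Camp Management' },
      attachments: [
        {
          id: 21962,
          type: 'caseLoad',
          attachmentVersion: {
            value: {
              description: '2021 Plan Caseload',
              metrics: {
                values: {
                  totals: [
                    { type: 'inNeed', value: 1921903 },
                    { type: 'target', value: 1438157 },
                  ],
                },
              },
            },
          },
        },
      ],
    },
  ],
};

const rows = governingEntityCaseloads(plan);
console.log(`${rows[0].entity}: in need ${rows[0].inNeed}, target ${rows[0].target}`);
// Camp Coordination and Camp Management: in need 1921903, target 1438157
```

Keeping the attachment ID in each record makes it easier to tell caseloads apart when a governing entity has more than one, or when the description is empty.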

If you have also requested disaggregated data in your response, then relevant attachments will include a property disaggregated alongside the totals that we use for the high-level numbers above.

To interpret this data, please see Interpreting disaggregated Caseload / Indicator Data.

Getting Objective / Activity Caseload or Indicator Data

TODO

Interpreting disaggregated Caseload / Indicator Data

If you have a request that includes disaggregated data (e.g. you have included disaggregation=true in one of the requests above), then interpreting the data is a bit more involved.

Unfortunately, the API does not currently provide ways to request subsections of a disaggregated caseload / indicator (such as only the location totals, or only certain categories), which means that you need to understand the full data structure to get the data you need.

Data is disaggregated using the following dimensions:

- Location
- Category (e.g. age, gender, IDP, etc...). Exactly which categories are available depends on the configuration of the respective plan.
- Metric (e.g. Population, Affected, In Need, etc...)

To reduce the size of the response, and storage requirements, disaggregated data is stored in a 2D array (i.e. a table), and needs to be cross-referenced with the appropriate locations, categories and metrics that represent each row / column.

The disaggregated data looks roughly like so:

{
  "disaggregated": {
    "locations": [
      {
          "id": 25818257,
          "name": "Demsa",
          "parent": {
              "id": 25818203,
              "name": "Adamawa"
          }
      },
      //...
    ],
    "categories": [
      {
        "ids": [ /* ... */ ],
        "name": "Combination-idp-<1-Girls",
        "label": "IDP - <1 - Girls",
        "metrics": [
            {
                "name": {
                    "en": "Population"
                },
                "type": "totalPopulation"
            },
            // ...
        ]
      },
      // ...
    ],
    "dataMatrix": [
      [54, 2452, 12, /* ... */ ],
      [54, 2452, 12, /* ... */ ],
      // ...
    ]
  }
}

Each row represents a location, and each column a category + metric combination. The order of the rows and columns matches the order of the locations, categories and metrics in the arrays above.

The first row always represents the overall location. The last group of columns represents the totals for each location (using the same metrics as the totals value for the attachment).

For example, if we had the following locations, categories, and metrics:

Total Metrics: M1, M2, M3
Locations: LA, LB, LC
Categories:
- C1. With Metrics M1, M2
- C2. With Metrics M1
- C3. With Metrics M1, M2, M3

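Then, following the rules above, the dataMatrix (the 2D array) would have 4 rows and 9 columns, in this order:

Rows: the overall location, LA, LB, LC
Columns: C1/M1, C1/M2, C2/M1, C3/M1, C3/M2, C3/M3, Total/M1, Total/M2, Total/M3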

However, in most cases, the exact same metrics are used across all categories, so we can simply check that that's the case, and simplify our code.
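
For the general case, where categories do not all share the same metrics, the column order described above can be computed rather than assumed. The sketch below uses the hypothetical example (categories C1 to C3, metrics M1 to M3) from this section; the function name columnLayout is also hypothetical, and the layout rule it encodes is the one stated in this guide (each category's metrics in order, then the total metrics last).

```javascript
/**
 * Compute what each dataMatrix column represents: every category's metrics
 * in order, followed by the group of total metrics (category: null).
 */
function columnLayout(categories, totalMetrics) {
  const columns = [];
  for (const category of categories) {
    for (const metric of category.metrics) {
      columns.push({ category: category.label, metric: metric.type });
    }
  }
  for (const metric of totalMetrics) {
    columns.push({ category: null, metric: metric.type });
  }
  return columns;
}

// The hypothetical example from the text above.
const totals = [{ type: 'M1' }, { type: 'M2' }, { type: 'M3' }];
const categories = [
  { label: 'C1', metrics: [{ type: 'M1' }, { type: 'M2' }] },
  { label: 'C2', metrics: [{ type: 'M1' }] },
  { label: 'C3', metrics: [{ type: 'M1' }, { type: 'M2' }, { type: 'M3' }] },
];

const layout = columnLayout(categories, totals);
console.log(layout.map((c) => `${c.category ?? 'Total'}/${c.metric}`).join(' | '));
// C1/M1 | C1/M2 | C2/M1 | C3/M1 | C3/M2 | C3/M3 | Total/M1 | Total/M2 | Total/M3
```

The index of an entry in the returned array is the column index in every row of the dataMatrix, so a lookup such as layout.findIndex(c => c.category === 'C3' && c.metric === 'M2') gives the column to read.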

Interpreting disaggregated data in PowerQuery

If you are interested in interpreting disaggregated data using PowerQuery, we have created a PowerQuery function that you can use in your own query to simplify the process: ProcessDisaggregatedAttachment

JavaScript Example 1 (Location Metrics)

The example JavaScript code below extracts and prints out the key caseload figures for every location, and can be run directly in Postman:

const response = pm.response.json();

for (const attachment of response.data.attachments) {
    if (attachment.type === "caseLoad") {
        const d = attachment.attachmentVersion;
        const description = d.value.description;
        const v = d.value.metrics.values;
        const locations = v.disaggregated.locations;
        const categories = v.disaggregated.categories;
        const matrix = v.disaggregated.dataMatrix;

        console.log(`Caseload: ${description}`);  

        // Check that the number of locations matches the number of rows
        // (the matrix has one extra leading row for the overall location)
        if (locations.length + 1 !== matrix.length) {
            throw new Error('Unexpected number of rows');
        }

        // Check that the same metrics are used, in the same order, for every category
        for (const c of categories) {
            const sameMetrics = c.metrics.length === v.totals.length &&
                c.metrics.every((m, i) => m.type === v.totals[i].type);
            if (!sameMetrics) {
                throw new Error('Inconsistent category metrics unsupported');
            }
        }

        // Ensure that the number of columns is as expected for every row
        const expectedColumns = (categories.length + 1) * v.totals.length;
        for (const row of matrix) {
            if (row.length !== expectedColumns) {
                throw new Error('Unexpected number of columns');
            }
        }

        // Calculate the indexes for each metric type that we're interested in
        const popIndex = v.totals.findIndex(t => t.type === 'totalPopulation');
        const affectedIndex = v.totals.findIndex(t => t.type === 'affected');
        const inNeedIndex = v.totals.findIndex(t => t.type === 'inNeed');
        const targetIndex = v.totals.findIndex(t => t.type === 'target');

        // For each location, print out the totals
        for (let li = 0; li < locations.length; li++) {
            const loc = locations[li];
            const row = li + 1;
            // We want the last "group" of columns (after each of the categories)
            const columnOffset = categories.length * v.totals.length;
            console.log(`  Location: ${loc.name} (${loc.id})`);
            console.log(`    Total Population: ${matrix[row][columnOffset + popIndex]}`);
            console.log(`    Affected: ${matrix[row][columnOffset + affectedIndex]}`);
            console.log(`    In Need: ${matrix[row][columnOffset + inNeedIndex]}`);
            console.log(`    Target: ${matrix[row][columnOffset + targetIndex]}`);
        }

    }
}

Which prints out the following for the URL https://api.hpc.tools/v2/public/plan/1032?content=entities&disaggregation=true (which represents the Nigeria 2021 plan):

Caseload: 2021 Plan Caseload
  Location: Demsa (25818257)
    Total Population: 150002
    Affected: null
    In Need: 103565
    Target: 74713
  Location: Fufore (25818256)
    Total Population: 240583
    Affected: null
    In Need: 168239
    Target: 87848
etc...

JavaScript Example 2 (Location & Category Metrics)

The example JavaScript code below extracts and prints out the key caseload figures for every location and category combination, and can be run directly in Postman:

const response = pm.response.json();

for (const attachment of response.data.attachments) {
    if (attachment.type === "caseLoad") {
        const d = attachment.attachmentVersion;
        const description = d.value.description;
        const v = d.value.metrics.values;
        const locations = v.disaggregated.locations;
        const categories = v.disaggregated.categories;
        const matrix = v.disaggregated.dataMatrix;

        console.log(`Caseload: ${description}`);  

        // Check that the number of locations matches the number of rows
        // (the matrix has one extra leading row for the overall location)
        if (locations.length + 1 !== matrix.length) {
            throw new Error('Unexpected number of rows');
        }

        // Check that the same metrics are used, in the same order, for every category
        for (const c of categories) {
            const sameMetrics = c.metrics.length === v.totals.length &&
                c.metrics.every((m, i) => m.type === v.totals[i].type);
            if (!sameMetrics) {
                throw new Error('Inconsistent category metrics unsupported');
            }
        }

        // Ensure that the number of columns is as expected for every row
        const expectedColumns = (categories.length + 1) * v.totals.length;
        for (const row of matrix) {
            if (row.length !== expectedColumns) {
                throw new Error('Unexpected number of columns');
            }
        }

        // Calculate the indexes for each metric type that we're interested in
        const popIndex = v.totals.findIndex(t => t.type === 'totalPopulation');
        const affectedIndex = v.totals.findIndex(t => t.type === 'affected');
        const inNeedIndex = v.totals.findIndex(t => t.type === 'inNeed');
        const targetIndex = v.totals.findIndex(t => t.type === 'target');

        // For each location and category, print out the totals
        for (let li = 0; li < locations.length; li++) {
            for (let ci = 0; ci < categories.length; ci++) {
                const loc = locations[li];
                const cat = categories[ci];
                const row = li + 1;
                const columnOffset = ci * v.totals.length;
                console.log(`  Location: ${loc.name} (${loc.id})`);
                console.log(`  Category: ${cat.label}`);
                console.log(`    Total Population: ${matrix[row][columnOffset + popIndex]}`);
                console.log(`    Affected: ${matrix[row][columnOffset + affectedIndex]}`);
                console.log(`    In Need: ${matrix[row][columnOffset + inNeedIndex]}`);
                console.log(`    Target: ${matrix[row][columnOffset + targetIndex]}`);
            }
        }

    }
}

Which prints out the following for the URL https://api.hpc.tools/v2/public/plan/1032?content=entities&disaggregation=true (which represents the Nigeria 2021 plan):

Caseload: 2021 Plan Caseload
  Location: Demsa (25818257)
  Category: IDP - <1 - Girls
    Total Population: 349
    Affected: 
    In Need: 248
    Target: 481
  Location: Demsa (25818257)
  Category: IDP - <1 - Boys
    Total Population: 308
    Affected: 
    In Need: 111
    Target: 275
etc...
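
For further processing (for example exporting to CSV), it can be more convenient to turn the whole matrix into flat records instead of printing it. The sketch below is one possible approach, assuming the layout described in this guide: the first row is the overall location, and the columns are each category's metrics in order followed by the total metrics. The function name disaggregatedToRecords and the record field names are hypothetical, and the numbers in the sample are made up purely to exercise the code. Unlike the examples above, it does not require every category to use the same metrics.

```javascript
/**
 * Flatten `metrics.values` (with a `disaggregated` property) into one record
 * per matrix cell. The overall row gets null location fields, and the
 * trailing totals group gets a null category.
 */
function disaggregatedToRecords(values) {
  const { locations, categories, dataMatrix } = values.disaggregated;

  // Work out what each column represents
  const columns = [];
  for (const c of categories) {
    for (const m of c.metrics) {
      columns.push({ category: c.label, metric: m.type });
    }
  }
  for (const t of values.totals) {
    columns.push({ category: null, metric: t.type });
  }

  if (dataMatrix.length !== locations.length + 1) {
    throw new Error('Unexpected number of rows');
  }

  const records = [];
  dataMatrix.forEach((row, rowIndex) => {
    if (row.length !== columns.length) {
      throw new Error('Unexpected number of columns');
    }
    // Row 0 is the overall location, so locations are offset by one
    const location = rowIndex === 0 ? null : locations[rowIndex - 1];
    row.forEach((value, columnIndex) => {
      records.push({
        locationId: location?.id ?? null,
        locationName: location?.name ?? null,
        ...columns[columnIndex],
        value,
      });
    });
  });
  return records;
}

// Made-up sample: one location, one category, two metrics.
const sample = {
  totals: [{ type: 'inNeed' }, { type: 'target' }],
  disaggregated: {
    locations: [{ id: 1, name: 'A' }],
    categories: [{ label: 'IDP', metrics: [{ type: 'inNeed' }, { type: 'target' }] }],
    dataMatrix: [
      [10, 5, 30, 20], // overall
      [10, 5, 30, 20], // location A
    ],
  },
};

const records = disaggregatedToRecords(sample);
console.log(records.length); // 8
console.log(records[4]);
```

Each record carries its own location, category and metric, so the result can be filtered with ordinary array methods, e.g. records.filter(r => r.category === null && r.metric === 'inNeed') for the per-location In Need totals.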