Historical aggregates - electricitymaps/electricitymaps-contrib GitHub Wiki
This document outlines the methodology that Electricity Maps uses to generate daily, weekly, monthly, and yearly aggregates from hourly data. The aggregates cover all fields that Electricity Maps provides at an hourly level, e.g., carbon intensity, production and consumption breakdowns, imports and exports, and the percentage of carbon-free and renewable electricity.
Temporal averaging
Figure 1: Graph detailing the hierarchy of aggregates
The different aggregates are generated according to a temporal hierarchy. Daily aggregates are generated from hourly values, weekly aggregates are generated from daily aggregates, and so on. At each level, a threshold on the share of missing values ensures that the data used to generate the aggregate at the next level up is representative. For example, at most 34% of the hourly values may be missing for a daily aggregate to be generated.
Aggregation | Max missing threshold (%) |
---|---|
hourly->daily | 34 |
... | ... |
Table 1: Missing thresholds allowed for aggregations
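The threshold check described above can be sketched as follows. This is a minimal illustration, not the actual Electricity Maps implementation: the function name `respects_threshold` and the constant name are hypothetical, and only the hourly->daily value of 34% comes from Table 1. The comparison is inclusive because the text says "at most 34%".

```python
# Hypothetical helper for illustration; only the 34% hourly->daily
# threshold is taken from Table 1.
MAX_MISSING_HOURLY_TO_DAILY = 0.34


def respects_threshold(
    n_present: int,
    n_expected: int,
    max_missing: float = MAX_MISSING_HOURLY_TO_DAILY,
) -> bool:
    """Return True when the share of missing points is at most max_missing."""
    if n_expected <= 0:
        raise ValueError("n_expected must be positive")
    missing_share = 1 - n_present / n_expected
    return missing_share <= max_missing


print(respects_threshold(24, 24))  # full day of hourly data
print(respects_threshold(16, 24))  # 8 of 24 hours missing (about 33.3%)
print(respects_threshold(1, 24))   # 23 of 24 hours missing
```

Under this reading, a day needs at least 16 of its 24 hourly values to be served as an available daily aggregate.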
Aggregates that don't respect their set threshold are reported as missing data. Nevertheless, to keep every aggregate as faithful to the hourly data as possible, aggregates that don't respect their threshold are still included in the computation of the higher-level aggregates.
Example 1:
Let's imagine that for a zone we have the following values for a given week. The 1st and 3rd days have full data, the 2nd day has only one hour of data, and the rest of the week is missing.
Datetime | Value |
---|---|
2022-01-01 00:00:00 | 100 |
2022-01-01 01:00:00 | 200 |
... | ... |
2022-01-01 23:00:00 | 125 |
2022-01-02 03:00:00 | 175 |
2022-01-03 00:00:00 | 75 |
2022-01-03 01:00:00 | 100 |
... | ... |
2022-01-03 23:00:00 | 100 |
Table 2: Hourly values for hierarchical temporal averaging
We would then be able to generate the following daily aggregates. Only the 1st and 3rd day would respect the missing values threshold, and thus be served as available daily aggregates.
Datetime | Value | Respects threshold | Number of data points |
---|---|---|---|
2022-01-01 | 122 | True | 24 |
2022-01-02 | 175 | False | 1 |
2022-01-03 | 103 | True | 24 |
Table 3: Daily aggregates from Table 2 for temporal averaging
Finally, when computing the weekly averages, we would still consider the data points for days that don't respect the threshold. As a result, the weekly average would be computed as:
value = (122 * 24 + 175 * 1 + 103 * 24) / (24 + 1 + 24) = 113.8
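The weekly computation above can be reproduced with a short sketch. The data comes from Table 3; the function name `count_weighted_mean` is hypothetical, and returning `None` when there are no data points is an assumption made for this example.

```python
# Daily aggregates from Table 3: (date, daily value, number of hourly points).
daily = [
    ("2022-01-01", 122, 24),
    ("2022-01-02", 175, 1),   # below the threshold, but still included
    ("2022-01-03", 103, 24),
]


def count_weighted_mean(rows):
    """Average the daily values, weighting each by its number of data points."""
    total_points = sum(n for _, _, n in rows)
    if total_points == 0:
        return None  # assumption: no data at all yields no aggregate
    return sum(value * n for _, value, n in rows) / total_points


weekly_value = count_weighted_mean(daily)
print(round(weekly_value, 1))  # 113.8
```

Weighting by the number of data points means the single hour available on 2022-01-02 counts as one hour out of 49, rather than as one day out of three.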
Weighted averaging
For some values, aggregation requires weighting each contribution by another variable. For example, the hourly carbon intensity figures for a given zone must be weighted by the hourly total consumption. This follows from the composite nature of carbon intensity, which is computed as total emissions divided by total consumption.
Example 2:
Averaging at a daily level
Datetime | Carbon Intensity (gCO2eq/kWh) | Total Consumption (MWh) |
---|---|---|
2022-01-01 00:00:00 | 100 | 100 |
2022-01-01 01:00:00 | 200 | 150 |
2022-01-01 02:00:00 | 50 | 200 |
Table 4: Hourly values for weighted temporal averaging
Then for these three hours, the aggregated carbon intensity would be:
c = ((100 * 100) + (200 * 150) + (50 * 200)) / (100 + 150 + 200) = 111.1 gCO2eq/kWh
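The same calculation, written as a small sketch using the values from Table 4. The function name `weighted_mean` is hypothetical. Multiplying a carbon intensity in gCO2eq/kWh by a consumption in MWh gives emissions in kgCO2eq, so the numerator is the total emissions over the three hours.

```python
# Hourly rows from Table 4: (carbon intensity in gCO2eq/kWh, consumption in MWh).
hours = [
    (100, 100),
    (200, 150),
    (50, 200),
]


def weighted_mean(pairs):
    """Return sum(value * weight) / sum(weight) for (value, weight) pairs."""
    total_weight = sum(weight for _, weight in pairs)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return sum(value * weight for value, weight in pairs) / total_weight


carbon_intensity = weighted_mean(hours)
print(round(carbon_intensity, 1))  # 111.1

# For comparison, the unweighted mean of the three hourly intensities.
unweighted = sum(value for value, _ in hours) / len(hours)
print(round(unweighted, 1))  # 116.7
```

The unweighted mean is higher here because the highest-consumption hour has the lowest carbon intensity, which the weighting takes into account.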
Example 3:
Averaging at a weekly level. Let's now consider the same data as in Example 1, but where the values we are aggregating are ratios. A typical example is the carbon intensity of electricity, which is computed as total emissions / total electricity consumption. In this case, every aggregation from the original hourly data (the daily one included) must be weighted by the total consumption.
Datetime | Value | Weight |
---|---|---|
2022-01-01 00:00:00 | 100 | 39 |
2022-01-01 01:00:00 | 200 | 90 |
... | ... | ... |
2022-01-01 23:00:00 | 125 | 56 |
2022-01-02 03:00:00 | 175 | 49 |
2022-01-03 00:00:00 | 75 | 100 |
2022-01-03 01:00:00 | 100 | 105 |
... | ... | ... |
2022-01-03 23:00:00 | 100 | 45 |
Table 5: Hourly values for hierarchical weighted temporal averaging
As before, and as explained in Example 2, we first compute daily averages from these values. Each daily average is also flagged according to whether it respects the missing values threshold.
Datetime | Value | Total weight | Respects threshold |
---|---|---|---|
2022-01-01 | 132 | 1347 | True |
2022-01-02 | 175 | 49 | False |
2022-01-03 | 111 | 1402 | True |
Table 6: Daily aggregates from Table 5 for hierarchical weighted averaging
The values from Table 6 differ from those in Table 3 as the daily aggregates are weighted according to the weight column. Here again, the daily value for 2022-01-02 is accounted for to generate the weekly aggregate, using an average of daily averages, weighted by their respective total weight.
The weekly average would thus be computed as:
value = ((132 * 1347) + (175 * 49) + (111 * 1402)) / (1347 + 49 + 1402) = 122.2
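A sketch of this weekly step, using the daily aggregates from Table 6. The names below are hypothetical. Each daily aggregate carries its total weight upward, so the weekly aggregate can in turn expose a total weight for the next level of the hierarchy; that last point is an assumption of this sketch rather than something the text states.

```python
# Daily aggregates from Table 6: (date, value, total weight, respects threshold).
daily = [
    ("2022-01-01", 132, 1347, True),
    ("2022-01-02", 175, 49, False),   # flagged, but still included
    ("2022-01-03", 111, 1402, True),
]


def aggregate_weighted(rows):
    """Return (weighted mean, total weight) over the daily aggregates."""
    total_weight = sum(weight for _, _, weight, _ in rows)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    weighted_sum = sum(value * weight for _, value, weight, _ in rows)
    return weighted_sum / total_weight, total_weight


weekly_value, weekly_weight = aggregate_weighted(daily)
print(round(weekly_value, 1), weekly_weight)  # 122.2 2798
```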
The following provides an overview of all classes of variables whose aggregation requires weighting, with the weights used:
Class of values | Example | Weight used |
---|---|---|
carbon_intensity_*_avg | carbon_intensity_avg | total_consumption_avg |
carbon_intensity_production_avg | carbon_intensity_production_avg | total_production_avg |
power_origin_percent_*_avg | power_origin_percent_renewable_avg | total_consumption_avg |
power_production_percent_*_avg | power_production_percent_renewable_avg | total_production_avg |