Forecast Checks - reichlab/covid19-forecast-hub GitHub Wiki

Note: As of February 20, 2023 we are no longer collecting data or analyzing COVID-19 cases and as of March 6, 2023 we are no longer collecting data or analyzing COVID-19 deaths.

header must minimally include location, target, type, quantile, value (required for zoltpy) and forecast_date, target_end_date
each row must have the same number of columns as header
location must be in "locations" column of locations.csv
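
The three file-level checks above could be sketched as follows. This is a minimal illustration, not the hub's actual validation code: `check_header_and_rows` and `valid_locations` are hypothetical names, and `valid_locations` is assumed to have been read from locations.csv beforehand.

```python
import csv
import io

# Columns the wiki lists as the minimum header.
REQUIRED_COLUMNS = {
    "location", "target", "type", "quantile", "value",
    "forecast_date", "target_end_date",
}


def check_header_and_rows(csv_text, valid_locations):
    """Return error messages for the header, row-width, and location checks."""
    errors = []
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    missing = REQUIRED_COLUMNS - set(header)
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
    loc_idx = header.index("location") if "location" in header else None
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            errors.append(
                f"line {line_no}: expected {len(header)} columns, got {len(row)}"
            )
        elif loc_idx is not None and row[loc_idx] not in valid_locations:
            errors.append(f"line {line_no}: unknown location {row[loc_idx]!r}")
    return errors
```

An empty list means the file passed all three checks.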

target must be one of the strings generated by these R expressions:

```
paste(1:20, "wk ahead inc death")
paste(1:20, "wk ahead cum death")
paste(0:130, "day ahead inc hosp")
paste(1:8, "wk ahead inc case")
```

county locations should have only "case" targets
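
For readers working in Python, the target list and the county rule might be expressed as below. This is a sketch under assumptions: `check_target` and `is_county` are hypothetical helpers, and the county test assumes that county locations are 5-digit FIPS codes, which this page does not state.

```python
# Python equivalent of the R paste() expressions: "<n> <suffix>" for each n.
VALID_TARGETS = (
    {f"{n} wk ahead inc death" for n in range(1, 21)}
    | {f"{n} wk ahead cum death" for n in range(1, 21)}
    | {f"{n} day ahead inc hosp" for n in range(0, 131)}
    | {f"{n} wk ahead inc case" for n in range(1, 9)}
)


def is_county(location):
    # Assumption: counties are 5-digit FIPS codes; other locations are not.
    return len(location) == 5 and location.isdigit()


def check_target(location, target):
    """Return an error message, or None if the target is allowed."""
    if target not in VALID_TARGETS:
        return f"invalid target: {target!r}"
    if is_county(location) and not target.endswith("inc case"):
        return f"county {location} may only have case targets, got {target!r}"
    return None
```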

forecast_date and target_end_date must be in YYYY-MM-DD format. Additionally, forecast_date must be within ±1 day of the date in the forecast filename. For example, a file at data-processed/model/2021-04-12-model.csv must have a forecast_date between 2021-04-11 and 2021-04-13, inclusive.
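
One way to implement the date rules is sketched here. The function names are hypothetical, and `filename` is assumed to be the base name of the file (for example `2021-04-12-model.csv`), not the full path.

```python
import re
from datetime import date, timedelta

# The filename is expected to start with a YYYY-MM-DD date and a hyphen.
FILENAME_DATE = re.compile(r"^(\d{4}-\d{2}-\d{2})-")


def parse_iso_date(text):
    """Parse a strict YYYY-MM-DD string; return None if it is not one."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return None
    try:
        return date.fromisoformat(text)
    except ValueError:  # right shape, impossible date, e.g. 2021-02-30
        return None


def check_forecast_date(filename, forecast_date):
    """Return an error message, or None if forecast_date is acceptable."""
    match = FILENAME_DATE.match(filename)
    file_date = parse_iso_date(match.group(1)) if match else None
    if file_date is None:
        return f"no YYYY-MM-DD date at the start of {filename!r}"
    row_date = parse_iso_date(forecast_date)
    if row_date is None:
        return f"forecast_date {forecast_date!r} is not in YYYY-MM-DD format"
    if abs(row_date - file_date) > timedelta(days=1):
        return f"forecast_date {forecast_date} is more than 1 day from {file_date}"
    return None
```

The same `parse_iso_date` helper can be applied to target_end_date to check its format.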
for targets other than cases, the submitted quantiles must include every quantile in this set:
```
c(0.01, 0.025, seq(0.05, 0.95, by = 0.05), 0.975, 0.99)
```
for "case" targets, the submitted quantiles must include every quantile in this set:
```
c(0.025, 0.100, 0.250, 0.500, 0.750, 0.900, 0.975)
```
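
The two required quantile sets can be rebuilt in Python as sketched below. `missing_quantiles` is a hypothetical helper, and rounding to three decimals before comparing is an assumption made here to sidestep floating-point noise. It is not necessarily how the hub compares quantiles.

```python
# seq(0.05, 0.95, by = 0.05), rebuilt from integers to avoid float drift.
NON_CASE_QUANTILES = (
    [0.01, 0.025] + [round(0.05 * k, 2) for k in range(1, 20)] + [0.975, 0.99]
)
CASE_QUANTILES = [0.025, 0.100, 0.250, 0.500, 0.750, 0.900, 0.975]


def missing_quantiles(target, submitted):
    """Return the required quantiles that are absent from `submitted`."""
    required = CASE_QUANTILES if target.endswith("inc case") else NON_CASE_QUANTILES
    # Round before comparing so 0.15 and 0.15000000000000002 are treated alike.
    have = {round(q, 3) for q in submitted}
    return [q for q in required if round(q, 3) not in have]
```

Extra quantiles beyond the required set are not flagged, since the rule only says the submitted set must include the required one.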
quantile must be an int or float in [0, 1]
value must be an int or float and non-negative, except for retractions as detailed below
- Forecast retractions: to retract existing forecast rows in a file, set value to NULL (no quote marks). NA, None, or anything else is not accepted. More details are mentioned here.
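
A minimal sketch of these cell-level checks, operating on the raw strings read from the CSV. The function names are hypothetical, and the sketch assumes the quantile check is only applied to rows that carry a quantile.

```python
import math


def parse_number(text):
    """Return text as a float, or None if it is not a number."""
    try:
        return float(text)
    except ValueError:
        return None


def check_quantile(text):
    """Return an error message, or None if text is a number in [0, 1]."""
    q = parse_number(text)
    if q is None or not 0.0 <= q <= 1.0:
        return f"quantile {text!r} is not a number in [0, 1]"
    return None


def check_value(text):
    """Return an error message, or None for a valid value or a retraction."""
    if text == "NULL":  # retraction marker: bare NULL, spelled exactly
        return None
    v = parse_number(text)
    if v is None or not math.isfinite(v) or v < 0:
        return f"value {text!r} is not a non-negative number or NULL"
    return None
```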
validates date alignment as documented in the issue "add additional validations"
validates quantiles and values (i.e., at the prediction level):
- entries in value must be non-decreasing as quantile increases
- entries in quantile must be unique
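
The prediction-level checks can be illustrated as follows, assuming the quantiles and values for a single prediction have already been collected into two parallel lists of numbers. `check_prediction` is a hypothetical name.

```python
def check_prediction(quantiles, values):
    """Check one prediction given parallel lists of quantiles and values."""
    errors = []
    if len(set(quantiles)) != len(quantiles):
        errors.append("quantiles are not unique")
    # Sort by quantile, then require each value to be >= the previous one.
    ordered = [v for _, v in sorted(zip(quantiles, values))]
    if any(later < earlier for earlier, later in zip(ordered, ordered[1:])):
        errors.append("values decrease as quantiles increase")
    return errors
```

Equal neighbouring values pass, because the rule is non-decreasing rather than strictly increasing.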
validates quantiles as a group:
- there must be zero or one point prediction for each location/target pair
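
A sketch of the point-prediction count, assuming rows are dictionaries such as those produced by `csv.DictReader` and that point predictions are marked with `type` equal to `point` (an assumption, since this page does not list the allowed type values).

```python
from collections import Counter


def duplicate_point_predictions(rows):
    """Return the (location, target) pairs that have more than one point row."""
    counts = Counter(
        (row["location"], row["target"])
        for row in rows
        if row["type"] == "point"
    )
    return sorted(pair for pair, n in counts.items() if n > 1)
```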
validates that each prediction value for a location is less than that location's population.
- this check is run for all forecast submissions for all targets (inc/cum deaths/cases).
- the population data is taken from the locations.csv file.
- to see which predictions violate this check, look at the logs of the GitHub Actions build for your PR, where the invalid predictions are printed.
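
The population check might look like the sketch below. `population_violations` is a hypothetical helper, and `population` is assumed to be a dictionary from location code to population built from locations.csv. Skipping retracted rows and locations without a population entry is also an assumption.

```python
def population_violations(rows, population):
    """Return the rows whose value is not below their location's population."""
    violations = []
    for row in rows:
        if row["value"] == "NULL":  # retracted rows carry no value to compare
            continue
        limit = population.get(row["location"])
        if limit is not None and float(row["value"]) >= limit:
            violations.append(row)
    return violations
```

Printing the returned rows gives roughly the kind of listing the build logs are described as containing.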