Cold Stunning General Ensemble (Spring 2025) - conrad-blucher-institute/semaphore GitHub Wiki

Term Definitions

  • model = single H5 file that predicts a value for a given location and a given lead time
  • model family = a group of models for the same location and the same prediction series across multiple lead times (e.g. all temperature predictions for Laguna Madre across lead times, or all water level predictions for Virginia Key across lead times).
  • ensemble lead time slice = a group of predictions for a model where we get multiple predictions for one lead time (range graph)
  • multi lead time ensemble = a group of predictions for the same location across multiple lead times with variable inputs for each lead time, i.e. a group of ensemble lead time slices (spaghetti graph).
  • family input slice = a group of predictions for a model family where we get one prediction per lead time in the group from a single input vector (one line on a time series graph).
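
To make the terms above concrete, the sketch below arranges predictions in a small grid keyed by lead time and ensemble member. The values, the three lead times, and the helper names are all made up for illustration; none of this is Semaphore code.

```python
# Hypothetical predictions for one model family, keyed by lead time (hours).
# Each list holds one value per ensemble member. All numbers are invented.
multi_lead_time_ensemble = {
    3: [14.2, 14.6, 13.9],
    6: [13.8, 14.1, 13.1],
    12: [12.5, 13.4, 11.9],
}


def ensemble_lead_time_slice(grid, lead_time):
    """All member predictions for one lead time (one column of a range graph)."""
    return list(grid[lead_time])


def member_line(grid, member):
    """One prediction per lead time for one member (one spaghetti line)."""
    return [(lead_time, values[member]) for lead_time, values in sorted(grid.items())]


print(ensemble_lead_time_slice(multi_lead_time_ensemble, 6))
print(member_line(multi_lead_time_ensemble, 0))
```

If every point on a member line came from the same input vector, that line would match the definition of a family input slice; with a different input per lead time it is simply one line of the multi lead time ensemble.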

General Ensemble Requirements

  • Running a single Cold Stunning model per lead time.
  • Running each model over multiple The Weather Company predictions.
  • Saving Ensemble Predictions.
  • Graphing Ensemble Predictions in Flare
    • Ribbon Graph (Grouped by lead time)
    • Spaghetti Graph (Grouped by ensemble member)

Separated Requirements

Same model multi run requirements:

  • Be able to run the same model 100 times per lead time [3,6,12,18,24,30,36,42,48,54,60,66,72,78,84,90,96,102,108,114,120]
  • Run models in batches. (TensorFlow loads the model once, we feed it every input vector in order, and it runs them all at the same time.)
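
A minimal sketch of the batching idea follows. It uses a stand-in class instead of a real TensorFlow model so that it runs without any dependencies; `StubModel` and `run_ensemble_batch` are hypothetical names. With a real Keras model the analogous step would be a single `model.predict` call on an array with one row per ensemble member.

```python
class StubModel:
    """Stand-in for a loaded H5 model; counts how often predict is called."""

    def __init__(self):
        self.predict_calls = 0

    def predict(self, batch):
        # A real model would return one prediction per row of the batch.
        # Here each "prediction" is just the mean of the input vector.
        self.predict_calls += 1
        return [sum(vector) / len(vector) for vector in batch]


def run_ensemble_batch(model, input_vectors):
    """Run every ensemble member's input vector through the model in one call."""
    batch = [list(vector) for vector in input_vectors]
    predictions = model.predict(batch)
    # Order is preserved, so index i is ensemble member i.
    return dict(enumerate(predictions))


model = StubModel()  # loaded once for this lead time
vectors = [[10.0, 12.0], [11.0, 13.0], [9.0, 9.0]]  # one vector per member
results = run_ensemble_batch(model, vectors)
print(results)
print(model.predict_calls)
```

The point of the design is that the model is loaded once and `predict` is called once per lead time, rather than once per ensemble member.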

The Weather Company Ingestion requirements:

  • We need a valid API key (Brian has the API Key)
    • Brian's Query: https://api.weather.com/v3/wx/forecast/probabilistic?format=json&units=e&geocode=LAT,LON&percentiles=temperature:5:95&prototypes=temperature:100&apiKey=A_VALID_API_KEY
  • Should be just one query to The Weather Company
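
As a sketch, the single request could be assembled from the query shown above like this. The parameter values are copied from that URL; the function name and the example coordinates are assumptions, and nothing here sends the request or assumes a response format.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.weather.com/v3/wx/forecast/probabilistic"


def build_query_url(lat, lon, api_key):
    """Build the single probabilistic-forecast request for one location."""
    params = {
        "format": "json",
        "units": "e",
        "geocode": f"{lat},{lon}",
        "percentiles": "temperature:5:95",
        "prototypes": "temperature:100",
        "apiKey": api_key,
    }
    # Keep ':' and ',' unescaped so the URL matches the form shown above.
    return f"{BASE_URL}?{urlencode(params, safe=':,')}"


# Example coordinates are placeholders, not a real station location.
url = build_query_url(27.5, -97.3, "A_VALID_API_KEY")
print(url)
```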

Ensemble Predictions:

  • (soft) Be able to save a prediction by its lead time and ensemble member
  • Group outputs by run groups (grouping by ensemble)
    • Groups should be implicit: knowing that an ensemble was run should be enough to identify its group.
    • We don't want to have to define groups in the Database.
  • Think about what query we would need to pull data out of this system. (By Lead time or By Ensemble Member)
  • Should groups contain the same model name? Should a family model name be provided in the DSPEC?

DSPEC 3.0:

  • Be able to define multiple input vectors.
  • Be able to define multiple models. (You should reference what model and then a list of input vectors)
    • If we see multiple models, each with multiple runs, we can assume this is a group.
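
One possible shape for such a spec is sketched below as a Python dictionary. Every key name, model name, and input name is an assumption made for illustration; the actual DSPEC 3.0 schema is still to be designed.

```python
# Hypothetical DSPEC 3.0 fragment: named input vectors, plus models that
# each reference a list of those vectors. All names here are invented.
dspec = {
    "inputVectors": {
        "member_000": ["airTemp_member_000", "waterTemp"],
        "member_001": ["airTemp_member_001", "waterTemp"],
    },
    "models": [
        {"modelName": "coldStunning_3hr", "leadTimeHours": 3,
         "inputVectors": ["member_000", "member_001"]},
        {"modelName": "coldStunning_6hr", "leadTimeHours": 6,
         "inputVectors": ["member_000", "member_001"]},
    ],
}


def expand_runs(spec):
    """List every (model, lead time, input vector) run the spec implies."""
    return [
        (model["modelName"], model["leadTimeHours"], vector_name)
        for model in spec["models"]
        for vector_name in model["inputVectors"]
    ]


def is_group(spec):
    """Multiple models, each run over multiple vectors, implies a group."""
    models = spec["models"]
    return len(models) > 1 and all(len(m["inputVectors"]) > 1 for m in models)


runs = expand_runs(dspec)
print(len(runs), is_group(dspec))
```

Under this layout the group never has to be declared: it falls out of the spec having more than one model and more than one input vector per model.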

Database:

  • Be able to store multiple values for the same lead time and validation time
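
The storage and query requirements can be explored with a small in-memory SQLite sketch. The table name, column names, and values are assumptions, not the Semaphore schema. An ensemble member column lets several values share a lead time and validation time, and rows that share a generated time form the implicit group, so no separate groups table is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ensemble_outputs (
        model_name      TEXT    NOT NULL,
        generated_time  TEXT    NOT NULL,  -- when the ensemble run was made
        lead_time_hours INTEGER NOT NULL,
        validation_time TEXT    NOT NULL,
        ensemble_member INTEGER NOT NULL,
        value           REAL    NOT NULL,
        UNIQUE (model_name, generated_time, lead_time_hours, ensemble_member)
    )
    """
)

generated = "2025-01-20T00:00"
rows = [
    ("coldStunning_3hr", generated, 3, "2025-01-20T03:00", member, value)
    for member, value in enumerate([14.2, 14.6, 13.9])
] + [
    ("coldStunning_6hr", generated, 6, "2025-01-20T06:00", member, value)
    for member, value in enumerate([13.8, 14.1, 13.1])
]
conn.executemany("INSERT INTO ensemble_outputs VALUES (?, ?, ?, ?, ?, ?)", rows)

# By lead time: every member's value at one lead time (ribbon graph input).
by_lead_time = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM ensemble_outputs "
        "WHERE generated_time = ? AND lead_time_hours = ? "
        "ORDER BY ensemble_member",
        (generated, 6),
    )
]

# By ensemble member: one value per lead time (one spaghetti graph line).
by_member = conn.execute(
    "SELECT lead_time_hours, value FROM ensemble_outputs "
    "WHERE generated_time = ? AND ensemble_member = ? "
    "ORDER BY lead_time_hours",
    (generated, 0),
).fetchall()

print(by_lead_time)
print(by_member)
```

Whether rows should also carry a family model name, as raised under Ensemble Predictions, is left open here; this sketch groups on generated time alone.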