Model Operationalization Requirements and Preparation - conrad-blucher-institute/semaphore GitHub Wiki

Model Operationalization Requirements

Before our team can operationalize an AI model, the model must meet the following requirements.

Model Naming Convention

The naming format should be:

Location_Series_Leadtime.h5

  • Example: Virginia-Key_Water-Level_12hr.h5

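Underscores separate the three fields; as in the example, hyphens join the words within a field. This format lets us identify the model’s location, data series, and lead time from the file name alone.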
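
As a rough sketch, a file name in this format could be checked before a model is handed over. The pattern below assumes that underscores appear only between the three fields, as in the example above. The names `MODEL_NAME` and `parse_model_name` are hypothetical and are not part of Semaphore.

```python
import re

# Assumed pattern: three underscore-free fields followed by the .h5 extension.
MODEL_NAME = re.compile(
    r"^(?P<location>[^_]+)_(?P<series>[^_]+)_(?P<lead_time>[^_]+)\.h5$"
)


def parse_model_name(filename):
    """Split a model file name into its location, series, and lead-time fields."""
    match = MODEL_NAME.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow Location_Series_Leadtime.h5")
    return match.groupdict()


fields = parse_model_name("Virginia-Key_Water-Level_12hr.h5")
print(fields)
```

A name with a missing field, such as `Water-Level_12hr.h5`, raises `ValueError` instead of being silently accepted.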

Data Input Requirements

Ascending Order:

  • Input data must be arranged in ascending order by time (earliest value first), as in the table below, for correct processing and accurate predictions.

  • Data should also be organized by series. For example, Air Temperature input data should be in the following format:

    Time        | Air Temperature
    ------------------------------
    01:00       | 20°F
    02:00       | 21°F
    03:00       | 22°F
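
The ordering requirement can be checked with a few lines of code. The sketch below uses the rows from the table above and assumes that "ascending" means strictly increasing timestamps with no duplicates. The helper names `is_ascending` and `sort_by_time` are illustrative only.

```python
from datetime import datetime


def is_ascending(times):
    """Return True if every timestamp is strictly later than the one before it."""
    return all(earlier < later for earlier, later in zip(times, times[1:]))


def sort_by_time(rows):
    """Return (time, value) rows sorted from earliest to latest."""
    return sorted(rows, key=lambda row: row[0])


def parse(clock):
    """Parse an HH:MM string such as '01:00' into a datetime."""
    return datetime.strptime(clock, "%H:%M")


# Rows from the Air Temperature table, deliberately shuffled.
rows = [(parse("02:00"), 21), (parse("03:00"), 22), (parse("01:00"), 20)]

print(is_ascending([time for time, _ in rows]))     # False: rows are out of order
ordered = sort_by_time(rows)
print(is_ascending([time for time, _ in ordered]))  # True after sorting
```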
    

Model Accuracy Testing

  • Measured Data Testing: Create a chart comparing your model predictions against measured data. This visual comparison will help verify the model's accuracy and provide a clearer understanding of the expected Semaphore output.
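
One possible way to build such a chart is sketched below. It assumes predictions and measurements are available as dictionaries keyed by timestamp and that matplotlib is installed. The function names, the output path, and the sample values are made up for illustration and do not come from Semaphore.

```python
def pair_by_time(predicted, measured):
    """Return (time, predicted, measured) tuples for timestamps present in both
    inputs, in ascending time order."""
    shared = sorted(set(predicted) & set(measured))
    return [(t, predicted[t], measured[t]) for t in shared]


def plot_comparison(pairs, path="comparison.png"):
    """Draw measured and predicted values on one chart and save it to `path`."""
    import matplotlib.pyplot as plt  # imported here so pairing works without it

    times = [t for t, _, _ in pairs]
    fig, ax = plt.subplots()
    ax.plot(times, [m for _, _, m in pairs], label="Measured")
    ax.plot(times, [p for _, p, _ in pairs], label="Predicted")
    ax.set_xlabel("Time")
    ax.set_ylabel("Value")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)


# Made-up sample values keyed by hour, for illustration only.
predicted = {1: 2.0, 2: 2.5, 3: 3.0}
measured = {2: 2.4, 3: 3.1, 4: 3.3}

pairs = pair_by_time(predicted, measured)
print(pairs)
# plot_comparison(pairs)  # uncomment to write comparison.png (requires matplotlib)
```

Only timestamps that have both a prediction and a measurement are plotted, so the two lines cover the same period.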

Once your model meets these requirements, we can begin the operationalization process.


Model Operationalization Preparation

This section helps you prepare for your meeting with our team by outlining the information we need to operationalize your model. Please review each of the following areas and be ready to provide answers.

Information Required for Model Operationalization

To ensure a smooth process, the team will need the following details about your model:

  • Author Name
  • Location: The area the model is designed to predict.
  • Lead Time: How far ahead of the current time the model predicts (e.g., 12 hours).
  • Prediction Interval: Specify the time interval required for predictions (e.g., hourly, every 3 hours).
  • Input Data Order: Describe the order in which input data should be fed to the model.
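
To illustrate why the input data order matters, the sketch below assembles a model input from several series in a declared order. It assumes the model takes a single flat vector made by concatenating the series, which may not match your model. `build_input_vector` and the series names are hypothetical.

```python
def build_input_vector(input_order, series_values):
    """Concatenate the values of each series in the order the model expects."""
    vector = []
    for name in input_order:
        if name not in series_values:
            raise KeyError(f"missing input series: {name}")
        vector.extend(series_values[name])
    return vector


# The order declared by the model author (illustrative names).
input_order = ["water_level", "air_temperature"]

# Values may be collected in any order; the declared order decides the layout.
series_values = {
    "air_temperature": [20.0, 21.0],
    "water_level": [0.5, 0.6, 0.7],
}

vector = build_input_vector(input_order, series_values)
print(vector)
```

Feeding the same values in a different order would produce a different vector, which is why the order needs to be stated explicitly.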

Information Required for Each Model Input

For every input used in your model, please provide the following:

  • Location: The specific location relevant to the input data.
  • API Source: The API from which the data is sourced.
  • Series: The data type (e.g., actual surge, water level, air temperature).
  • Data Range: The range of data relative to the current time (e.g., from 5 hours before "now" to 6 hours after "now").
  • Data Interval: Frequency of the data points (e.g., hourly, every 6 minutes).
  • Data Unit: Units of measurement (e.g., meters per second (mps), meters, feet).
  • Datum (Optional): Reference level, if applicable (e.g., NAVD).
  • Interpolation Requirements:
    • Is interpolation necessary for the data?
    • If yes, specify the type of interpolation and the maximum limit for interpolation.
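
The data range, data interval, and interpolation limit fit together as sketched below. The sketch makes several assumptions: both ends of the range are inclusive, the interpolation is linear, and the maximum limit is a count of consecutive missing points. The function names are illustrative, and Semaphore's own handling may differ.

```python
from datetime import datetime, timedelta


def expected_times(now, start_offset, end_offset, interval):
    """List timestamps from now+start_offset to now+end_offset (inclusive)."""
    times = []
    current = now + start_offset
    end = now + end_offset
    while current <= end:
        times.append(current)
        current += interval
    return times


def interpolate_gaps(values, max_gap):
    """Linearly fill runs of None that are at most max_gap points long and have
    a known value on both sides; longer or edge gaps are left as None."""
    filled = list(values)
    i, n = 0, len(filled)
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        gap = i - start
        if start > 0 and i < n and gap <= max_gap:
            left, right = filled[start - 1], filled[i]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                filled[start + k] = left + step * (k + 1)
    return filled


# "From 5 hours before now to 6 hours after now", hourly.
now = datetime(2024, 1, 1, 12, 0)
times = expected_times(now, timedelta(hours=-5), timedelta(hours=6), timedelta(hours=1))
print(len(times), times[0], times[-1])

# One missing point is filled; the run of three exceeds the limit of two.
print(interpolate_gaps([1.0, None, 3.0, None, None, None, 7.0], max_gap=2))
```

Stating the limit up front tells the team when a gap is too long to fill and the prediction should be skipped or flagged instead.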

Additional Notes

  • Special Processing Requirements: If your model’s input data requires any special processing, please inform us of the specifics.

Providing this information will help our team accurately operationalize your model, ensuring it integrates smoothly and functions reliably.