validation - nismod/smif GitHub Wiki
smif: Data validation should ease the process of debugging the configuration of a model run.
Validation should be:
- optional: in debug mode, validation should be enabled, but has the potential to be data intensive if it results in multiple read-writes to the store
- close to the DataInterface: validating the objects returned from the store will reduce the amount of duplicate code written, and ease testing against existing fixtures
- well documented
- tested
- either use the Python warnings.warn system, with warnings derived from UserWarning, or some other mechanism (e.g. errors), producing reports for a user to help their debugging
- able to block a model run from being performed if one or more validation warnings are raised
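A minimal sketch of the warnings-based option follows, assuming a hypothetical ValidationWarning class and a plain dict as the model run configuration (neither is taken from smif). It collects every warning raised during validation so that a caller can report them and refuse to start the run if the list is non-empty.

```python
import warnings


class ValidationWarning(UserWarning):
    """Hypothetical base class for validation warnings (not part of smif)."""


def validate_config(config):
    """Warn about each problem found in an illustrative model run config dict."""
    if "name" not in config:
        warnings.warn("model run has no name", ValidationWarning)
    if not config.get("timesteps"):
        warnings.warn("model run defines no timesteps", ValidationWarning)


def check_before_run(config):
    """Return validation messages; an empty list means the run may proceed."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        validate_config(config)
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, ValidationWarning)
    ]
```

A caller could then block the run with something like `if check_before_run(config): ...`, printing the messages as the report for the user.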
More ideas after discussing with Tom:
- smif validate command could perform a dry run of a model run, checking all configuration, the formats of scenario data files, that all required data meets the requirements of models, etc.
- validation within DataInterface checks types and formats of data dicts and objects returned
- within a model run, validation can also cross-check e.g. initial conditions against interventions
Stories
Test Store raises DataNotFound if parameter doesn't exist in model
Labels:
- smif
- validation
ScenarioForm validate that each variant provides all sources
Labels:
- gui
- smif
- validation
Scenario variants should define sources for all outputs the scenario provides. This currently breaks the datafile handler.
The datahandler should validate that each variant is completed on save.
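The completeness check on save could look roughly like the sketch below. The dict layout (a "provides" list of named outputs, and a "data" mapping per variant) is an assumption made for illustration and may not match the actual scenario configuration.

```python
def incomplete_variants(scenario):
    """Map each variant name to the provided outputs it has no source for.

    The field names 'provides', 'variants' and 'data' are assumed here.
    """
    provides = {output["name"] for output in scenario["provides"]}
    missing = {}
    for variant in scenario["variants"]:
        absent = sorted(provides - set(variant.get("data", {})))
        if absent:
            missing[variant["name"]] = absent
    return missing
```

An empty result would mean every variant is complete and the scenario can be saved.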
Dependencies should check absolute range
Labels:
- smif
- validation
It shouldn't be possible to link specs with different absolute ranges - values valid in one spec may be invalid under the other.
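As an illustration only, the check could compare the ranges when a dependency is created. The Spec class below is a hypothetical stand-in with an abs_range tuple; the real spec object may store its range differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    """Hypothetical stand-in for a data spec with an absolute (min, max) range."""
    name: str
    abs_range: tuple


def check_dependency(source: Spec, sink: Spec) -> None:
    """Raise ValueError if the linked specs declare different absolute ranges."""
    if source.abs_range != sink.abs_range:
        raise ValueError(
            f"Cannot link '{source.name}' {source.abs_range} to "
            f"'{sink.name}' {sink.abs_range}: absolute ranges differ"
        )
```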
Report {error: message} responses in GUI
Labels:
- errors
- smif
- validation
Pass the message through to alert-danger boxes.
Catch SmifException at HTTP API and return {error: message}
Labels:
- errors
- smif
- validation
Possibly use appropriate HTTP codes - 404 if not found.
500 should probably be reserved for unexpected errors (i.e. not SmifException)
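One way to sketch this mapping, independent of any web framework, is shown below. The exception classes are local stand-ins, and both the assumption that DataNotFound derives from SmifException and the choice of 400 for other SmifException cases are illustrative, not decisions recorded in this story.

```python
class SmifException(Exception):
    """Local stand-in for the project's base exception."""


class DataNotFound(SmifException):
    """Local stand-in; assumed here to derive from SmifException."""


def error_response(exc):
    """Translate an exception into an ({error: message}, status code) pair."""
    if isinstance(exc, DataNotFound):
        return {"error": str(exc)}, 404
    if isinstance(exc, SmifException):
        # Assumption: other expected errors are reported as a bad request.
        return {"error": str(exc)}, 400
    # Unexpected errors: do not leak internals, reserve 500 for these.
    return {"error": "unexpected server error"}, 500
```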
Catch SmifException at CLI (top level) and report, exit(-1)
Labels:
- errors
- smif
- validation
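A rough sketch of the top-level handler, assuming the CLI entry point can be passed in as a callable (run_cli and its arguments are hypothetical names, and SmifException is a local stand-in):

```python
import sys


class SmifException(Exception):
    """Local stand-in for the project's base exception."""


def run_cli(command, stderr=sys.stderr):
    """Run a CLI command; report SmifException without a stack trace."""
    try:
        command()
    except SmifException as exc:
        print(f"Error: {exc}", file=stderr)
        return -1
    return 0


# Typical use at the very top level (main is hypothetical):
#     sys.exit(run_cli(main))
```

Note that on POSIX systems an exit status of -1 is reported to the shell as 255.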
Ensure data homogeneity: add import scripts, ensure read and write of config and data use one of "our" formats
Labels:
- smif
- validation
At present, our datafile interface reads in from a variety of file types, and writes as either csv or binary. It would be cleaner to separate the import of user data from the caching or persistence of configuration and results/model data.
Refactor shape validation into file interface
Labels:
- smif
- validation
Building upon #156734556, move shape validation into datafile interface to validate region definitions upon initialisation (and add a test).
Smif run crashes if unused region_definitions are not present on disk
Labels:
- smif
- validation
Smif run requires all of the region_definitions to be present on disk before it is able to run a modelrun, even when a region_definition is not used by that particular modelrun.
I think this is illogical, but it is a question of where we put these boundaries.
If missing data for a scenario year, raise DataNotFoundError (or similar clear message)
Labels:
- errors
- smif
- validation
Currently smif raises a DataMismatchError:
smif.data_layer.data_interface.DataMismatchError: Number of observations (0) is not equal to intervals (391) x regions (1)
See also #155182033
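A possible shape for the clearer error, as a sketch: check for the empty case first and only then compare counts. The function name and arguments are hypothetical, and the two exception classes are local stand-ins.

```python
class DataNotFoundError(Exception):
    """Local stand-in for the proposed error."""


class DataMismatchError(Exception):
    """Local stand-in for the error currently raised."""


def check_observations(observations, n_intervals, n_regions, year):
    """Distinguish 'no data for this year' from 'wrong amount of data'."""
    if len(observations) == 0:
        raise DataNotFoundError(f"No data found for scenario year {year}")
    expected = n_intervals * n_regions
    if len(observations) != expected:
        raise DataMismatchError(
            f"Number of observations ({len(observations)}) is not equal to "
            f"intervals ({n_intervals}) x regions ({n_regions})"
        )
```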
Raise errors from objects if data is invalid, handle at cli-controller
Labels:
- data_handle
- smif
- validation
Propose design:
- don't validate eagerly if running a model
- let objects complain by raising errors if they are misconfigured
- catch errors at the controller level (e.g. in cli/init), communicating back to the user with clear message, only include stack trace if it contains useful info.
Validate objects saved through HTTP API
Labels:
- smif
- validation

- allow incomplete objects to be saved
- if incomplete, could return warning (?)
- if references missing object, return warning or error (?)
- raise and return error if extra (unexpected) data is posted
- raise and return error if a data type is unexpected
- raise and return error if expected size or length is exceeded
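The rules above could be sketched as a single function that returns errors and warnings separately, so that incomplete objects can still be saved. The schema, field names and length limit below are invented for illustration.

```python
# Hypothetical schema: field name -> expected type (not taken from smif).
SCHEMA = {"name": str, "description": str, "timesteps": list}
MAX_LENGTH = 255  # assumed limit on string length or list size


def validate_posted(data):
    """Return (errors, warnings) for a posted object; errors block the save."""
    errors, missing = [], []
    for key, value in data.items():
        if key not in SCHEMA:
            errors.append(f"unexpected field: {key}")
        elif not isinstance(value, SCHEMA[key]):
            errors.append(
                f"field '{key}' should be {SCHEMA[key].__name__}, "
                f"got {type(value).__name__}"
            )
        elif len(value) > MAX_LENGTH:
            errors.append(f"field '{key}' exceeds {MAX_LENGTH} items")
    for key in SCHEMA:
        if key not in data:
            # Incomplete objects are allowed, so this is only a warning.
            missing.append(f"missing field: {key}")
    return errors, missing
```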
smif validate <modelrun_id> validates a modelrun and all referenced config and data
Labels:
- smif
- validation

- cli method
- read all linked config, accumulate list of errors (with line in file if possible)
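A sketch of the "accumulate, then report" approach is given below: every problem is appended to a list instead of raising on the first one. The dict layout and the particular checks are assumptions for illustration; line numbers are omitted because they depend on how the files are parsed.

```python
def validate_model_run(model_run, sector_models):
    """Collect all problems found in a model run and its linked config.

    model_run: dict with an assumed 'sector_models' list of names.
    sector_models: dict mapping name -> config dict (assumed layout).
    """
    errors = []
    for name in model_run.get("sector_models", []):
        config = sector_models.get(name)
        if config is None:
            errors.append(f"model run references unknown sector model '{name}'")
            continue  # nothing further to check for a missing config
        if not config.get("outputs"):
            errors.append(f"sector model '{name}' declares no outputs")
    return errors
```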
smif validate should validate model config and all sector model configurations
Labels:
- cli
- smif
- validation
smif validate was removed in smif 0.6
See also #137789573
DataHandle should error if outputs are not provided by a SectorModel
Labels:
- smif
- validation
Since we know the outputs that are declared, and they are all written to results through a DataHandle, we could record what is written and error (after SectorModel.simulate returns) if any output was not recorded.
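As a sketch of the recording idea, the class below is a simplified stand-in for a data handle, not the real DataHandle. It notes each output as it is written and can be asked afterwards whether anything declared is missing; the method names and the use of ValueError are assumptions.

```python
class RecordingDataHandle:
    """Simplified stand-in that records which declared outputs were written."""

    def __init__(self, declared_outputs):
        self._declared = set(declared_outputs)
        self._written = set()
        self.results = {}

    def set_results(self, output_name, data):
        """Store a result and remember that this output was provided."""
        self.results[output_name] = data
        self._written.add(output_name)

    def check_outputs(self):
        """Call after simulate returns; raise if any output was not written."""
        missing = sorted(self._declared - self._written)
        if missing:
            raise ValueError(f"outputs not provided: {', '.join(missing)}")
```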
DataHandle should warn if an input or parameter is not accessed by SectorModel.simulate
Labels:
- smif
- validation
Since we know the parameters and inputs that are declared, and they are all accessed through a DataHandle, we could record what is accessed and warn (after SectorModel.simulate returns) if any input or parameter was not used.
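The same recording idea applies on the read side. The sketch below uses a simplified stand-in class (not the real DataHandle, and with assumed method names) that tracks each access and emits a UserWarning for anything declared but never read.

```python
import warnings


class AccessRecordingHandle:
    """Simplified stand-in that records which declared inputs were accessed."""

    def __init__(self, data):
        self._data = dict(data)  # declared inputs and parameters by name
        self._accessed = set()

    def get_data(self, name):
        """Return a value and remember that it was used."""
        self._accessed.add(name)
        return self._data[name]

    def warn_unused(self):
        """Call after simulate returns; warn about each unused name."""
        unused = sorted(set(self._data) - self._accessed)
        for name in unused:
            warnings.warn(f"'{name}' was declared but never accessed", UserWarning)
        return unused
```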
validation: raise warning if SectorModel instance (wrapper) returns outputs, but output.yaml is empty or contradictory
Labels:
- smif
- validation
At present, we do not check that the model outputs specified in output.yaml match the outputs actually generated by the SectorModel wrapper.