$data (v5 proposal) - sgpinkus/json-schema GitHub Wiki

THIS WIKI IS OBSOLETE. PLEASE SEE THE NEW JSON-SCHEMA-ORG/JSON-SCHEMA-SPEC REPOSITORY.


NOTE: This proposal has been migrated to https://github.com/json-schema-org/json-schema-spec/issues/51


Proposed keywords

  • $data

This keyword would be available:

  • inside any schema
  • contained in an object ({"$data": ...}) for the following schema properties:
    • minimum/maximum
    • exclusiveMinimum/exclusiveMaximum
    • minItems/maxItems
    • enum
    • more...
  • contained in an object ({"$data": ...}) for the following LDO properties:
    • href
    • rel
    • title
    • mediaType
    • more...

Purpose

This keyword would allow schemas to use values from the data, specified using JSON Pointers or Relative JSON Pointers.

This allows more complex behaviour, including interaction between different parts of the data.

When used inside LDOs, this allows extraction of many more link attributes/parameters from the data.

Values

Wherever it is used, the value of $data is a JSON Pointer or Relative JSON Pointer.

Behaviour

If the $data keyword is defined in a schema, then before any further processing of the schema:

  • The value of $data is interpreted as a JSON Pointer or Relative JSON Pointer.
  • JSON Pointers are resolved from the root of the instance document (the data), while Relative JSON Pointers are resolved relative to the instance currently being validated or processed.
  • The resolved value is taken to be the value of the schema for all further processing.
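The resolution steps above can be sketched in Python. This is only an illustration under stated assumptions, not part of the proposal: the function names (resolve_data_pointer, effective_schema) are hypothetical, and the location of the current instance is assumed to be tracked as a list of keys/indices (current_path) leading from the root of the data.

```python
import re

def _tokens(pointer_part):
    # Split "/a/b" into reference tokens and undo JSON Pointer escaping:
    # "~1" becomes "/" first, then "~0" becomes "~".
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer_part.split("/")[1:]]

def _walk(root, tokens):
    # Follow the tokens down from the root; array steps use integer indices.
    node = root
    for token in tokens:
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

def resolve_data_pointer(root, current_path, pointer):
    """Resolve a JSON Pointer or Relative JSON Pointer against the data.

    current_path (an assumption of this sketch) is the list of keys/indices
    leading from the root to the instance currently being validated.
    """
    if pointer == "" or pointer.startswith("/"):
        # Plain JSON Pointer: always resolved from the root.
        return _walk(root, _tokens(pointer))
    match = re.fullmatch(r"(0|[1-9][0-9]*)(#|(?:/.*)?)", pointer, re.DOTALL)
    if match is None:
        raise ValueError(f"not a (Relative) JSON Pointer: {pointer!r}")
    up, rest = int(match.group(1)), match.group(2)
    if up > len(current_path):
        raise ValueError(f"{pointer!r} climbs above the root")
    # Step `up` levels towards the root, starting at the current instance.
    base = list(current_path[:len(current_path) - up])
    if rest == "#":
        # The "#" form yields the key or index of the location reached.
        if not base:
            raise ValueError("the root has no key or index")
        parent = _walk(root, base[:-1])
        return int(base[-1]) if isinstance(parent, list) else base[-1]
    return _walk(root, base + _tokens(rest))

def effective_schema(schema, root, current_path):
    # A schema that defines $data is replaced by the value it points at.
    if isinstance(schema, dict) and "$data" in schema:
        return resolve_data_pointer(root, current_path, schema["$data"])
    return schema
```

For instance, with the data {"rules": {"type": "string"}, "value": "x"} and the current instance at ["value"], both {"$data": "/rules"} and {"$data": "1/rules"} would resolve to {"type": "string"} in this sketch.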

When used in one of the permitted schema/LDO properties, then before any further processing of the schema/LDO:

  • The value of $data is interpreted as a JSON Pointer or Relative JSON Pointer.
  • JSON Pointers are resolved from the root of the instance document (the data), while Relative JSON Pointers are resolved relative to the instance currently being validated or processed.
  • The resolved value is substituted as the property value.

Example

{
    "type": "object",
    "properties": {
        "smaller": {"type": "number"},
        "larger": {
            "type": "number",
            "minimum": {"$data": "1/smaller"},
            "exclusiveMinimum": true
        }
    },
    "required": ["larger", "smaller"]
}

In the above example, the "larger" property must be strictly greater than the "smaller" property.
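To make the substitution step concrete, here is a rough Python sketch of how a validator might process the "larger" sub-schema from this example. Everything named here is hypothetical (DATA_KEYWORDS, resolve, substitute, check_minimum, larger_is_valid). The sketch assumes that a substitutable value is an object whose only key is $data, that the current instance location is tracked as a list of keys, and that exclusiveMinimum is the boolean form used in the example. The "#" form of Relative JSON Pointers is left out for brevity.

```python
import re

# Keywords allowed to hold {"$data": ...} in this sketch (a subset of the list above).
DATA_KEYWORDS = {"minimum", "maximum", "minItems", "maxItems", "enum"}

def resolve(root, path, pointer):
    """Resolve a JSON Pointer or Relative JSON Pointer (without the "#" form)."""
    if pointer == "" or pointer.startswith("/"):
        base, rest = [], pointer              # absolute: start at the root
    else:
        match = re.fullmatch(r"(0|[1-9][0-9]*)((?:/.*)?)", pointer, re.DOTALL)
        if match is None or int(match.group(1)) > len(path):
            raise ValueError(f"cannot resolve {pointer!r}")
        # relative: climb `up` levels from the current instance
        base = list(path[:len(path) - int(match.group(1))])
        rest = match.group(2)
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in rest.split("/")[1:]]
    node = root
    for token in base + tokens:
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node

def substitute(schema, root, path):
    """Return a copy of schema with permitted {"$data": ...} values replaced."""
    result = dict(schema)
    for keyword in DATA_KEYWORDS:
        value = schema.get(keyword)
        if isinstance(value, dict) and set(value) == {"$data"}:
            result[keyword] = resolve(root, path, value["$data"])
    return result

def check_minimum(schema, value):
    """minimum with a boolean exclusiveMinimum, as written in the example."""
    if "minimum" not in schema:
        return True
    if schema.get("exclusiveMinimum", False):
        return value > schema["minimum"]
    return value >= schema["minimum"]

larger_schema = {
    "type": "number",
    "minimum": {"$data": "1/smaller"},
    "exclusiveMinimum": True,
}

def larger_is_valid(instance):
    # "larger" lives at path ["larger"]; "1/smaller" climbs to the parent
    # object and then reads its "smaller" property.
    effective = substitute(larger_schema, instance, ["larger"])
    return check_minimum(effective, instance["larger"])
```

Under these assumptions, larger_is_valid({"smaller": 1, "larger": 2}) passes, while an instance with equal values fails because the minimum is exclusive.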

Concerns

Theoretical purity

Currently, validation is "context-free", meaning that one part of the data has minimal effect on the validation of another part. This has an effect on things like referencing sub-schemas. Changing this is a big issue, and should not be done lightly.

Some interplay of different parts of the data can currently be specified using oneOf (and the proposed switch) - but crucially, these constraints are specified in the schema for a common parent node, meaning that sub-schema referencing is still simple.

The use of $data also (in some cases) limits the amount of static analysis that can be done on schemas, because their behaviour becomes much more data-dependent. However, the expressive power it opens up is quite substantial.

Not available for all keywords

It is tempting to allow $data for all schema keywords. However, that is a bad idea for keywords such as properties/id, and it might also present an obstacle to anybody extending the standard.

Not available inside enum values

It should be noted that while {"enum": {"$data":...}} would extract a list of possible values from the data, {"enum": [{"$data":...}]} would not - it would in fact specify that there is only one valid value: {"$data":...}.
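The distinction can be illustrated with a small Python sketch. It assumes, as one possible reading of the proposal, that substitution applies only when the keyword's own value is an object whose sole key is $data, with no recursion into arrays. The names expand_keyword and lookup are hypothetical, and lookup is a stand-in resolver that knows only the single pointer used here.

```python
def expand_keyword(value, resolve):
    """Substitute only when the keyword's own value is a {"$data": ...} object."""
    if isinstance(value, dict) and set(value) == {"$data"}:
        return resolve(value["$data"])
    # Arrays and all other values are kept as literals; nothing inside is expanded.
    return value

data = {"allowed": ["red", "green"], "colour": "red"}

# Stand-in resolver for this illustration: maps one pointer to its value.
lookup = {"1/allowed": data["allowed"]}.__getitem__

# {"enum": {"$data": "1/allowed"}}: the list of values comes from the data.
from_data = expand_keyword({"$data": "1/allowed"}, lookup)

# {"enum": [{"$data": "1/allowed"}]}: the only valid value is the literal object.
literal = expand_keyword([{"$data": "1/allowed"}], lookup)

print(data["colour"] in from_data)
print(data["colour"] in literal)
```

In the second case the instance would have to be the object {"$data": "1/allowed"} itself in order to match the enum.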

Similar concerns would exist with an extra keyword like constant - what if you want the constant value to be a literal {"$data":...}? However, perhaps constant could be given this data-templating ability, and if you want a literal {"$data":...}, then you can still use enum.

Describing using the meta-schema

The existing mechanics of $ref can be nicely described using a rel="full" link relation.

The mechanics of $data, however, would be impossible to even approach in the meta-schema. We could describe the syntax, but nothing more. Is this a problem?