Multilingual meta data (v5 proposal) - sgpinkus/json-schema GitHub Wiki

THIS WIKI IS OBSOLETE. PLEASE SEE THE NEW JSON-SCHEMA-ORG/JSON-SCHEMA-SPEC REPOSITORY.


NOTE: This proposal has been migrated to https://github.com/json-schema-org/json-schema-spec/issues/53


Proposed keywords

This proposal modifies the existing keywords:

  • title
  • description

This proposal would also apply to the proposed enumNames keyword, if that makes it in.

Purpose

This modification would allow the inclusion of multiple translated values for these keywords.

Currently, schemas can specify meta-data in only one language at a time. A client may request a different localisation using the HTTP Accept-Language header, but getting several localisations that way requires several (largely redundant) requests, and the mechanism is only available over HTTP (not when pre-loading schemas, for instance).

Values

In addition to the current string values (which are presumed to be in the language of the document), the values of these keywords may be an object.

The keys of such an object should be IETF Language Tags (BCP 47, for example "en" or "de"), and the values must be strings.

Behaviour

When the value of the keyword is an object, the client should select the most appropriate language tag and use the corresponding string as the value of the keyword.

Example

{
    "title": {
        "en": "Example schema",
        "de": "..."
    }
}
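
The proposal does not say how a client should pick "the most appropriate" tag. The following Python sketch shows one possible reading, loosely modelled on the prefix-truncation lookup of RFC 4647. The function name pick_localised, the preference list, and the fall-back to the first entry are assumptions for illustration, not part of the proposal.

```python
from typing import Mapping, Optional, Sequence, Union

# A keyword value is either a plain string or an object keyed by language tag.
Localised = Union[str, Mapping[str, str]]


def pick_localised(value: Localised, preferred: Sequence[str]) -> Optional[str]:
    """Return the best string for `value`, given language tags in preference order."""
    # Plain strings keep their current meaning: the language of the document.
    if isinstance(value, str):
        return value

    # Language tags compare case-insensitively, so index by lowercased tag.
    by_tag = {tag.lower(): text for tag, text in value.items()}

    for wanted in preferred:
        parts = wanted.lower().split("-")
        # Try the full tag first, then progressively shorter prefixes,
        # e.g. "de-AT" falls back to "de".
        while parts:
            candidate = "-".join(parts)
            if candidate in by_tag:
                return by_tag[candidate]
            parts.pop()

    # Assumption: if nothing matches, use the first entry (None if the object is empty).
    return next(iter(by_tag.values()), None)
```

With a hypothetical value {"en": "Example schema", "de": "Beispielschema"}, a client preferring ["de-AT", "en"] would get the "de" string, because "de-AT" is truncated to "de" before "en" is considered.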

Concerns

Schemas with many languages could end up quite bulky.

In fact, the Accept-Language option is in many ways more elegant, as the majority of the time only one language will be used by the client (and the other localisations will simply be noise). However, this option is not available in all situations. One might also avoid the extra bulk by using JSON references (and thereby also enable localisation files to contain all translatable text).

An alternative approach to the above would be to reserve localeKey as a property for any schema object or sub-object and localization-strings as a top-level property:

{
    "localization-strings": {
        "en": {
            "example": {
                "title": "Example schema",
                "description": "Example schema description"
            }
        },
        "de": {
            "example": {}
        }
    },
    "type": "object",
    "localeKey": "example"
}
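
As a rough sketch of how a client might read this layout, the Python function below looks up a keyword for a schema object via its localeKey. The function name localised_meta and its parameters are hypothetical, and returning None for a missing translation is an assumption; the proposal does not define that case.

```python
from typing import Any, Mapping, Optional


def localised_meta(schema: Mapping[str, Any], root: Mapping[str, Any],
                   language: str, keyword: str) -> Optional[str]:
    """Look up `keyword` (e.g. "title") for `schema` in the root's locale table."""
    # The schema object (or sub-object) names its entry via localeKey.
    key = schema.get("localeKey")
    if key is None:
        return None

    # localization-strings lives on the top-level schema only.
    strings = root.get("localization-strings", {})

    # Missing languages, keys, or keywords all yield None (assumption).
    return strings.get(language, {}).get(key, {}).get(keyword)


# The example document from the text; it is both the root and the schema object.
root = {
    "localization-strings": {
        "en": {"example": {"title": "Example schema",
                           "description": "Example schema description"}},
        "de": {"example": {}},
    },
    "type": "object",
    "localeKey": "example",
}

print(localised_meta(root, root, "en", "title"))
```

A sub-schema would pass itself as schema and the enclosing document as root, so that every level shares a single string table.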

The advantage of this approach is that all language strings could be stored together, as is typical for locale files (which makes independent editing by different translators convenient). With JSON references, it would be a simple matter of:

{
    "localization-strings": {
        "en": {
            "$ref": "locale_en-US.json"
        },
        "de": {
            "$ref": "locale_de.json"
        }
    },
    "type": "object",
    "localeKey": "example"
}

or yet simpler:

{
    "localization-strings": {"$ref": "locales.json"},
    "type": "object",
    "localeKey": "example"
}
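
Both reference forms can be expanded into the same in-memory table before lookup. The Python sketch below assumes a hypothetical load callable that maps a reference string to already-parsed JSON, and it handles only whole-document references (no JSON Pointer fragments, no nested or recursive references).

```python
from typing import Any, Callable, Dict, Mapping


def expand_locale_refs(strings: Any, load: Callable[[str], Any]) -> Dict[str, Any]:
    """Replace {"$ref": ...} objects in a localization-strings value with loaded data."""
    # Form 2: the whole table is a single reference.
    if isinstance(strings, Mapping) and "$ref" in strings:
        strings = load(strings["$ref"])

    expanded: Dict[str, Any] = {}
    for language, table in strings.items():
        # Form 1: each language points at its own locale file.
        if isinstance(table, Mapping) and "$ref" in table:
            table = load(table["$ref"])
        expanded[language] = table
    return expanded
```

In practice load might read a file or fetch a URL; for a quick check it can be as simple as the __getitem__ method of a dict that maps file names to parsed content.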