eYAML - yaml/YAML2 GitHub Wiki

JSON is one of the most popular serialization standards. It is an RFC standard that is widely used within internet services, and JSON parsers are available for many languages. One reason for its popularity is the simplicity of its syntax, along with the speed and small size of the parsers that read it.

These properties make JSON widely popular as an embedded parser for human-readable settings.

However, YAML, while being more readable, has a problem with adoption as an embedded parser. This is mostly due to the complexity of the YAML standard versus JSON. While YAML parsers that adhere to the standard can support all JSON types, the standard also includes many advanced features that are not required in most use cases.

The lack of an embedded YAML profile means that, while the YAML specification is extremely powerful compared to JSON, it is not as easy for programmers to adopt as a simple configuration file reader, so they resort to JSON as the configuration parser. This is undesirable, since they miss out on the simplicity of YAML.

Thus, while we may have "Standard YAML", we should also define a child standard (or set of recommendations) aimed specifically at embedded systems and call it "embedded YAML" (a.k.a. eYAML).

embedded YAML (eYAML) - What is it?

Embedded YAML is a bare-minimum YAML parser that supports at least the same datatypes that exist within JSON, with very few extras. To keep the parser small, the "one true way" philosophy is encouraged (e.g. boolean is true|false only). Quoted and unquoted strings should both still be supported, since it is painful to keep to one.

Think about the difference between Lua and Python.

For datatypes that an eYAML parser doesn't recognise, it should skip the entry and notify the parent program of an unknown entry in the YAML document.
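As a rough illustration of the behaviour described above, here is a minimal Python sketch. It is not a real eYAML implementation: the names `resolve_scalar`, `load_flat` and the `on_unknown` callback are hypothetical, it only handles flat `key: value` lines, and it assumes that an "unknown datatype" shows up as an explicit `!tag` on the value. The integer, float and `null` rules are likewise assumptions, since the profile has not been specified yet.

```python
import re

# Hypothetical sketch; none of these names belong to an existing library.
_INT = re.compile(r"[-+]?[0-9]+")
_FLOAT = re.compile(r"[-+]?[0-9]+\.[0-9]+(?:[eE][-+]?[0-9]+)?")


def resolve_scalar(text):
    """Map one scalar to a JSON-compatible Python value."""
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return text[1:-1]  # quoted string (escape handling omitted)
    if text == "true":     # "one true way": only true|false are booleans
        return True
    if text == "false":
        return False
    if text == "null":     # assumption: a single spelling for null
        return None
    if _INT.fullmatch(text):
        return int(text)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text            # anything else is an unquoted string


def load_flat(document, on_unknown):
    """Parse flat `key: value` lines, skipping and reporting unknown entries."""
    result = {}
    for lineno, line in enumerate(document.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        key, sep, value = line.partition(":")
        value = value.strip()
        # Assumption: explicitly tagged values (and lines this sketch cannot
        # split into key and value) count as unknown entries.
        if not sep or value.startswith("!"):
            on_unknown(lineno, line)
            continue
        result[key.strip()] = resolve_scalar(value)
    return result
```

With this sketch, `legacy: yes` stays the string `"yes"` rather than becoming a boolean, and a line such as `blob: !!binary aGVsbG8=` is left out of the result and passed to `on_unknown` together with its line number, so the parent program can decide what to do about it.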

If such a parser has the ability to convert to JSON (this should not be mandatory, but should be encouraged), it should follow the recommendation in this page about conversion from YAML to JSON and allow for two modes: one for lossless conversion of datatypes from YAML to JSON, and another for cleaner, more compact output (at the cost of stripping out comments and unknown datatypes).
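The two modes could look something like the following Python sketch. Everything here is an assumption made for illustration: the `Entry` record, the `to_json` function, and in particular the `{"value": ..., "tag": ..., "comment": ...}` wrapper used in lossless mode are hypothetical and are not the conversion recommendation itself.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entry:
    """One parsed entry, as a hypothetical eYAML parser might hand it over."""
    key: str
    value: Any = None
    tag: Optional[str] = None      # set when the datatype was not recognised
    comment: Optional[str] = None  # trailing comment, if any


def to_json(entries, lossless):
    """Convert entries to JSON text in lossless or compact mode."""
    out = {}
    for e in entries:
        if lossless:
            # Keep everything: wrap the value so tag and comment survive.
            node = {"value": e.value}
            if e.tag is not None:
                node["tag"] = e.tag
            if e.comment is not None:
                node["comment"] = e.comment
            out[e.key] = node
        elif e.tag is None:
            # Compact mode: plain values only; comments and unknown
            # datatypes are dropped.
            out[e.key] = e.value
    if lossless:
        return json.dumps(out, indent=2)
    return json.dumps(out, separators=(",", ":"))
```

The compact mode gives the output most programs want to consume, while the lossless mode keeps enough information to write the original YAML back out later.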

tl;dr: smaller parser that covers the most important 20% of use cases

Let's aim to keep this one no more than 2 or 3 pages longer than the JSON spec. It needs to be easy to develop for, so that it can be the lowest common denominator YAML.

core YAML (cYAML)

cYAML is a stripped-down YAML parser that supports at least the same datatypes that exist within JSON.

Unlike eYAML, which emphasises the most minimal syntax, core YAML covers the majority of use cases. It should, however, be extendable to include as many advanced YAML features as required, via a plugin system.

E.g. for the boolean type described at http://yaml.org/type/bool.html:

  • eYAML : regexp "true|false"
  • cYAML : regexp "y|Y|yes|Yes|YES|n|N|no|No|NO|true|True|TRUE|false|False|FALSE|on|On|ON|off|Off|OFF"
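To make the difference concrete, the two regular expressions above can be tried out with a small Python sketch. The names `EYAML_BOOL`, `CYAML_BOOL` and `parse_bool` are hypothetical, and mapping the y/yes/true/on spellings to true is an assumption based on the usual reading of the bool type page.

```python
import re

# The two patterns quoted above, one per profile.
EYAML_BOOL = re.compile(r"true|false")
CYAML_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def parse_bool(text, pattern):
    """Return True or False if `text` is a boolean under `pattern`, else None."""
    # fullmatch: the whole scalar must match, not just a prefix of it.
    if not pattern.fullmatch(text):
        return None
    return text.lower() in {"y", "yes", "true", "on"}
```

Under eYAML, `parse_bool("Yes", EYAML_BOOL)` gives `None`, so `Yes` would be treated as a plain string, while `parse_bool("Yes", CYAML_BOOL)` gives `True`. The eYAML check is a two-word comparison, which is the point of the "one true way" approach.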

tl;dr: parser that covers the most important 80% of use cases.

standard YAML (sYAML)

Well... this is just the normal YAML specification, only with a nicer branding name.

Every other profile is a subset of this one.

tl;dr: complete parser that covers 100% of use cases

Any other profiles?

Embedded YAML (eYAML) is the lowest common denominator for parsing extremely simple YAML files.

But perhaps there should be other profiles that exist between eYAML and the full standard YAML.

For now... let's just stick to eYAML, and add more to the mix later.