EEF - hpgDesigns/hpgdesigns-dev.io GitHub Wiki
ENIGMA's Extensible Enumeration Format (EEF) is a format created to allow plain-text storage of a variety of data in a linear fashion, without compromising the ability to extend the format later without invalidating outdated readers.
Specification
Structure
The format is based on newlines with length-prefixing. Information stored in this format is organized into Data Blocks, which assign definite length patterns.
Comments
EEF comments are defined to commence with any single consistent combination of the following symbols: `#` `;` `%` `/` `'`, and to be terminated by the end of the line. A file declares which comment combination it uses by beginning with those symbols as the very first characters in the file; if none are present, the file is assumed to have no comments. The remainder of the file must consistently use that comment combination. This is to allow for the storage of multi-lingual code snippets without alienating generic EEF readers as a parse option.
A valid comment symbol combination can be as simple as `;` (as in Assembly), `#` (as in bash) or `//` (as in C), or can be as complex as `/%#%/` (for the express purpose of not conflicting with the aforementioned languages). Remember, however, that in the latter case the comment begins with the full `/%#%/` and ends with a newline. Multi-line comments are not available; instead, simply precede each comment line with the comment sequence.
As of version one of the EEF specification, only one comment symbol pattern can be defined per file. This restriction may be revised later.
After the first non-comment line in the file, no further lines can be left entirely commented unless said lines would be otherwise blank. Hence, these lines will be exclusively attribute lines: a comment cannot appear as the complete line anywhere that a reader would expect to find a Data Block indicator. This is done to promote ease of skipping unknown Data Blocks.
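The comment rules above can be sketched as follows. This is a minimal illustration in Python, not part of the specification: the function names `detect_comment_prefix` and `strip_comment` are hypothetical, and the sketch assumes the comment sequence never appears inside a quoted attribute.

```python
# Symbols the specification allows in a comment combination.
COMMENT_SYMBOLS = set("#;%/'")

def detect_comment_prefix(text):
    """Return the leading run of comment symbols, or '' if the file declares none."""
    prefix = []
    for ch in text:
        if ch not in COMMENT_SYMBOLS:
            break
        prefix.append(ch)
    return "".join(prefix)

def strip_comment(line, prefix):
    """Cut the line at the first occurrence of the comment sequence.

    Simplification: quoted text is not inspected, so a comment sequence
    inside quotes would also be treated as the start of a comment.
    """
    if not prefix:
        return line
    idx = line.find(prefix)
    return line if idx < 0 else line[:idx].rstrip()
```

A reader would call `detect_comment_prefix` once on the whole file and then pass the result to `strip_comment` for every line.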
Quotes
The EEF specification defines quotes for use in Small Attribute listing. The double quote symbol (`"`) or single quote symbol (`'`) may be used, except where the latter is defined as a Comment pattern. The escape sequences `\\`, `\"` and `\'` are defined; all others are to be handled application-side.
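One possible way to apply these rules is sketched below in Python. The helper name `unquote` is an assumption made for illustration; it resolves only the three escape sequences the specification defines and leaves every other backslash sequence untouched for the application.

```python
def unquote(token):
    """Remove surrounding quotes and resolve the escapes \\\\, \\" and \\'.

    Tokens that are not wrapped in a matching pair of quotes are returned
    unchanged. Any other escape (for example \\n) is passed through as-is,
    since the specification leaves it to the application.
    """
    if len(token) < 2 or token[0] not in "\"'" or token[-1] != token[0]:
        return token
    body = token[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body) and body[i + 1] in "\\\"'":
            out.append(body[i + 1])  # keep the escaped character only
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```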
Indentation
Indentation is optional, and should be done as-desired to make the format visually pleasing and computationally simple. Generally, where unimportant, these indentation characters will be discarded by readers. Many indentation characters are supported, including tab ('\t'), space (' '), and any combination thereof.
Data Blocks
Similar to binary data chunks in the PNG specification, EEF Data Blocks are multi-line sections of text defined to contain at most one of the following:
- A set of {N} subsequent Data Entries.
- A set of [N] Attribute Lines.
Data Block Indicator
These blocks are set off by a single line which names the Data Type. The format for a data block indicator is as follows:
Type Name (plural) { Number of contained Data Blocks }
The data block indicator may appear on its own at the root of the file or after any Data Entry Indicators.
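A reader could recognize such a line with a small regular expression, as in the Python sketch below. The name `parse_block_indicator` is hypothetical, and the sketch assumes that the Type Name is a single token without whitespace or bracket characters and that comments have already been removed from the line.

```python
import re

# Plural Type Name followed by a count in curly braces, e.g. "Items{3}".
BLOCK_RE = re.compile(r"^\s*([^\s{}()\[\]:]+)\s*\{\s*(\d+)\s*\}\s*$")

def parse_block_indicator(line):
    """Return (type_name, count) for a Data Block indicator, else None."""
    m = BLOCK_RE.match(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2))
```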
Data Entries
Data entries are typically members of data blocks, but are permitted to appear at the root of the file if no higher level is required. Data entries define an instance of their Data
Data Entry Indicator
Data entry indicators take the following form:
Type Name ( primary id ) : Small Attributes or [ Number of additional Line Attributes ]
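The Python sketch below splits a Data Entry Indicator into its Type Name, its optional primary ID and the remaining attribute text. It is an illustration under stated assumptions: `parse_entry_indicator` is a hypothetical name, the primary ID is assumed not to contain a closing parenthesis followed by a colon, and comments are assumed to have been stripped already.

```python
import re

# Singular Type Name, optional "(primary id)", a colon, then the attributes.
ENTRY_RE = re.compile(
    r"^\s*(?P<type>[^\s()\[\]{}:]+)\s*(?:\((?P<id>.*?)\))?\s*:\s*(?P<rest>.*)$"
)

def parse_entry_indicator(line):
    """Return (type_name, primary_id, attribute_text), or None if no match.

    primary_id is None when the entry has no parenthesized ID; it is
    returned verbatim, including any quotes around it.
    """
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    return m.group("type"), m.group("id"), m.group("rest")
```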
Attributes
EEF specifies two methods of associating attributes with each Data Block. Small data is typically stored in Small Attributes, while larger snippets of data are better suited to storage in Line Attributes.
Small Attributes
Small Attributes are those which appear after a Data Entry Indicator; they can be completely scalar, or can be given a value using a pair of parentheses. For example, take this code:
Examples{1}
Example: "yellow" Shape(Round)
Our Example entry is declared with two attributes. The first is the valueless attribute "yellow", which speaks for itself. The second is a Shape attribute, which has been set to the value "Round". Quotes may appear around any attribute, and are completely optional except where otherwise ambiguous (for instance, the attribute "lime green" must be quoted even though "yellow" may be left alone). Since attribute values are surrounded by parentheses, quotes are generally unnecessary in them.
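A tokenizer for the attribute text after the colon might look like the Python sketch below. The name `parse_small_attributes` and the `(name, value)` tuple layout are assumptions made for this example. Quotes around an attribute name are removed without resolving escapes, values are assumed not to contain a closing parenthesis, and any token carrying a `[N]` suffix is skipped because it declares Line Attributes rather than a Small Attribute.

```python
import re

QUOTED = r'"(?:\\.|[^"\\])*"' + "|" + r"'(?:\\.|[^'\\])*'"
BARE = r"""[^\s()\[\]"']+"""
# A name (quoted or bare), optionally followed by "(value)" or "[line count]".
TOKEN_RE = re.compile(
    r"\s*(?P<name>" + QUOTED + "|" + BARE + r")"
    r"(?:\((?P<value>[^)]*)\)|\[(?P<lines>\d+)\])?"
)

def parse_small_attributes(text):
    """Return a list of (name, value) pairs; value is None when valueless."""
    text = text.strip()
    attrs = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("cannot parse attributes at: %r" % text[pos:])
        pos = m.end()
        if m.group("lines") is not None:
            continue  # Line Attribute declaration, not a Small Attribute
        name = m.group("name")
        if name[0] in "\"'":
            name = name[1:-1]  # drop the surrounding quotes only
        attrs.append((name, m.group("value")))
    return attrs
```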
Line Attributes
Line Attributes follow any Data Entries which define a non-zero count inside a pair of square brackets []. Line Attributes are Attributes which are given an entire line to themselves. They may, semantically, instead be a single very large attribute which spans multiple lines, but this is up to the application to determine; this spec only mandates that the number of lines used is given up front for the purposes of skipping or isolating.
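Because the line count is given up front, a reader can isolate the Line Attributes of an entry without understanding them. The Python sketch below shows one way to do that; `claimed_line_count` and `split_entry` are hypothetical names, comments are assumed to be stripped beforehand, and summing the counts when an entry carries more than one `[N]` is an assumption of this sketch rather than something the specification states.

```python
import re

LINE_COUNT_RE = re.compile(r"\[(\d+)\]")

def claimed_line_count(entry_line):
    """Total number of following lines claimed by the [N] counts on a line."""
    return sum(int(n) for n in LINE_COUNT_RE.findall(entry_line))

def split_entry(lines, index):
    """Return (line_attributes, next_index) for the entry at lines[index]."""
    count = claimed_line_count(lines[index])
    body = lines[index + 1 : index + 1 + count]
    if len(body) < count:
        raise ValueError("file ends before all claimed lines were read")
    return body, index + 1 + count
```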
Type Name
The name of the members of the Data Block. When used in the Data Block Indicator, it should be pluralized. When used in a Data Entry Indicator, it should be singular. The difference between the singular and plural form shall be implementation-specific. By convention, the reader must warn if the plural string doesn't match the singular string according to some regular expression generated from the singular string. This generated regular expression can be anything from `/`*singular*`e?s?/`, where *singular* stands for the singular string, to `/.*/`. Since this requirement is purely conventional, no check need actually be made by the implementation for the latter expression.
More advanced plurality checks can be accomplished loosely by starting with the expression `/`*singular*`e?s?/` and applying any expressions in the following table to the *singular* string:
Find | Replace with | To support |
---|---|---|
y | [yi] | Property, Entry, Assembly... |
f | [fv] | Leaf, Hoof, Elf, Self, Knife... |
ex | (ex\|ic) | Vertex, Index |
ix | (ix\|ic) | Matrix, Appendix |
ous | (ous\|ic) | Mouse, Louse |
oo | (oo\|ee) | Goose, Tooth, Foot |
on$ | (on\|a) | Criterion, Phenomenon |
um$ | (um\|a) | Bacterium, Curriculum, Medium, Datum |
is$ | (is)? | Axis, Analysis, Thesis |
us$ | (us\|i\|.ra) | Cactus, Fungus, Octopus, Alumnus, Succubus... Genus, Corpus, Viscus |
an$ | [ae]n | Man, Woman |
$ | (r?en)? | Children, Oxen |
$ | x? | Beau, Bureau, Tableau |
The only plural not covered by the entirety of the above table is "person" -> "people".
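The table can be turned into a generated expression roughly as in the Python sketch below. The names `plural_pattern` and `plural_matches` are hypothetical. Two choices here are assumptions of the sketch rather than requirements of the specification: all Find patterns are applied in a single left-to-right pass so that inserted replacement text is never re-scanned, and both `$` rules are appended immediately before the base `e?s?` suffix. Matching is case-sensitive.

```python
import re

# (find, replace) pairs from the table; the two bare "$" rules are handled
# by the fixed suffix appended in plural_pattern.
RULES = [
    ("ex", "(ex|ic)"), ("ix", "(ix|ic)"), ("ous", "(ous|ic)"),
    ("oo", "(oo|ee)"), ("on$", "(on|a)"), ("um$", "(um|a)"),
    ("is$", "(is)?"), ("us$", "(us|i|.ra)"), ("an$", "[ae]n"),
    ("y", "[yi]"), ("f", "[fv]"),
]
FIND_RE = re.compile("|".join("(%s)" % find for find, _ in RULES))

def plural_pattern(singular):
    """Build a loose plural-matching expression from a singular Type Name."""
    parts = []
    last = 0
    for m in FIND_RE.finditer(singular):
        parts.append(re.escape(singular[last:m.start()]))
        parts.append(RULES[m.lastindex - 1][1])  # group index selects the rule
        last = m.end()
    parts.append(re.escape(singular[last:]))
    return "".join(parts) + "(r?en)?x?e?s?"

def plural_matches(singular, plural):
    """True if the plural string fits the expression built from singular."""
    return re.fullmatch(plural_pattern(singular), plural) is not None
```

A reader following the convention would emit a warning, not an error, when `plural_matches` returns false for the block's plural name and an entry's singular name.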
Primary ID
Each Data Entry should give itself a name or ID, or some primary data affiliated with that particular entry. It does not necessarily have to be unique, or even scalar. The primary ID is simply a place to put the most important information about the entry.
Root
Any line which lies outside the scope of lines claimed by any of the above is considered to be at the root of the file.
Length-prefixing
In other extensible formats, such as PNG, length-prefixed chunks and chunk-IDs (or similar devices) are used to allow various readers to ignore sets of data they are incapable of reading. In EEF, different brackets are used to indicate various lengths.
- Curly braces {} are used to indicate a number of repetitions of Data Blocks. The data that is repeated can be multiple lines, and the number of additional lines can vary. See the next point.
- Square brackets [] are used in Data Entries to indicate that a fixed number of lines follows which describe or belong to the current data block.
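The two kinds of length prefix combine to let a reader step over a Data Block it does not understand. The Python sketch below illustrates this under simplifying assumptions: `skip_block` is a hypothetical name, comments and blank lines are assumed to be removed beforehand, blocks are assumed not to be nested inside entries, and the `[N]` counts on an entry line are summed.

```python
import re

BLOCK_RE = re.compile(r"^\s*\S+?\s*\{\s*(\d+)\s*\}\s*$")
LINES_RE = re.compile(r"\[(\d+)\]")

def skip_block(lines, index):
    """Return the index of the first line after the Data Block at lines[index]."""
    m = BLOCK_RE.match(lines[index])
    if m is None:
        raise ValueError("not a Data Block indicator: %r" % lines[index])
    index += 1
    for _ in range(int(m.group(1))):  # {N}: number of entries in the block
        claimed = sum(int(n) for n in LINES_RE.findall(lines[index]))
        index += 1 + claimed          # the entry line plus its [N] lines
    return index
```

Note that the reader never inspects the Type Name or the attribute text here; the counts alone are enough to find where the block ends.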
Example File
## Sample EEF file
Items{3} ## The {} tells us we'll be reading 3 fields of varying size before this block is over.
Item("Sword"): "Weapon" Damage(4) "Magic" Text[2] ## Note the use of [] to indicate lines we can skip if we don't know what this is.
A magical sword
Given to you by an elf or something.
Item("Pen"): "Weapon" Damage(.5) Text[2]
An ordinary pen
Just a regular pen. Doesn't do much damage, but it's still mightier than the sword, right?
Item("Apple"): "Food" Health(10) Text[2]
A delicious apple
Probably came from a tree, or something. Who knows. Restores 10HP and doesn't afraid of anything.
Collectables{1} ## Now that all three Items have been defined, we can end the file or declare new. We'll do the latter.
Collectable("Watch fob"): Text[2]
A beautiful golden watch fob
It has absolutely no use.