Folded Block Scalars - yaml/YAML2 GitHub Wiki
Folded Block Scalars should be removed from the language. They offer almost no abilities not offered by the other forms, and yet are hardly ever implemented correctly. Folded scalars have a lot of edge cases.
folded: >
  This content is
  folded and has trailing newline.
quoted:
  "This content is
  folded and has trailing newline.\n"
NOTE: We should write some tests to see which implementations get this right.
This:
x: >
  foo
  bar
  baz
Produces:
{"x": "foo bar baz\n"}
This:
x: >
  foo
   bar
  baz
Produces:
{"x": "foo\n bar\nbaz\n"}
That is probably too precise. It uses a wiki-ish syntaxism that doesn't belong in YAML. No emitter would produce it. And no human would remember how it works. So it is not useful.
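To make the two results above concrete, here is a minimal Python sketch of the folding rule as it applies to those examples. It is not taken from any YAML implementation: the function name `fold_block_scalar` is hypothetical, it assumes the block indentation has already been stripped from each line, it only models the default (clip) chomping, and it treats whitespace-only lines as empty.

```python
def fold_block_scalar(lines):
    """Fold the content lines of a `>` block scalar (default clip chomping).

    `lines` are the content lines with the block indentation already removed.
    Simplification: whitespace-only lines are treated as empty lines.
    """
    lines = list(lines)
    # Clip chomping: drop trailing blank lines, keep exactly one final newline.
    while lines and lines[-1].strip() == "":
        lines.pop()
    if not lines:
        return ""

    out = []
    blanks = 0         # blank lines seen since the last non-blank line
    prev_plain = None  # None until the first non-blank line has been seen
    for line in lines:
        if line.strip() == "":
            blanks += 1
            continue
        # A "more-indented" line starts with extra white space.
        plain = line[0] not in " \t"
        if prev_plain is None:
            # Leading blank lines are kept as newlines.
            out.append("\n" * blanks)
        elif prev_plain and plain:
            # Two ordinary text lines: a single break folds to a space,
            # otherwise each blank line in between becomes one newline.
            out.append("\n" * blanks if blanks else " ")
        else:
            # Next to a more-indented line, every line break is preserved.
            out.append("\n" * (blanks + 1))
        out.append(line)
        blanks = 0
        prev_plain = plain
    return "".join(out) + "\n"


print(repr(fold_block_scalar(["foo", "bar", "baz"])))
print(repr(fold_block_scalar(["foo", " bar", "baz"])))
```

The only difference between the two calls is the single extra space in front of `bar`, which is exactly the kind of detail that is easy to miss when reading or writing a folded scalar.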
The only thing that the folded form offers us over quoted folding is a trailing newline.
Currently:
x: "foo
  "
y: "foo "
produce the same values. There is probably no usage of the first form. So we should make this work:
x: "foo
  "
y: "foo\n"
z: >
  foo
all produce the same value.
Further, these two are currently the same:
x: "
  foo"
y: " foo"
You would never see the first in real life. So we make it:
x: "
  foo"
y: "foo"
Then we can make the following work:
x1: "
  foo"
y1: "foo"
z1: >-
  foo
x2: "
  foo
  "
y2: "foo\n"
z2: >
  foo
This means that when you have a folded paragraph, you don't need to put the first line on the same line as the key (to avoid the extra space).
folded paragraph: "
  This means that when you have a folded paragraph, you don't need to put
  the first line on the same line as the key (to avoid the extra space)."
Looks great! We have completely obviated any usefulness of the folded scalar form.
Getting rid of the folded form, which nobody really understands, will be a good move for YAML, whose detractors think it is too complicated.
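The change proposed above for double-quoted scalars can be sketched in Python as follows. This is only an illustration under stated assumptions: `fold_double_quoted` is a hypothetical name, `raw` is the text between the quotes with real line breaks in it, escape sequences are not processed at all, and the `proposed` flag switches between the current behaviour (a line break next to either quote folds to a space) and the suggested one (a break right after the opening quote is dropped, a break right before the closing quote becomes a newline).

```python
def fold_double_quoted(raw, proposed=False):
    """Fold the raw text found between the quotes of a double-quoted scalar.

    Simplification: escape sequences are left untouched.
    """
    lines = raw.split("\n")
    # White space around line breaks is discarded; white space at the two
    # ends of the scalar (next to the quotes) is kept.
    for i in range(len(lines)):
        if i > 0:
            lines[i] = lines[i].lstrip(" \t")
        if i < len(lines) - 1:
            lines[i] = lines[i].rstrip(" \t")

    suffix = ""
    if proposed:
        # Proposed: a line break right after the opening quote is ignored.
        if len(lines) > 1 and lines[0] == "":
            lines = lines[1:]
        # Proposed: a line break right before the closing quote is a newline.
        if len(lines) > 1 and lines[-1] == "":
            lines = lines[:-1]
            suffix = "\n"

    result = lines[0]
    blanks = 0  # empty lines seen since the last non-empty line
    for i, line in enumerate(lines[1:], start=1):
        if line == "" and i < len(lines) - 1:
            blanks += 1
            continue
        # A single break folds to a space; empty lines become newlines.
        result += ("\n" * blanks if blanks else " ") + line
        blanks = 0
    return result + suffix


print(repr(fold_double_quoted("foo\n  ")))                  # current rule
print(repr(fold_double_quoted("foo\n  ", proposed=True)))   # proposed rule
```

With `proposed=True`, the `x1`/`x2` forms above yield the same values as `y1`/`y2`, which is the equivalence the proposal relies on.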
Instead of using | to indicate a newline-preserved block scalar and > for a folded block scalar, let's use :: to indicate block mode, then modifiers to change its behaviour. (The default behaviour should be "newline preserved", as that is what most people would expect.)
PSEUDO-CODE::
IF `::` THEN // work out which block mode
  //// Let's avoid too much complexity in parsing logic...
  //IF `'` THEN explicitly folded block mode (with automatic indent level detection)
  //IF `"` THEN explicitly newline preserved (with automatic indent level detection)
  IF ( `"` OR (`\n` then ASCII) ) THEN // detects newline-preserved1
    implied/explicitly newline preserved block mode ( indent level autodetected )
  IF ( `\n` then `"`) THEN // detects newline-preserved2
    explicitly newline preserved block mode ( indent level specified )
  IF ( `'` ) THEN // detects folded-block1
    implied folded block mode ( indent level autodetected )
  IF ( `\n` then `'`) THEN // detects folded-block2
    explicitly folded block mode ( indent level specified )
  // More experimental proposal
  IF ( Not (space or `\n` or number) right after `::` (This is your "boundary string") ) THEN // detects NEWLINEPRESERVED-experimental1
    read the word after `::` (e.g. `::frontier` would yield boundary=frontier ) into boundary variable
    // This functions similarly to https://en.wikipedia.org/wiki/MIME#Multipart_messages boundary=frontier
    explicitly newline preserved, but keep reading at any indent level
    Ignore the first line if it matches '::<var boundary>' (It's optional for setting indent level)
    (even if it's below parent indent level e.g. indent level 0 )
    Keep reading in until a matching '::<var boundary>' number of characters (or more) on its own line is detected at the right indent level,
    Or end of document
    ( For practicality, it)
  IF ( NUMBER ) THEN // detects NEWLINEPRESERVED-experimental2
    read in a specific number of lines as specified by NUMBER
    good for immutable records. Has speed advantage over the more flexible option above.
ELSEIF `:` THEN
  Might be something else! Keep parsing
NEWLINEPRESERVED:
newline-preserved1::
This is the default behaviour
where it will save all newlines
newline-preserved1-alt::"
Same behaviour
as the above
newline-preserved2::
"
This allows for beginning spaces
to be preserved
FOLDEDBLOCK:
folded-block1::'
This is a folded block
might as well treat it like this
as it is easier to deal with
folded-block2::
'
This is also a folded block,
fortunately, since newline is ignored
the same parsing logic will work for both
folded-block1 and folded-block2
NEWLINEPRESERVED-experimental1:
data:text/html::______________________________________________________________
<html>
This is for preserving newlines, where the source is all the way at the bottom
this make it easier to copy paste codes.
Also nicer for QR codes too.
</html>
::____________________________________________________________________________
data:text/html::FRONTIER
::FRONTIER START
<html>
This is for preserving newlines, where the source is all the way at the bottom
this make it easier to copy paste codes.
Also nicer for QR codes too.
</html>
::FRONTIER END
data:text/html::FRONTIER
::FRONTIER START
<html>
This is for preserving newlines, where the source is all the way at the bottom
this make it easier to copy paste codes.
Also nicer for QR codes too.
</html>
::FRONTIER END
otherData: 42
NEWLINEPRESERVED-experimental2:
data:text/html::4
<html>
This is for preserving newlines, where the source is all the way at the bottom
this make it easier to copy paste codes.
</html>
otherData: 42
otherDat2: lol
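The fixed line count mode shown above is simple enough to sketch. The following Python is a rough illustration, not part of the proposal: `parse_counted_blocks` and `COUNT_HEADER` are hypothetical names, ordinary `key: value` lines are kept as plain strings, and it assumes the counted lines are taken verbatim, each terminated by a newline.

```python
import re

# A header such as `data:text/html::4`: a key, `::`, then a line count.
COUNT_HEADER = re.compile(r"^(?P<key>.+)::(?P<count>\d+)\s*$")


def parse_counted_blocks(text):
    """Parse a tiny document made of `key::N` blocks and `key: value` lines."""
    lines = text.split("\n")
    result = {}
    i = 0
    while i < len(lines):
        line = lines[i]
        header = COUNT_HEADER.match(line)
        if header:
            count = int(header.group("count"))
            # Take exactly `count` lines verbatim; they are never inspected,
            # so they may contain colons or anything else.
            body = lines[i + 1 : i + 1 + count]
            result[header.group("key")] = "".join(l + "\n" for l in body)
            i += 1 + count
        elif ": " in line:
            key, value = line.split(": ", 1)
            result[key] = value
            i += 1
        else:
            i += 1  # blank or unrecognised line: skip it in this sketch
    return result


doc = "data:text/html::4\n<html>\nkey: not parsed\nsecond line\n</html>\notherData: 42\notherDat2: lol"
print(parse_counted_blocks(doc))
```

Because the reader only counts lines, the body is never scanned for a terminator, which is where the speed advantage claimed in the pseudo-code would come from.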
NEWLINEPRESERVED-experimental3:
data:text/html:::
<html>
This is for preserving newlines, where the source is all the way at the bottom
this make it easier to copy paste codes.
</html>
//END OF DOCUMENT SIGNAL HERE
At first read, your conclusion seems like a nice one. But then that trailing quote strikes me as out of place relative to the first, and I start to wonder whether we really gained anything in this trade. I also want to ask how this picture might change if YAML adopts the literal format as the default format.