Shell Almost Has a JSON Analogue


JSON is a serialization format defined as a subset of JavaScript syntax. By analogy, a serialization format for shell can be defined as a subset of shell syntax.

Since shell essentially only has strings, this is just a format that serializes strings in shell.

Let's introduce two requirements:

  • It shouldn't be line-based, because then you can't store a newline in a string. We want to stream multiple instances of this format over a pipe.
  • It shouldn't require invoking child processes. Otherwise base64 would work. (JSON itself doesn't quite work because it can't represent arbitrary byte strings. See TSV2 Proposal.)

Summary

Serialize with

printf '%q' "$mystring"

or, in bash 4.4 and later:

echo "${mystring@Q}"

But %q doesn't appear to be POSIX. printf is required by POSIX, but the specification doesn't mention %q.

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
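As a minimal sketch of the serialization step, assuming bash (the sample string and the variable name encoded are illustrative and not from the original), printf %q turns a string that contains a newline into an encoding with no literal newline in it. That property is what would let several records be streamed over a pipe, one per line:

```bash
# Assumes bash; printf %q is not available in POSIX sh or dash.
mystring=$'two\nlines and a \'quote\''

# %q produces a shell-quoted encoding of the string.
encoded=$(printf '%q' "$mystring")
printf '%s\n' "$encoded"

# Check that the encoding contains no literal newline.
case $encoded in
  *$'\n'*) echo 'encoding contains a newline' ;;
  *)       echo 'encoding is a single line' ;;
esac
```

The exact quoting style differs between shells, so the encoded text should not be assumed to be byte-for-byte identical across bash and zsh.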

Parse with:

read -r untrusted_data < "$myfile"
printf -v myvar '%b' "$untrusted_data"  # This should just create a string and not evaluate arbitrary code.

2024 Update: This does not work, because %q and %b are not inverses. %b respects \n like echo -e, but it doesn't respect even basic shell quoting like \'.
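The mismatch can be checked with a short round trip. This is a sketch that assumes bash, since printf %q and printf -v are extensions; the sample string and variable names are illustrative:

```bash
# Assumes bash (printf %q and printf -v are not POSIX).
original="it's here"

encoded=$(printf '%q' "$original")   # shell-quoted form of the string
printf -v decoded '%b' "$encoded"    # %b only expands echo-style backslash escapes

printf 'original: %s\nencoded:  %s\ndecoded:  %s\n' "$original" "$encoded" "$decoded"

if [ "$decoded" = "$original" ]; then
  echo 'round trip succeeded'
else
  echo 'round trip failed: %b is not the inverse of %q'
fi
```

The backslashes that %q adds in front of the quote and the space are not escapes that %b understands, so they survive into the decoded value.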

printf -v is obviously not

Dynamic assignment is another alternative:

declare a="myvar=$untrusted_data"
declare "$a"

Although I'm not sure this works in bash, because () might be special-cased for arrays. There is probably a hole for code execution, whereas I don't expect that for %b.
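To make the dynamic assignment concrete, here is a small sketch that assumes bash and uses a hypothetical payload. It only shows that, for this particular input, the text after myvar= is stored verbatim rather than executed. It does not show that the approach is safe in general, and it does not exercise the () array case mentioned above:

```bash
# Assumes bash. The payload is a hypothetical stand-in for untrusted input.
untrusted_data='$(echo pwned) `echo pwned`'

# Because the argument is double-quoted, declare receives a single word
# and stores everything after "myvar=" without expanding it again.
declare "myvar=$untrusted_data"

printf '%s\n' "$myvar"
```

Note that this only copies the string. It does not undo %q quoting, so on its own it is not a parser for the serialized form.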

Example

printf %q doesn't work in dash, but it works in bash and zsh, as expected.

Caveat: NUL characters are not representable in shell strings! OSH will fix this. And TSV2 will also be able to represent NUL.

$ echo $'a\x01b' | bash -c 'read x; printf %q "$x"' 
$'a\001b'$ 

$ echo $'a\x01b' | zsh -c 'read x; printf %q "$x"' 
a$'\001'b$ 

$ echo $'a\x01b' | dash -c 'read x; printf %q "$x"' 
dash: 1: printf: %q: invalid directive
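
Since support differs between shells, a script could probe for %q before relying on it. The following is a sketch: the variable name have_printf_q is hypothetical, and it assumes that an unsupported directive makes printf return a nonzero status rather than abort the shell:

```bash
# Probe whether this shell's printf understands %q.
# Output and error messages are discarded; only the exit status matters.
if printf '%q' 'probe' >/dev/null 2>&1; then
  have_printf_q=yes
else
  have_printf_q=no
fi
echo "printf %q supported: $have_printf_q"
```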

Lesson for Oil

Shell needs a serialization format!!!

Update: This is now QSN.