Internals: the interpreter - troyp/jq GitHub Wiki

The jq interpreter is stack-based.

Expressions mostly consist of either builtin opcodes (or sequences thereof) or calls to jq functions. The interpreter also implements generators and backtracking. This page mostly describes the function call machinery at this point.

Important things to understand about the jq language before deep-diving into the interpreter:

  • all function arguments are closures
  • all closures are lexical closures of dynamic extent
  • closure references cannot be stored anywhere
  • closure references can only be used to either a) call them, or b) construct new closures for other function calls

For example, foo(bar) is a function call with one argument closure (bar). The callee (foo) receives as its argument the same closure value that bar resolved to in the caller.

In foo(bar + 1), by contrast, the passed closure corresponds to the expression bar + 1, which in turn captures bar.

Each closure is represented as a pair of (closed frame reference, bytecode). In the bytecode, closed frame references are relative to the current frame (working backwards from the current frame via its env pointer). The interpreter converts those relative frame references to stack addresses at function call time (i.e., in the handling of the CALL_JQ instruction).
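The conversion from a relative frame reference to a stack address can be illustrated with a small model. This is only a sketch under simplifying assumptions, not the code in execute.c: frames live in a plain array, a "stack address" is an array index, and every name prefixed with toy_ or TOY_ is hypothetical.

```c
#include <assert.h>

/* Hypothetical, simplified model: a "stack address" is an index into
   a fixed array of frames. The real interpreter's stack differs. */
enum { TOY_MAX_FRAMES = 16, TOY_NO_FRAME = -1 };

struct toy_bytecode { const char *name; };

/* A closure pairs the closed frame's address with the code to run. */
struct toy_closure {
    int env;                         /* absolute address of the closed frame */
    const struct toy_bytecode *bc;   /* bytecode of the closure body */
};

struct toy_frame {
    int env;                         /* enclosing frame, or TOY_NO_FRAME */
    const struct toy_bytecode *bc;   /* bytecode this frame is executing */
};

struct toy_stack {
    struct toy_frame frames[TOY_MAX_FRAMES];
    int current;                     /* address of the current frame */
};

/* Turn a relative reference ("level" env hops back from the current
   frame) into an absolute frame address. */
static int toy_frame_get_level(const struct toy_stack *s, int level) {
    int addr = s->current;
    assert(level >= 0);
    for (int i = 0; i < level; i++) {
        assert(addr != TOY_NO_FRAME);
        addr = s->frames[addr].env;
    }
    assert(addr != TOY_NO_FRAME);
    return addr;
}

/* Build a closure at call time: the relative reference found in the
   bytecode is resolved once and stored as an absolute address. */
static struct toy_closure toy_make_closure(const struct toy_stack *s,
                                           int level,
                                           const struct toy_bytecode *bc) {
    struct toy_closure c;
    c.env = toy_frame_get_level(s, level);
    c.bc = bc;
    return c;
}
```

Resolving the reference once, at call time, means the callee can use the closure without knowing how many frames separate it from the frame the closure closes over.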

There are conventions for pushing different kinds of things on the stack. These conventions allow the jq interpreter to unwind and clean up without having to execute all return paths.

A function call in the bytecode consists of a CALL_JQ opcode followed by nclosures pairs, each made up of a relative frame reference and an index to either a code or a closure reference in the closed frame (see the block comments about this in execute.c). If the index refers to a closure in the closed frame, then that closure is used as is. If it refers to code, a new closure is constructed from that code and the closed frame.
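As a rough illustration of how such operand pairs might be resolved, the sketch below models the two cases: reusing a closure that already exists in the closed frame (as in foo(bar)), and building a new one from code (as in foo(bar + 1)). It is a toy model, not the interpreter's implementation. All toy_ names are hypothetical, and the TOY_NEW_CLOSURE flag bit used to tell the two kinds of index apart is an assumed encoding chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants; the flag bit is an assumed encoding. */
enum { TOY_MAX_SLOTS = 4, TOY_NO_FRAME = -1, TOY_NEW_CLOSURE = 0x1000 };

struct toy_code { int id; };

struct toy_closure {
    int env;                       /* absolute address of the closed frame */
    const struct toy_code *code;   /* body of the closure */
};

struct toy_frame {
    int env;                                  /* enclosing frame address */
    const struct toy_code **subfunctions;     /* code defined by this frame's function */
    struct toy_closure params[TOY_MAX_SLOTS]; /* closures this frame received */
};

/* Follow `level` env links back from the frame at `current`. */
static int toy_walk_env(const struct toy_frame *frames, int current, int level) {
    int addr = current;
    while (level-- > 0) {
        assert(addr != TOY_NO_FRAME);
        addr = frames[addr].env;
    }
    assert(addr != TOY_NO_FRAME);
    return addr;
}

/* Resolve one (level, index) pair into a closure value. */
static struct toy_closure toy_resolve_pair(const struct toy_frame *frames,
                                           int current,
                                           uint16_t level, uint16_t index) {
    int closed = toy_walk_env(frames, current, level);
    struct toy_closure result;
    if (index & TOY_NEW_CLOSURE) {
        /* Index names code: pair it with the closed frame. */
        result.env = closed;
        result.code = frames[closed].subfunctions[index & ~TOY_NEW_CLOSURE];
    } else {
        /* Index names an existing closure: copy it unchanged. */
        assert(index < TOY_MAX_SLOTS);
        result = frames[closed].params[index];
    }
    return result;
}

/* operands[0] is nclosures; the (level, index) pairs follow it. */
static void toy_resolve_call(const struct toy_frame *frames, int current,
                             const uint16_t *operands,
                             struct toy_closure *out) {
    uint16_t nclosures = operands[0];
    for (uint16_t i = 0; i < nclosures; i++)
        out[i] = toy_resolve_pair(frames, current,
                                  operands[1 + 2 * i], operands[2 + 2 * i]);
}
```

In this model, passing bar straight through copies the caller's closure, so the callee sees the same environment the caller did, while bar + 1 yields a fresh closure whose environment is the caller's own frame.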