Metaprogramming - oilshell/oil GitHub Wiki

Papers:


Idea: Use "Lisp-like AST metaprogramming, but with Syntax".

  • Scheme: ,foo is (unquote foo), ,@foo is unquote-splicing, `(a b) is quasiquote

  • Clojure: ~foo is (unquote foo)

  • Python-like languages with metaprogramming (TODO: transcribe examples)

  • Julia --

  • Elixir metaprogramming uses quote/end for quotation, and unquote for interpolation. (In fact the entire Elixir language appears to be done with AST metaprogramming, since it's on top of Erlang.)

    • defmacro for macros, arguments unevaluated
    • Monkey language macro system is based on Elixir's system:
      • https://interpreterbook.com/lost/
      • quote, unquote, macro() { }, and mymacro()
      • macro invocation isn't distinguished from function call -- walk the AST and use a Go dynamic type check for the object.Macro type
      • requires an AST walker because you have to do multiple walks:
        • search for unquote() within unevaluated AST subtrees
        • define macros
        • expand macros
      • difficulties
      • limitations:
        • what about lexical modification? like c=n; echo -e "\$$c". Need eval(string, ctx) too?
        • statements vs. expressions: Currently, we only allow passing expressions to quote and unquote. One consequence of that is that we can't use a return statement or a let statement as an argument in a quote() call, for example. The parser won't let us, simply because arguments in call expression can only be of type ast.Expression.
  • scalameta.org -- q"" for quotation, $var for interpolation.

  • R metaprogramming -- everything is quoted implicitly because it's lazily evaluated

    • substitute, deparse, eval, etc. Need to look at examples.
    • R is unique in that it has lazy evaluation? So everything can be metaprogrammed before it is used?
      • macros are just functions? what about scope?
    • Non-standard evaluation by Hadley Wickham
    • Oil and the R Language
    • Programming with dplyr
      • quo() returns a quosure, which is a special type of formula
      • enquo() uses some dark magic to look at the argument, see what the user typed, and return that value as a quosure.
      • If you’re familiar with quote() and substitute() in base R, quo() is equivalent to quote() and enquo() is equivalent to substitute().
      • we quote the variable with quo(), then unquote it in the dplyr call with !!. Notice that we can unquote anywhere inside a complicated expression.
      • Use quos() to capture all the ... as a list of formulas.
      • Use !!! instead of !! to splice the arguments into group_by().
      • Automatic quoting makes dplyr very convenient for interactive use. But if you want to program with dplyr, you need some way to refer to variables indirectly. The solution to this problem is quasiquotation, which allows you to evaluate directly inside an expression that is otherwise quoted.
      • The first important operation is the basic unquote, which comes in a functional form, UQ(), and as syntactic-sugar, !!
      • Unquote-splicing: its functional form is UQS() and the syntactic shortcut is !!!
      • The final unquote operation is setting argument names. You’ve seen one way to do that above, but you can also use the definition operator := instead of =. := supports unquoting on both the LHS and the RHS.
    • Tidy evaluation, most common actions
    • Non-standard evaluation, how tidy eval builds on base R
  • C++ Proposal led by Herb Sutter: Metaprogramming in C++ https://www.youtube.com/watch?v=4AfRAVcThyA&t=1649s

    • syntax (this is a proposal, so syntax may change):
      • constexpr { } blocks for things that must be evaluated at compile time.
      • -> { } blocks for runtime code
      • $ syntax for compile time variables. For types only, not expressions or statements?
    • Example: getting string names from an enum. Very relevant to lexing/parsing! Many languages have a tiny code generator for tokens and AST nodes.
    • He is selling it pretty hard, saying "this is already what we do", "we're not turning C++ into Lisp", etc.
    • "constexpr all the things" -- e.g. STL algorithms and data types
  • Clang AST

I saw a video where people asked why Clang source tools generate textual changes rather than AST changes... and this is a good example. People for some reason think that ASTs are "cleaner" or more usable, but they can be a pain.

https://news.ycombinator.com/item?id=13630134
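The quote / unquote / unquote-splicing trio that recurs in the Scheme, Clojure, Elixir, Monkey and R notes above can be sketched on top of Python's own `ast` module. This is only an illustration under stated assumptions: `quote`, `quasiquote`, and the `unquote(...)` / `splice(...)` marker calls are hypothetical names invented here, not part of Python or Oil, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import copy


def quote(src):
    """Parse one expression and return its AST without evaluating it."""
    return ast.parse(src, mode="eval").body


def _marker(node, name):
    """True if node is a marker call such as unquote(foo) or splice(foo)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
        and len(node.args) == 1
        and isinstance(node.args[0], ast.Name)
    )


class _Expander(ast.NodeTransformer):
    """Walks a quoted tree and fills in unquote/splice markers from env."""

    def __init__(self, env):
        self.env = env

    def visit_Call(self, node):
        if _marker(node, "unquote"):
            # Substitute a copy so the caller's node is never aliased.
            return copy.deepcopy(self.env[node.args[0].id])
        self.generic_visit(node)
        args = []
        for arg in node.args:
            if _marker(arg, "splice"):
                # Splicing inserts each element of a list of nodes in place.
                args.extend(copy.deepcopy(n) for n in self.env[arg.args[0].id])
            else:
                args.append(arg)
        node.args = args
        return node


def quasiquote(src, **env):
    """Quote src, then expand the hypothetical unquote()/splice() markers."""
    return _Expander(env).visit(quote(src))


x = quote("a + 1")
rest = [quote("b"), quote("c * 2")]
tree = quasiquote("f(unquote(x), splice(rest))", x=x, rest=rest)
print(ast.unparse(tree))  # f(a + 1, b, c * 2)
```

Like the Monkey macro system, this sketch only handles expressions, and a `splice(...)` marker is only expanded when it appears directly in a call's positional arguments.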

Compile-Time Metaprogramming in Systems Languages

Comments on "Outperforming everything with anything Python? Sure, why not?"

Types of Metaprogramming

  • before lexer -- code generation
  • before parser -- not sure this exists? Generate tokens? Yes this is how the C preprocessor works! It has roughly the same lexer as C, but a different parser!
  • before compiler -- AST metaprogramming.
  • at runtime, after compiler -- reflection.
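The four stages above can each be tried out in Python, which exposes its lexer, parser and runtime as libraries. The following is a rough sketch rather than a recommended design; the helper names (`load`, `gen_source`, `expand_tokens`, `MulToAdd`, `rewrite_ast`, `describe`) are made up for illustration, and the token stage imitates a C-preprocessor-style `#define` only loosely.

```python
import ast
import inspect
import io
import tokenize


def load(code, name):
    """Compile source text or an ast.Module and return one definition from it."""
    namespace = {}
    exec(compile(code, "<generated>", "exec"), namespace)
    return namespace[name]


# 1. Before the lexer: generate source text.
def gen_source(name, factor):
    return f"def {name}(x):\n    return x * {factor}\n"


# 2. Before the parser: rewrite the token stream.
def expand_tokens(src, defines):
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type == tokenize.NAME and tok.string in defines:
            # Replace the macro name with a number token.
            out.append((tokenize.NUMBER, defines[tok.string]))
        else:
            out.append((tok.type, tok.string))
    # With (type, string) pairs, untokenize emits valid but oddly spaced code.
    return tokenize.untokenize(out)


# 3. Before the compiler: rewrite the AST.
class MulToAdd(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Mult):
            node.op = ast.Add()
        return node


def rewrite_ast(src):
    tree = MulToAdd().visit(ast.parse(src))
    return ast.fix_missing_locations(tree)


# 4. At runtime, after the compiler: reflection.
def describe(func):
    return func.__name__, list(inspect.signature(func).parameters)


triple = load(gen_source("triple", 3), "triple")
scale = load(expand_tokens("def scale(x):\n    return x * FACTOR\n", {"FACTOR": "3"}), "scale")
added = load(rewrite_ast(gen_source("f", 3)), "f")
print(triple(5), scale(5), added(5), describe(triple))
```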

Oil Design

  • Do template-like metaprogramming with auto-escaping? That means you need to lex languages rather than parse them? You can do this with HTML, but I'm not sure about other languages.
  • Philosophy: Oil is about metaprogramming other languages (primarily), not metaprogramming itself!
  • But we do need a syntax for lazy evaluation, for R-like expressions. (I don't think we need statements).
    • I think this can just be quotation and interpolation. AST nodes can be opaque/immutable.
    • syntax: filter(df, \uri == uri), or filter(df, \(uri == $$uri)). $$ or % could be interpolation.
  • And we do have eval() -- for feature detection, at the very least.
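To make the quotation-plus-interpolation idea concrete, here is a small Python sketch of an R-like `filter` over rows. Everything in it is an assumption for illustration: the `Quoted` class, the `filter_rows` helper, and the choice to spell interpolation as `$$name` inside a string are not Oil syntax or any existing API. Bare names resolve against each row when the expression is evaluated, while `$$name` is captured from the caller's scope when the expression is quoted.

```python
import re

_INTERP = re.compile(r"\$\$([A-Za-z_]\w*)")


class Quoted:
    """An opaque handle on an unevaluated expression (hypothetical design)."""

    __slots__ = ("source", "_bound", "_code")

    def __init__(self, source, scope):
        self.source = source
        # Capture interpolated values now, under mangled names.
        self._bound = {"_interp_" + n: scope[n] for n in _INTERP.findall(source)}
        self._code = compile(_INTERP.sub(r"_interp_\1", source), "<quoted>", "eval")

    def eval(self, row):
        """Evaluate lazily: bare names come from the row, $$names from capture."""
        env = dict(row)
        env.update(self._bound)
        # Emptying __builtins__ keeps the example small; it is not a sandbox.
        return eval(self._code, {"__builtins__": {}}, env)


def filter_rows(rows, quoted):
    return [row for row in rows if quoted.eval(row)]


rows = [{"uri": "/a", "hits": 3}, {"uri": "/b", "hits": 7}]
uri = "/b"
print(filter_rows(rows, Quoted("uri == $$uri", {"uri": uri})))
```

The caller never inspects or mutates the quoted expression, which matches the note that AST nodes can stay opaque.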

Language Pairs

  • Lua/Terra -- one is dynamic and one is static.
  • C preprocessor and C (same "lexical context")
  • "C with classes" and C++ template metaprogramming -- two different languages
    • new "Meta" proposal: C++ and C++
  • Oil implementation: Python + C (and C++), via textual code generation.
  • Python and TensorFlow -- both are dynamic?
    • C++ and C++ template metaprogramming -- Eigen
  • Scala?
  • OCaml and MetaOCaml -- both are static? I think it's not possible to generate a program that doesn't pass the "normal" OCaml type checker?
  • Python and RPython -- The hard-to-understand part is that Python is a meta-programming language for RPython

Links

Notes on Converge Paper: Compile-time metaprogramming in a dynamically typed OO language

  • based on Template Haskell
  • lexical vs. syntactic (AST) macros
    • lexical macros: you need an entirely new language!
    • Lisp macro systems require the compiler to recognize macros as different than functions
    • languages such as TemplateHaskell distinguish only the macro call itself. macros can be any function in the host language.
  • Lisp's syntactic minimalism lends itself to metaprogramming
  • Converge language
    • similar to OPy -- statically analyzable namespaces! I think this is to distinguish compile-time vs. runtime variables.
    • has offline compilation and linking step too!
    • uses func keyword
    • no global scope, just local
  • section on scoping rules, for compile time and runtime variables
  • nice section on error reporting! One of the most significant unresolved problems.
  • use cases:
    • conditional compilation
    • runtime compilation of printf (similar to what Python's f-strings now do)
  • user experience: compile-time metaprogramming in its rawest form is not likely to be grasped by every potential developer
  • language design implications
    • must be able to determine names statically
  • Compiler architecture is no longer linear? Didn't quite understand this part. There is quasi-quote mode and splicing mode.
  • AST design
    • heterogeneous vs. homogeneous -- he chooses heterogeneous, somewhat dismisses homogeneous
    • ASTs should be immutable! Because of aliasing I guess. Python's are mutable.
  • new extension: arbitrary DSLs compiling to Converge code! yes.

Second Converge Paper

  • DSL embedding in Converge, which describes DSL blocks.
    • Converge Parsing Kit uses Earley Parsing, inspired by SPARK parser (used in original Python ASDL implementation)
      • quite slow at 1000 lines/second, or 1 line per millisecond!
    • src_info concept -- attributing errors to multiple locations
    • alpha renaming for hygiene
    • rewriting the tokenizer? You can use Converge's tokenizer, with a list of optional keywords, or you can provide your own tokenizer
    • example: ORM! Translating SQL schemas to Converge type definitions (or the opposite?).
    • didn't like his terminology of "heterogeneous" and "homogeneous", to mean 2-language vs. 1-language
      • this was even different than a paper cited
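The "alpha renaming for hygiene" note can be illustrated with a toy macro in Python. This is a sketch of the general technique, not of Converge's implementation: `gensym`, `Rename` and `expand_swap` are hypothetical names, and the fresh-name scheme assumes user code never uses identifiers of the form `_tmp_gN`.

```python
import ast
import itertools

_counter = itertools.count()


def gensym(base):
    """Return a fresh name; assumes users never write names like _tmp_g0."""
    return f"_{base}_g{next(_counter)}"


class Rename(ast.NodeTransformer):
    """Renames identifiers in one pass according to a mapping."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        new = self.mapping.get(node.id)
        if new is None:
            return node
        return ast.copy_location(ast.Name(id=new, ctx=node.ctx), node)


def expand_swap(a, b, hygienic=True):
    """Expand a swap macro whose template introduces a temporary 'tmp'."""
    tree = ast.parse("tmp = A\nA = B\nB = tmp\n")
    mapping = {"A": a, "B": b}
    if hygienic:
        # Alpha-rename the macro's own variable so it cannot capture user names.
        mapping["tmp"] = gensym("tmp")
    return ast.fix_missing_locations(Rename(mapping).visit(tree))


def run(tree, variables):
    exec(compile(tree, "<macro>", "exec"), variables)
    return variables


# The user happens to have a variable called tmp.
print(ast.unparse(expand_swap("tmp", "y", hygienic=False)))
good = run(expand_swap("tmp", "y"), {"tmp": 1, "y": 2})
bad = run(expand_swap("tmp", "y", hygienic=False), {"tmp": 1, "y": 2})
print(good["tmp"], good["y"], bad["tmp"], bad["y"])
```

Without the renaming, the template's `tmp` collides with the user's `tmp` and the swap silently loses a value; with it, the two names stay distinct.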

Use Cases

More: Metaprogramming Use Cases