DSL ~ Intro to the DSL Feature - uchicago-cs/chiventure GitHub Wiki

The DSL feature translates chiventure's Domain Specific Language (DSL) into WDL, the format that chiventure can process. Because the feature is written in Python, there are no header files to serve as user-facing interfaces. We nevertheless separated the code into modules that serve distinct purposes.

Directory Structure

The chiventure/src/dsl/ directory is currently organized as follows:

├── README.md
├── conda_environment.yml
├── src (.py files)
├── grammars (.lark files)
└── examples
    ├── dsl (.dsl files)
    └── wdl (.wdl files)

The README introduces the DSL and the example .dsl files. It also provides directions on how to install Lark and how to run the parser.

conda_environment.yml is a conda environment file that installs Lark together with the appropriate dependencies.

There are 3 directories: src, grammars, and examples:

  • grammars contains .lark files that define the grammar of the DSL for the Lark parser.

  • src contains Python files that parse the DSL into WDL.

  • examples contains two sub-directories, dsl and wdl, that respectively contain sample .dsl and .wdl files for testing and demonstration. Descriptions for the specific examples can be found in the README.

grammars: Grammar Definition

The grammars directory currently contains 3 .lark files: dsl_grammar.lark, util.lark, and vars.lark.

  • dsl_grammar.lark defines the DSL grammar in the format that Lark accepts. A reference for this format can be found in Lark's grammar documentation.

  • util.lark defines "utility" grammars, like escaped strings and comments, that might be used in multiple different files. It also imports a few useful pre-defined definitions from Lark's common library.

  • vars.lark defines rules for using variables in the DSL. This provides Lark with instructions on how to recognize, match and replace variables when parsing the DSL.

src: Parsing

The src directory contains 4 Python files: dsl_parser.py, parse.py, to_wdl.py, and vars.py. Each of the files acts as a module in the code design.

  • parse.py is the main user-facing module. It contains the main() function, which handles the entire process of translating the DSL to WDL by calling functions defined in the other three modules. It also opens the DSL file, writes the generated WDL file, and supports command line flags that can be specified when running the parser.

  • vars.py uses Lark to evaluate the variables, if any, in the given DSL file and replace the variables with their assigned values. The main outward-facing function is evalVars(), which returns the DSL, as a string, with variables replaced by their respective values.

  • dsl_parser.py parses the DSL according to the grammar defined in grammars/dsl_grammar.lark. The main outward-facing function is export_dict(), which outputs a transformed syntax tree that acts as an intermediate stage between DSL and WDL. In the intermediate stage, the information in the DSL is extracted and loaded into Python data structures, which can then be easily manipulated into WDL.

  • to_wdl.py takes the intermediate stage output from the dsl_parser.py module and manipulates it into the final JSON/WDL format. The main outward-facing function is parsed_dict_to_json(), which returns the WDL in its JSON format.
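
The three stages described above (variable evaluation, parsing into an intermediate Python structure, and conversion to JSON) can be sketched as a small pipeline. This is only an illustration of the data flow: the functions below are simplified stand-ins written with regular expressions rather than Lark, their names and signatures are assumptions rather than the real evalVars(), export_dict(), and parsed_dict_to_json(), and the toy `$name = value` and `ROOM id: text` syntax and the output keys are hypothetical, not the actual DSL or WDL schema.

```python
import json
import re


def eval_vars_sketch(dsl_text: str) -> str:
    """Stage 1 stand-in: replace each $name with the value from a `$name = value` line."""
    values = {}
    kept_lines = []
    for line in dsl_text.splitlines():
        match = re.fullmatch(r"\$(\w+)\s*=\s*(.*)", line.strip())
        if match:
            values[match.group(1)] = match.group(2)
        else:
            kept_lines.append(line)
    body = "\n".join(kept_lines)
    # An undefined variable raises KeyError in this sketch.
    return re.sub(r"\$(\w+)", lambda m: values[m.group(1)], body)


def export_dict_sketch(dsl_text: str) -> dict:
    """Stage 2 stand-in: load `ROOM <id>: <description>` lines into a plain dict."""
    rooms = {}
    for line in dsl_text.splitlines():
        match = re.fullmatch(r"ROOM (\w+): (.*)", line.strip())
        if match:
            rooms[match.group(1)] = {"description": match.group(2)}
    return {"rooms": rooms}


def to_json_sketch(intermediate: dict) -> str:
    """Stage 3 stand-in: serialize the intermediate dict as JSON text."""
    return json.dumps(intermediate, indent=2)


def translate(dsl_text: str) -> str:
    """Chain the three stages in the order that parse.py is described as using."""
    return to_json_sketch(export_dict_sketch(eval_vars_sketch(dsl_text)))


sample = "$mood = gloomy\nROOM attic: A $mood attic"
wdl_text = translate(sample)
print(wdl_text)
```

Keeping the intermediate stage as ordinary dictionaries means the final step reduces to reshaping and serializing data, which is the separation of concerns that the module layout above suggests.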