Workflow Generation Architecture: CROW to pyecflow Evolution

Date: February 27, 2026

Context: Architectural analysis of workflow generation approaches across global-workflow and ecFlow tooling, derived from source code examination of global-workflow_crow, global-workflow (current), and pyecflow (families_tasks branch).


Glossary of Terms

| Term | Definition |
|---|---|
| CROW | Configuration for Rocoto and ecFlow Workflows: a Python/YAML toolbox (NOAA-EMC/CROW.git) that generated both Rocoto XML and ecFlow .def/.ecf files from a single YAML source. Used in global-workflow circa 2018–2020. |
| pyecflow | Python ecFlow framework (NOAA-EMC/pyecflow) that wraps the pyflow API to generate ecFlow suites. The families_tasks branch introduces YAML-driven workflow definition. |
| pyflow | ECMWF's Python API for building ecFlow suites programmatically. Provides Suite, AnchorFamily, Task, and Variable objects that generate .def files and .ecf scripts. |
| Rocoto | XML-driven workflow manager used on NOAA HPC systems. Parses a workflow XML file defining tasks, dependencies, and resources, then manages job submission and retry. |
| SPOT | Single Point of Truth: the design principle that each piece of configuration should be defined in exactly one place. |
| metascheduler | CROW's backend abstraction layer (CROW/crow/metascheduler/). Contains rocoto.py and ecflow.py modules that render the same YAML workflow tree into different output formats. |
| !Depend | CROW's scheduler-agnostic dependency expression language. Supports & (AND), \| (OR), cross-cycle references (forecast.at('-6:00:00')), event references, and negation (~). Rendered to <taskdep> XML for Rocoto or trigger statements for ecFlow. |
| !Family / !Task / !Cycle | CROW's custom YAML tags. Registered as Python type constructors, these represent workflow nodes without depending on any ecflow or pyflow library. |
| !calc / !expand / !FirstTrue | CROW's YAML expression system: embedded Python evaluation (!calc), string interpolation (!expand), and conditional selection (!FirstTrue). Powerful but hard to debug. |
| generate_tree() | pyecflow's WorkflowSuite method that recursively walks a nested dict (from YAML) and builds pyflow AnchorFamily/Task objects. The core of pyecflow's declarative approach. |
| _extract_family_hierarchy() | pyecflow's WorkflowSuite static method that flattens a nested YAML config into a path→children mapping for family creation. |
| WorkflowTask | pyecflow class wrapping pf.Task. Accepts a context dict with variables and script keys matching the YAML structure. |
| WorkflowAnchorFamily | pyecflow class wrapping pf.AnchorFamily. Validates the context type and extracts variables from YAML-sourced config. |

Three Generations of Workflow Generation

Generation 1: CROW (2018–2020)

Approach: Declarative YAML → dual text generation (Rocoto XML + ecFlow .def/.ecf)

Architecture:

workflow/layout/*.yaml          (suite structure: !Family, !Task, !Depend)
workflow/runtime/task.yaml      (dual-backend task template)
workflow/config/*.yaml          (task-specific settings)
workflow/schema/task.yaml       (property validation via !Template)
        │
        ▼
CROW/crow/metascheduler/
  ├── rocoto.py  → PSLOT.xml + PSLOT.crontab
  └── ecflow.py  → suite.def + per-task .ecf scripts

Key Design Decisions:

  • YAML as single source of truth for workflow structure, dependencies, resources, and variables
  • Custom YAML tags (!Family, !Task, !Depend, !Cycle) as scheduler-agnostic abstractions that do not depend on the ecflow or pyflow Python packages
  • A polymorphic metasched object: metasched.type returns the backend name, metasched.varref() renders a variable reference as &ENTITY; (Rocoto) or %VAR% (ecFlow), and metasched.defenvar() renders a variable definition as <envar> (Rocoto) or edit (ecFlow)
  • Each task carried parallel properties: Rocoto: (XML body), ecf_file: (script content), rocoto_command:, ecflow_command:
  • Generated ecFlow artifacts as plain text strings, so no ecflow Python package was needed
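
The backend-polymorphic rendering described in the list above can be sketched as follows. This is a minimal illustration, not CROW's actual code: the `MetaSched` class and `render_command` helper are hypothetical names, and only the `type`, `varref()`, and `defenvar()` members come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaSched:
    """Hypothetical stand-in for CROW's backend-polymorphic metasched object."""

    type: str  # backend name: "rocoto" or "ecflow"

    def varref(self, name: str) -> str:
        """Render a reference to a scheduler variable."""
        if self.type == "rocoto":
            return f"&{name};"  # XML entity reference
        return f"%{name}%"  # ecFlow variable marker

    def defenvar(self, name: str, value: str) -> str:
        """Render a variable definition for a task."""
        if self.type == "rocoto":
            return f"<envar><name>{name}</name><value>{value}</value></envar>"
        return f"edit {name} '{value}'"


def render_command(sched: MetaSched) -> str:
    """One template, two outputs: only the variable syntax differs."""
    return f"{sched.varref('HOMEgfs')}/jobs/JGLOBAL_ANALYSIS"


print(render_command(MetaSched("rocoto")))  # &HOMEgfs;/jobs/JGLOBAL_ANALYSIS
print(render_command(MetaSched("ecflow")))  # %HOMEgfs%/jobs/JGLOBAL_ANALYSIS
print(MetaSched("ecflow").defenvar("model", "gfs"))  # edit model 'gfs'
```

Because the task template only ever calls the `metasched` methods, the same YAML-driven template can target either scheduler without importing a scheduler library.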

What Worked:

  • True SPOT: one YAML defined the full workflow for both schedulers
  • Clean dependency expression language (!Depend) with cross-cycle and event support
  • Platform abstraction via workflow/platforms/*.yaml

What Didn't:

  • !calc / !expand / !FirstTrue created a Turing-complete language inside YAML that was hard to debug and hard for new developers to learn
  • Dual-backend rendering doubled the surface area of every task definition
  • Text-based .def/.ecf generation couldn't leverage ecFlow server APIs for runtime interaction
  • The complexity burden outweighed the benefit for teams that only used Rocoto

Generation 2: Current global-workflow Rocoto (2020–present)

Approach: Imperative Python → Rocoto XML only

Architecture:

dev/workflow/rocoto/
  ├── gfs_tasks.py   (GFSTasks class: each task is a Python method)
  ├── tasks.py        (Tasks base class: VALID_TASKS, resources, env vars)
  ├── rocoto.py       (XML string generators: create_task(), add_dependency())
  └── rocoto_xml.py   (RocotoXML assembler: preamble + defs + cycledefs + tasks)

Key Design Decisions:

  • Dropped ecFlow generation entirely; Rocoto-only
  • Each task defined as a Python method that builds a task_dict and calls rocoto.create_task(task_dict)
  • Dependencies declared imperatively alongside task definitions
  • No YAML configuration layer; everything lives in Python
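
The imperative pattern in the list above might look roughly like the sketch below. It is a simplified illustration, not the code in gfs_tasks.py or rocoto.py: the `SketchTasks` class, the dict keys, and the body of `create_task()` are assumptions made for the example, and the command path is a placeholder.

```python
from xml.sax.saxutils import escape, quoteattr


def create_task(task_dict: dict) -> str:
    """Hypothetical renderer: turn a task dict into a Rocoto <task> element."""
    lines = [
        f"<task name={quoteattr(task_dict['task_name'])} "
        f"cycledefs={quoteattr(task_dict['cycledef'])} "
        f"maxtries={quoteattr(str(task_dict.get('maxtries', 2)))}>",
        f"  <command>{escape(task_dict['command'])}</command>",
    ]
    for name, value in task_dict.get("envars", {}).items():
        lines.append(
            f"  <envar><name>{escape(name)}</name><value>{escape(value)}</value></envar>"
        )
    deps = task_dict.get("dependency", [])
    if deps:
        taskdeps = "".join(f"<taskdep task={quoteattr(d)}/>" for d in deps)
        # Several dependencies are combined with <and>; one stands alone.
        body = f"<and>{taskdeps}</and>" if len(deps) > 1 else taskdeps
        lines.append(f"  <dependency>{body}</dependency>")
    lines.append("</task>")
    return "\n".join(lines)


class SketchTasks:
    """Each task is a method; structure and rendering live side by side."""

    def __init__(self, run: str):
        self.run = run

    def anal(self) -> str:
        task_dict = {
            "task_name": f"{self.run}_anal",
            "cycledef": self.run,
            "command": "/path/to/jobs/rocoto/anal.sh",  # placeholder path
            "envars": {"RUN": self.run},
            "dependency": [f"{self.run}_prep"],
        }
        return create_task(task_dict)


print(SketchTasks("gdas").anal())
```

Conditional logic is ordinary Python here, which is the main appeal of this generation; the cost is that every new task needs a new method.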

What Worked:

  • Simpler to understand: Python is Python, no custom YAML DSL
  • Direct control over XML output
  • Easy to add conditional logic without YAML gymnastics

What's Limited:

  • No path back to ecFlow without rewriting
  • Task definitions mix structure (what to run) with rendering (how to make XML), which violates separation of concerns
  • Adding a new task requires writing a Python method, not editing a config file

Generation 3: pyecflow families_tasks branch (2025–present)

Approach: Declarative YAML → pyflow Python API → ecFlow .def/.ecf

Architecture:

tests/data/*.yaml               (YAML configs: families, tasks, variables, scripts)
        │
        ▼
pyecflow/
  ├── workflow_suite.py          (WorkflowSuite: generate_tree() + generate_suite())
  ├── workflow_anchorfamily.py   (WorkflowAnchorFamily: context validation)
  └── workflow_task.py           (WorkflowTask: variables + script from context dict)
        │
        ▼
pyflow API (pf.Suite, pf.AnchorFamily, pf.Task)
        │
        ▼
.def file + .ecf scripts + deploy paths

What's Working:

  • YAML defines family hierarchy, tasks, variables, and scripts declaratively
  • generate_tree() recursively walks nested dicts from yaml.safe_load()
  • Uses pyflow's Python API (not text generation), so deploy paths, validation, and server interaction come for free
  • Path-based family disambiguation (e.g., family_A/shared_child vs family_B/shared_child)
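
The recursive walk and path-based disambiguation listed above can be illustrated with a small sketch. It does not reproduce pyecflow's generate_tree(): the `Node` class stands in for pyflow's family and task objects, and the `families` / `tasks` key layout of the config is an assumption made for this example. The dict literal is the kind of structure yaml.safe_load() would return.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a pyflow node (family or task); not the real pyflow API."""

    kind: str
    path: str
    variables: dict = field(default_factory=dict)
    script: str = ""
    children: list = field(default_factory=list)


def generate_tree(name: str, config: dict, parent_path: str = "") -> Node:
    """Recursively turn a nested dict into a tree of family and task nodes."""
    path = f"{parent_path}/{name}"
    family = Node("family", path, variables=dict(config.get("variables", {})))
    for task_name, ctx in config.get("tasks", {}).items():
        family.children.append(
            Node(
                "task",
                f"{path}/{task_name}",
                variables=dict(ctx.get("variables", {})),
                script=ctx.get("script", ""),
            )
        )
    for child_name, child_cfg in config.get("families", {}).items():
        family.children.append(generate_tree(child_name, child_cfg, path))
    return family


def walk(node: Node):
    """Yield every node in depth-first order."""
    yield node
    for child in node.children:
        yield from walk(child)


config = {
    "families": {
        "family_A": {"families": {"shared_child": {"tasks": {"t1": {"script": "echo A"}}}}},
        "family_B": {"families": {"shared_child": {"tasks": {"t1": {"script": "echo B"}}}}},
    }
}

suite = generate_tree("suite", config)
for node in walk(suite):
    print(node.kind, node.path)
```

Keying nodes by full path rather than by name is what lets two families both own a child called shared_child without colliding.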

What's Missing (relative to CROW's completeness):

  • No triggers/dependencies in YAML (the biggest SPOT gap)
  • No events, meters, or labels in YAML schema
  • No resources/walltime/queue in context dict
  • No repeat/time/cron definitions
  • No YAML schema validation (jsonschema or pydantic)

CROW's Dual-Backend Mechanism (Detail)

CROW supported both backends without importing ecflow or pyflow by treating both as text-file formats:

ecFlow .def format (structured text):

suite prod00
  family gfs
    task jgfs_atmos_analysis
      trigger ../obsproc/prep/jgfs_atmos_prep == complete
      edit model 'gfs'
  endfamily
endsuite

ecFlow .ecf format (shell scripts with %VAR% markers):

%include <head.h>
export cyc=%CYC%
$HOMEgfs/jobs/JGLOBAL_ANALYSIS
%include <tail.h>

Rocoto XML (workflow definition):

<task name="jgdas_analysis" cycledefs="gdas" maxtries="2">
  <command>sh -c '...'</command>
  <envar><name>CDATE</name><value><cyclestr>@Y@m@d@H</cyclestr></value></envar>
  <dependency><taskdep task="jgdas_prep"/></dependency>
</task>

All three formats are plain text. CROW generated them with Python string operations: !expand for interpolation, metasched.* for backend-polymorphic rendering.

Dependency translation example:

# CROW YAML (scheduler-agnostic)
Trigger: !Depend ( up.prep.jgdas_prep & up.prep.jgdas_emcsfc_sfc_prep )

Became:

  • Rocoto: <dependency><and><taskdep task="jgdas_prep"/><taskdep task="jgdas_emcsfc_sfc_prep"/></and></dependency>
  • ecFlow: trigger ../prep/jgdas_prep == complete and ../prep/jgdas_emcsfc_sfc_prep == complete
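
The translation above can be reproduced with a small expression tree that renders itself once per backend. This is a sketch of the idea only, not CROW's !Depend parser: the `TaskDep` and `And` classes are hypothetical, and OR, negation, events, and cross-cycle references are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDep:
    """A dependency on one task completing."""

    family: str  # sibling family that holds the task, e.g. "prep"
    task: str

    def __and__(self, other):
        return And((self, other))

    def to_rocoto(self) -> str:
        return f'<taskdep task="{self.task}"/>'

    def to_ecflow(self) -> str:
        return f"../{self.family}/{self.task} == complete"


@dataclass(frozen=True)
class And:
    """Conjunction of dependency terms."""

    terms: tuple

    def __and__(self, other):
        return And(self.terms + (other,))

    def to_rocoto(self) -> str:
        return "<and>" + "".join(t.to_rocoto() for t in self.terms) + "</and>"

    def to_ecflow(self) -> str:
        return " and ".join(t.to_ecflow() for t in self.terms)


dep = TaskDep("prep", "jgdas_prep") & TaskDep("prep", "jgdas_emcsfc_sfc_prep")
print(f"<dependency>{dep.to_rocoto()}</dependency>")
print(f"trigger {dep.to_ecflow()}")
```

The dependency is stated once and each backend supplies only its own rendering, which is the property that made !Depend scheduler-agnostic.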

Lessons for pyecflow's Roadmap

Adopt from CROW

| Feature | CROW Implementation | Recommended pyecflow Approach |
|---|---|---|
| Dependencies in YAML | !Depend DSL with custom parser | Simple string or structured dict in YAML, parsed by generate_tree() into pf.Trigger calls |
| Events in YAML | !DataEvent tag | Add events: key to task context dict → create pf.Event objects |
| Resources in YAML | resources: !calc partition.resources.run_anal | Add resources: key with walltime, nodes, tasks_per_node |
| Cross-cycle references | forecast.at('-6:00:00') | Leverage pyflow's Trigger with extern nodes |
| Schema validation | !Template YAML tag | jsonschema or pydantic model for the YAML config structure |

Do Not Adopt from CROW

| Anti-Pattern | Why It Failed | pyecflow Alternative |
|---|---|---|
| !calc / !expand / !FirstTrue expression language inside YAML | Created an opaque mini-language; hard to debug, hard to onboard developers | Keep YAML as pure data; Python handles all logic |
| Dual-backend rendering (Rocoto + ecFlow) | Doubled complexity for every task definition | ecFlow-only via the pyflow API, which is the correct scope |
| Text-based .def/.ecf generation | Couldn't leverage ecFlow server features, deploy path management, or validation | Use pyflow's pf.Suite / pf.AnchorFamily / pf.Task objects |
| Shared j-jobs across backends | Forced lowest-common-denominator job scripts | pyecflow generates proper .ecf scripts with ecFlow-native features |

The Winning Formula

Declarative YAML for data (hierarchy, dependencies, variables, resources, events) + Python for logic (conditional task creation, platform decisions, validation) + pyflow API for generation (deploy paths, server interaction, .def/.ecf output)

This is where generate_tree(yaml_config) is heading: it needs the YAML schema expanded to cover triggers, events, and repeats while keeping Python responsible for interpreting them through pyflow's API.
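
One way to keep triggers as pure data while Python does the interpreting is a structured dict that a helper turns into an ecFlow trigger expression. The sketch below assumes a hypothetical schema (`all` / `any` keys holding lists of node paths) and a hypothetical helper name, `trigger_expr`; neither exists in pyecflow today, and how the resulting string is handed to pyflow is left open.

```python
def trigger_expr(spec) -> str:
    """Render a structured trigger spec as an ecFlow trigger expression.

    A string is a node path that must be complete. A one-key dict,
    {"all": [...]} or {"any": [...]}, combines sub-specs with and/or.
    """
    if isinstance(spec, str):
        return f"{spec} == complete"
    if not isinstance(spec, dict) or len(spec) != 1:
        raise ValueError(f"invalid trigger spec: {spec!r}")
    ((op, terms),) = spec.items()
    if op not in ("all", "any") or not isinstance(terms, list) or not terms:
        raise ValueError(f"invalid trigger spec: {spec!r}")
    joiner = " and " if op == "all" else " or "
    parts = []
    for term in terms:
        text = trigger_expr(term)
        # Parenthesize nested groups so operator precedence stays explicit.
        parts.append(f"({text})" if isinstance(term, dict) else text)
    return joiner.join(parts)


# Equivalent to what yaml.safe_load() would return for one task entry.
task_ctx = {
    "script": "echo analysis",
    "trigger": {"all": ["../prep/jgdas_prep", "../prep/jgdas_emcsfc_sfc_prep"]},
}
print(trigger_expr(task_ctx["trigger"]))
```

The YAML stays declarative (no embedded expressions to evaluate), and malformed specs fail loudly in Python instead of producing a broken suite.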


Priority Roadmap for pyecflow YAML Schema Extension

| Priority | Feature | Effort | Impact |
|---|---|---|---|
| 1 | Triggers/dependencies in YAML | Medium | Eliminates the biggest SPOT violation |
| 2 | Events in YAML context dict | Small | generate_tree() already recurses; add pf.Event creation |
| 3 | Resource/walltime/queue in context | Small | Extends context dict schema |
| 4 | Repeat/time attributes | Medium | Needs new YAML keys + pyflow repeat API |
| 5 | YAML schema validation | Small | Prevents silent misconfiguration |
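
To illustrate what the validation item guards against, here is a minimal standard-library check of a task context dict. It is a stand-in for a jsonschema or pydantic model, and the allowed key set is an assumption based on the features listed in this roadmap, not pyecflow's current schema.

```python
# Hypothetical key set for an extended task context dict.
ALLOWED_TASK_KEYS = {"variables", "script", "trigger", "events", "resources"}


def validate_task_context(name: str, ctx) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    if not isinstance(ctx, dict):
        return [f"{name}: context must be a mapping"]
    unknown = sorted(set(ctx) - ALLOWED_TASK_KEYS, key=str)
    errors = [f"{name}: unknown key '{key}'" for key in unknown]
    if not isinstance(ctx.get("script", ""), str):
        errors.append(f"{name}: 'script' must be a string")
    if not isinstance(ctx.get("variables", {}), dict):
        errors.append(f"{name}: 'variables' must be a mapping")
    events = ctx.get("events", [])
    if not isinstance(events, list) or not all(isinstance(e, str) for e in events):
        errors.append(f"{name}: 'events' must be a list of strings")
    return errors


good = {"script": "echo ok", "variables": {"CYC": "00"}, "events": ["release_fcst"]}
typo = {"scirpt": "echo oops", "events": "release_fcst"}

print(validate_task_context("anal", good))
print(validate_task_context("anal", typo))
```

Without a check like this, the misspelled `scirpt` key would be ignored and the task would be generated with an empty script, which is the silent misconfiguration the table refers to.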

Source References

All analysis derived from source code in the workspace:

  • CROW layout files: supported_repos/global-workflow_crow/workflow/layout/cycled_gfs.yaml, public_release_v1.yaml, free_forecast_gfs.yaml
  • CROW task template: supported_repos/global-workflow_crow/workflow/runtime/task.yaml
  • CROW task schema: supported_repos/global-workflow_crow/workflow/schema/task.yaml
  • CROW suite runtime: supported_repos/global-workflow_crow/workflow/runtime/suite.yaml
  • CROW generated artifacts: supported_repos/global-workflow_crow/ecflow/ecf/defs/prod00.def, ecflow/ecf/scripts/ (421 .ecf files)
  • Current Rocoto generation: supported_repos/global-workflow/dev/workflow/rocoto/gfs_tasks.py, tasks.py, rocoto.py, rocoto_xml.py
  • pyecflow source: supported_repos/ECFLOW/pyecflow/pyecflow/workflow_suite.py, workflow_anchorfamily.py, workflow_task.py
  • pyecflow test configs: supported_repos/ECFLOW/pyecflow/tests/data/task_test_config.yaml, fam_test_config.yaml, duplicate_name_config.yaml
  • pyecflow tests: supported_repos/ECFLOW/pyecflow/tests/test_workflow_task.py, test_workflow_anchorfamily.py, test_workflow_suite.py