AQM‐Workflow and Global Workflow (Community ‐ Operations version) for Generating EcFlow - TerrenceMcGuinness-NOAA/global-workflow GitHub Wiki

Workspace: # Global-Workflow ecFlow Implementation

The global-workflow project constructs ecFlow workflows using a combination of Python scripts and YAML configuration files. The implementation provides a flexible way to define complex workflow hierarchies with tasks, triggers, events, and other ecFlow components.

Core Components

  1. YAML Configuration: The workflow is primarily defined in YAML files (like ecflow_build.yml) that specify:

    • Suites
    • Families
    • Tasks
    • Triggers
    • Events
    • Repeats
    • Edits (environment variables)
  2. Python Framework: Several key Python modules handle the configuration parsing and ecFlow definition generation:

    • ecflow_setup.py - Loads and processes the YAML configuration
    • ecflow_definitions.py - Defines classes that extend ecFlow objects
    • setup_ecf.py - Top-level script that parses arguments and initiates the workflow creation

Workflow Construction Process

The process follows these general steps:

  1. Parse command-line arguments to get paths to configuration files
  2. Load environment variables from config files
  3. Parse the YAML configuration file
  4. Create ecFlow suites, families, and tasks according to the YAML structure
  5. Add edits (environment variables) to nodes
  6. Add triggers, events, and other attributes to nodes
  7. Generate ecFlow definition files and script files
  8. Save the definition file
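As a rough illustration of steps 3 to 8, the sketch below walks an already-parsed configuration (a plain dict standing in for the loaded YAML) and emits ecFlow definition text. The dictionary layout and the names `render_node` and `build_definition` are assumptions made for this example. They are not the schema or functions that ecflow_setup.py uses, and the real framework presumably builds nodes through the ecFlow Python API rather than writing text directly.

```python
# Hypothetical, simplified layout; the project's real YAML schema may differ.
config = {
    "suites": {
        "prod": {
            "edits": {"ECF_HOME": "/tmp/ecf"},
            "families": {
                "gfs": {
                    "tasks": {
                        "jgfs_prep": None,
                        "jgfs_forecast": {
                            "trigger": "jgfs_prep == complete",
                            "events": ["release_post"],
                        },
                    }
                }
            },
        }
    }
}


def render_node(kind, name, spec, depth=0):
    """Return ecFlow definition lines for one node and its children."""
    spec = spec or {}
    pad = "  " * depth
    lines = [f"{pad}{kind} {name}"]
    # Edits (ecFlow variables) come first, then trigger and events.
    for var, value in spec.get("edits", {}).items():
        lines.append(f"{pad}  edit {var} '{value}'")
    if "trigger" in spec:
        lines.append(f"{pad}  trigger {spec['trigger']}")
    for event in spec.get("events", []):
        lines.append(f"{pad}  event {event}")
    # Recurse into child families and tasks.
    for fam, fam_spec in spec.get("families", {}).items():
        lines += render_node("family", fam, fam_spec, depth + 1)
    for task, task_spec in spec.get("tasks", {}).items():
        lines += render_node("task", task, task_spec, depth + 1)
    # Suites and families are closed explicitly; tasks are not.
    if kind != "task":
        lines.append(f"{pad}end{kind}")
    return lines


def build_definition(cfg):
    """Render every suite in the configuration as definition text."""
    lines = []
    for suite, spec in cfg["suites"].items():
        lines += render_node("suite", suite, spec)
    return "\n".join(lines) + "\n"


definition = build_definition(config)
print(definition)
```

The point of the sketch is the shape of the translation: one recursive pass over the configuration, with each key mapping to one kind of ecFlow attribute.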

Key Classes

  • Ecflowsetup - Main class that orchestrates the workflow generation
  • Ecflowsuite - Manages suites and handles node organization
  • EcfNode - Base class for handling ranges and lists in node names
  • EcfTask, EcfFamily - Classes that extend ecFlow tasks and families

Ranges and Lists

A powerful feature is the ability to use ranges and lists to create multiple similar nodes:

jgfs_atmos_post_f(4):  # Creates 4 tasks: f000, f001, f002, f003
  template: jgfs_atmos_post_master
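
To make the expansion concrete, here is a minimal sketch of how a name such as `jgfs_atmos_post_f(4)` could be turned into four node names. The function `expand_node_name` is hypothetical and handles only the single-count form shown above; the real EcfNode class may accept other range and list forms that this sketch does not cover.

```python
import re

# Matches "stem(count)", for example "jgfs_atmos_post_f(4)".
_COUNT_FORM = re.compile(r"^(?P<stem>.*)\(\s*(?P<count>\d+)\s*\)$")


def expand_node_name(name, width=3):
    """Expand 'stem(N)' into N zero-padded names; return other names unchanged."""
    match = _COUNT_FORM.match(name)
    if match is None:
        return [name]
    stem = match.group("stem")
    count = int(match.group("count"))
    return [f"{stem}{index:0{width}d}" for index in range(count)]


print(expand_node_name("jgfs_atmos_post_f(4)"))
```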

Script Handling

The framework can copy template scripts for tasks:

  1. It looks for scripts in the scriptrepo directory
  2. For tasks with templates, it copies the template script and renames it
  3. It generates the appropriate folder structure to match the suite hierarchy
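
The three steps above can be sketched with the standard library alone. The helper `install_task_script`, the `.ecf` file extension, and the directory names are assumptions for illustration; the framework's own lookup rules for scriptrepo may differ.

```python
import shutil
import tempfile
from pathlib import Path


def install_task_script(scriptrepo, dest_root, node_path, task, template=None):
    """Copy a task script (or its template) into a folder tree that mirrors the suite."""
    # A task with a template reuses the template's script under the task's own name.
    source = Path(scriptrepo) / f"{template or task}.ecf"
    target_dir = Path(dest_root).joinpath(*node_path)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{task}.ecf"
    shutil.copyfile(source, target)
    return target


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "scriptrepo"
    repo.mkdir()
    (repo / "jgfs_atmos_post_master.ecf").write_text("%include <head.h>\n")

    installed = install_task_script(
        repo,
        Path(tmp) / "ecf",
        ["prod", "gfs", "post"],
        "jgfs_atmos_post_f000",
        template="jgfs_atmos_post_master",
    )
    relative = installed.relative_to(tmp).as_posix()
    copied = installed.read_text()

print(relative)
```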

Running a Workflow

To build and run an ecFlow workflow:

  1. Configure the workflow in a YAML file
  2. Run setup_ecf.py with appropriate arguments
  3. Load the generated definition into ecFlow server: ecflow_client --load <def_file>
  4. Begin execution: ecflow_client --begin <suite_name>
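
Steps 3 and 4 can also be driven from Python. The sketch below only assembles the two `ecflow_client` invocations and prints them unless `dry_run` is turned off. The helper names are hypothetical, and the options are written in the `--load=<file>` and `--begin=<suite>` form used in ecFlow's own documentation.

```python
import shlex
import subprocess


def ecflow_commands(def_file, suite_name):
    """Return the load and begin commands for a generated definition file."""
    return [
        ["ecflow_client", f"--load={def_file}"],
        ["ecflow_client", f"--begin={suite_name}"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # Requires ecflow_client on PATH and a reachable ecFlow server.
            subprocess.run(command, check=True)


commands = ecflow_commands("prod.def", "prod")
run_commands(commands)
```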

This structured approach allows for creating complex, hierarchical workflows with minimal repetition in the configuration.

TerrenceMcGuinness-NOAA: Now compare and contrast how an ecFlow workflow is instantiated in the AQM-Workflow.

GitHub Copilot: # Comparison: ecFlow Workflow Implementation in AQM-Workflow vs Global-Workflow

The AQM-Workflow (Air Quality Model Workflow) and global-workflow projects both implement ecFlow workflows, but they have some key differences in their approaches.

AQM-Workflow Approach

Core Components

  1. YAML/JSON Configuration: AQM-Workflow uses both YAML and JSON configuration files:

    • In parm/config/aqm_config.yaml for general configuration
    • Task-specific configuration in JSON format
  2. Python Framework: The ush/python/build_workflow.py script is the core workflow generator

  3. Jinja2 Templates: AQM-Workflow heavily uses Jinja2 templates to generate ecFlow scripts in ecf/jinja2

Workflow Construction Process

  1. The workflow is built using a class-based Python approach defined in build_workflow.py
  2. Configuration is read from the central YAML file
  3. A hierarchical structure is built using Python objects that correspond to ecFlow components
  4. Task scripts are generated from Jinja2 templates
  5. The complete workflow definition is written to ecFlow format
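
To show what step 4 means in practice, the sketch below renders a task script from a template. It uses the standard library's `string.Template` as a stand-in so the example has no third-party dependency; with Jinja2 the same idea would use `jinja2.Template(...).render(...)` and `{{ name }}` placeholders. The template text, the placeholder names, and the job script name `JAQM_PREP` are invented for illustration and are not taken from AQM-Workflow.

```python
from string import Template

# Hypothetical task-script template; $name marks a value filled in per task.
TASK_TEMPLATE = Template(
    "%include <head.h>\n"
    "export FORECAST_HOUR=$forecast_hour\n"
    "$job_script\n"
    "%include <tail.h>\n"
)


def render_task_script(job_script, forecast_hour):
    """Fill the template for one task and return the script text."""
    return TASK_TEMPLATE.substitute(
        job_script=job_script,
        forecast_hour=f"{forecast_hour:03d}",
    )


script = render_task_script("JAQM_PREP", 6)
print(script)
```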

Key Classes

  • EcflowTaskMgr - The main class that orchestrates workflow creation
  • Suite, Family, Task - Classes that represent ecFlow components
  • Template - Handles Jinja2 template rendering

Key Differences

| Feature | AQM-Workflow | Global-Workflow |
| --- | --- | --- |
| Configuration Format | Uses both YAML and JSON | Primarily YAML-based |
| Templating | Heavy use of Jinja2 templates | Script-based templates |
| Task Generation | Task scripts generated from templates | Template scripts copied and renamed |
| Dependency Definition | Uses Python objects and configuration | Directly specified in YAML |
| Environment Variables | Processed through templates | Added as "edits" in YAML |
| Code Organization | More modular, object-oriented design | Configuration-driven approach |
| Script Generation | Dynamic generation from templates | Static scripts with substitutions |

Similarities

Both approaches:

  1. Use YAML for configuration
  2. Create a hierarchical representation of ecFlow components
  3. Support environment variables and triggers
  4. Generate definitions that can be loaded into ecFlow
  5. Provide flexible task dependencies

Example from AQM-Workflow

The AQM-Workflow creates tasks using code like:

# Creating a family
prep_family = Family(name="prep")

# Adding a task with template
task = Task(name="prep_aqm", template="prep_aqm.ecf.j2")
task.add_trigger("../prep_aqm_static == complete")
task.add_variable("FORECAST_HOUR", "${FCST_HOUR}")
prep_family.add_task(task)

This is more programmatic compared to Global-Workflow's more declarative approach in YAML.
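
The snippet above depends on classes that are not shown. The sketch below supplies minimal stand-ins for `Family` and `Task` so the same calls can run and produce ecFlow definition text. These classes are assumptions written for this example, not the ones in build_workflow.py, and the `to_lines` methods are an invented way to inspect the result.

```python
class Task:
    """Hypothetical stand-in for a task node."""

    def __init__(self, name, template=None):
        self.name = name
        self.template = template
        self.triggers = []
        self.variables = {}

    def add_trigger(self, expression):
        self.triggers.append(expression)

    def add_variable(self, key, value):
        self.variables[key] = value

    def to_lines(self, indent=1):
        pad = "  " * indent
        lines = [f"{pad}task {self.name}"]
        if self.triggers:
            # Combine all expressions into a single trigger line.
            lines.append(f"{pad}  trigger {' and '.join(self.triggers)}")
        for key, value in self.variables.items():
            lines.append(f"{pad}  edit {key} '{value}'")
        return lines


class Family:
    """Hypothetical stand-in for a family node holding tasks."""

    def __init__(self, name):
        self.name = name
        self.tasks = []

    def add_task(self, task):
        self.tasks.append(task)

    def to_lines(self):
        lines = [f"family {self.name}"]
        for task in self.tasks:
            lines += task.to_lines(indent=1)
        lines.append("endfamily")
        return lines


# The same calls as in the example above.
prep_family = Family(name="prep")
task = Task(name="prep_aqm", template="prep_aqm.ecf.j2")
task.add_trigger("../prep_aqm_static == complete")
task.add_variable("FORECAST_HOUR", "${FCST_HOUR}")
prep_family.add_task(task)

family_text = "\n".join(prep_family.to_lines())
print(family_text)
```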

Summary

  • Global-Workflow: More declarative, YAML-driven approach that maps closely to ecFlow structure
  • AQM-Workflow: More programmatic, with object-oriented design and dynamic Jinja2 template rendering

Both approaches are effective but reflect different design philosophies and requirements for their respective domains.

TerrenceMcGuinness-NOAA: Please describe in detail what these different design philosophies are and what their strengths and weaknesses are.

GitHub Copilot: # Design Philosophies of ecFlow Workflow Implementation: A Comparative Analysis

Design Philosophy Comparison

Global-Workflow: Configuration-Driven Approach

The global-workflow project follows a declarative, configuration-first philosophy where the workflow structure is primarily defined in YAML configuration files. This approach prioritizes having a single source of truth in configuration files that closely mirror the resulting ecFlow structure.

AQM-Workflow: Object-Oriented Programming Approach

The AQM-Workflow adopts a more programmatic, object-oriented philosophy where Python classes represent workflow components, and Jinja2 templates generate the actual task scripts. This approach emphasizes flexibility through code and separation of structure (Python) from content (templates).

Detailed Analysis of Strengths and Weaknesses

Global-Workflow Approach

Strengths:

  1. Transparency: The YAML structure directly mirrors the resulting ecFlow hierarchy, making it immediately clear what the workflow will look like.

  2. Simplicity: For users familiar with ecFlow, the configuration is intuitive as it closely resembles ecFlow concepts.

  3. **Reduce

TerrenceMcGuinness-NOAA: Please continue.

GitHub Copilot: # Design Philosophies: AQM-Workflow vs Global-Workflow

Design Philosophy Differences

Global-Workflow: Declarative Configuration Approach

The global-workflow project follows a configuration-first philosophy where the workflow structure is primarily described in declarative YAML files. This approach:

  • Treats the workflow definition as a configuration problem
  • Emphasizes readability of the workflow structure in one place
  • Uses a relatively thin translation layer between configuration and ecFlow objects
  • Keeps logic separate from workflow structure

This design philosophy aligns with infrastructure-as-code practices, where the desired state is declared and a system interprets that declaration.

AQM-Workflow: Object-Oriented Programmatic Approach

The AQM-Workflow follows a more programmatic philosophy where the workflow structure is built through object interactions in Python. This approach:

  • Treats workflow definition as a programming problem
  • Leverages object-oriented principles like inheritance and encapsulation
  • Uses templates for generating variable content
  • Embeds business logic directly in the workflow construction

This design philosophy aligns with application development practices, where complex behavior is modeled through interacting objects.

Strengths and Weaknesses

Global-Workflow Approach

Strengths:

  1. Readability: The entire workflow structure can be understood from the YAML files
  2. Simplicity: Minimal code needed to understand how workflows are defined
  3. Maintainability: Changes to workflow structure require minimal code changes
  4. Separation of Concerns: Clear distinction between workflow structure and implementation logic
  5. Low Learning Curve: YAML is accessible to non-programmers and domain scientists

Weaknesses:

  1. Limited Expressiveness: Complex logic is difficult to express in purely declarative formats
  2. Indirection: Mapping between configuration and actual execution can be harder to trace
  3. Error Handling: Less robust handling of edge cases and validation
  4. Extensibility: Adding new features often requires modifying the parser/interpreter
  5. Verbosity: Complex workflows can lead to very large YAML files

AQM-Workflow Approach

Strengths:

  1. Flexibility: Full power of a programming language for complex logic
  2. Encapsulation: Object-oriented design hides implementation details
  3. Reusability: Classes can be extended and reused across different parts of the workflow
  4. Template-Based: Jinja2 templates provide powerful content generation capabilities
  5. Validation: Better opportunities for runtime validation and error checking

Weaknesses:

  1. Complexity: Higher barrier to entry for understanding the workflow
  2. Maintainability: Changes may require understanding object relationships
  3. Readability: Workflow structure is distributed across code rather than centralized
  4. Debugging: More difficult to diagnose issues in the generated workflow
  5. Dependencies: Relies on more external libraries (e.g., Jinja2)

When to Choose Each Approach

Global-Workflow Style is Better When:

  • Workflow structure changes frequently
  • Non-programmers need to modify workflows
  • Simplicity and readability are paramount
  • Workflows follow predictable patterns
  • Precise control over ecFlow structure is needed

AQM-Workflow Style is Better When:

  • Complex logic determines workflow structure
  • Many similar tasks need template-based generation
  • Significant customization is needed per task
  • Workflow components need to be reused across projects
  • Runtime decisions affect workflow structure

Impact on Development and Maintenance

The choice of philosophy affects how teams work with the workflows:

  • Global-Workflow: Changes are primarily configuration edits, reducing the need for programming expertise but potentially limiting flexibility
  • AQM-Workflow: Changes often involve code modifications, requiring programming skills but offering more power and expressiveness

Both approaches represent valid design decisions based on different priorities and constraints in their respective domains.

TerrenceMcGuinness-NOAA: How does each system use Rocoto as a workflow manager?

GitHub Copilot: # Rocoto Usage in Global-Workflow vs AQM-Workflow

Both Global-Workflow and AQM-Workflow have implementations for Rocoto workflow management, but they use it in different ways. Let's examine how each system incorporates Rocoto.

Global-Workflow Rocoto Implementation

Overview

Global-Workflow includes a comprehensive Rocoto implementation that serves as an alternative to the ecFlow-based workflow system.

Key Components

  1. XML Template Generation:

    • Uses Python to generate Rocoto workflow XML from templates
    • Located in workflow/rocoto directory
    • Primary script: rocoto_workflow_tools.py
  2. Configuration:

    • Uses the same configuration system as ecFlow workflows with YAML files
    • Additional Rocoto-specific settings in config files like config.base
    • Configuration parameters are passed through the Python generation scripts
  3. Workflow Structure:

    • Divides workflows into cycles with multiple components (GFS, GDAS, etc.)
    • Implements the same overall task structure as the ecFlow implementation
  4. Task Scripts:

    • Uses the same underlying task scripts as ecFlow
    • Environment variables are set differently to accommodate Rocoto

Example Rocoto XML Generation

The workflow is typically generated using:

./workflow/rocoto/rocoto_workflow_cron.py

This creates a complete XML document with <workflow>, <log>, <cycledef>, and <task> elements defining the entire operational sequence.
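
For orientation, the sketch below builds a very small Rocoto-style document with `xml.etree.ElementTree`. It is not the project's generator: the function `build_rocoto_xml`, the task names, and the commands are invented. Cycles are declared with Rocoto's `<cycledef>` element. A document that Rocoto would accept needs more per-task scheduler elements (account, queue, resources), which are left out here.

```python
import xml.etree.ElementTree as ET


def build_rocoto_xml(tasks, cycle_spec, log_path, group="gfs"):
    """Build a minimal workflow document and return it as a string."""
    workflow = ET.Element("workflow", realtime="F", scheduler="slurm")

    # <log> wraps a <cyclestr> so the path can contain cycle-time flags.
    log = ET.SubElement(workflow, "log")
    ET.SubElement(log, "cyclestr").text = log_path

    # One cycle definition: "start stop step".
    ET.SubElement(workflow, "cycledef", group=group).text = cycle_spec

    for name, spec in tasks.items():
        task = ET.SubElement(
            workflow, "task", name=name, cycledefs=group, maxtries="2"
        )
        ET.SubElement(task, "command").text = spec["command"]
        ET.SubElement(task, "jobname").text = name
        ET.SubElement(task, "walltime").text = spec.get("walltime", "00:30:00")
        if "after" in spec:
            # Run only after the named task has completed for the same cycle.
            dependency = ET.SubElement(task, "dependency")
            ET.SubElement(dependency, "taskdep", task=spec["after"])

    if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
        ET.indent(workflow)
    return ET.tostring(workflow, encoding="unicode")


xml_text = build_rocoto_xml(
    tasks={
        "prep": {"command": "/path/to/prep.sh"},
        "fcst": {"command": "/path/to/fcst.sh", "after": "prep"},
    },
    cycle_spec="202401010000 202401020000 06:00:00",
    log_path="/path/to/logs/@Y@m@d@H.log",
)
print(xml_text)
```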

AQM-Workflow Rocoto Implementation

Overview

AQM-Workflow offers a more limited Rocoto implementation focused specifically on the air quality modeling workflow needs.

Key Components

  1. XML Generation:

    • Uses ush/rocoto/create_aqm_rocoto_xml.py to generate workflow XML
    • XML templates are much simpler compared to Global-Workflow
  2. Configuration:

    • Uses the same configuration YAML files as the ecFlow implementation
    • Converts specific configuration parameters to Rocoto format
  3. Workflow Structure:

    • More targeted to AQM-specific cycles and tasks
    • Flatter task structure with fewer nested dependencies
  4. Task Scripts:

    • Leverages common driver scripts with different environment settings
    • Similar approach to script generation as its ecFlow implementation
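
One example of converting a configuration parameter to Rocoto format is the cycle definition, which Rocoto expects as a `start stop step` string. The sketch below assumes the configuration supplies the first cycle, the last cycle, and an interval in hours; the function name and its parameters are hypothetical and are not taken from create_aqm_rocoto_xml.py.

```python
from datetime import datetime


def to_cycledef(first_cycle, last_cycle, interval_hours):
    """Format cycle settings as the text of a Rocoto <cycledef> element."""
    stamp = "%Y%m%d%H%M"
    return (
        f"{first_cycle.strftime(stamp)} "
        f"{last_cycle.strftime(stamp)} "
        f"{interval_hours:02d}:00:00"
    )


cycledef = to_cycledef(datetime(2024, 1, 1, 6), datetime(2024, 1, 2, 6), 24)
print(cycledef)
```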

Key Differences

| Aspect | Global-Workflow | AQM-Workflow |
| --- | --- | --- |
| Completeness | Full-featured Rocoto implementation | More limited implementation |
| XML Generation | Complex template system | Simpler Python-based generation |
| Integration Level | Deep integration with configuration system | Lighter integration |
| Task Definition | Detailed metatasks and dependencies | Simpler task relationships |
| Maintenance | Maintained as primary alternative to ecFlow | Secondary to ecFlow implementation |
| Development Focus | Extensive development resources | More focused implementation |

Strengths and Weaknesses

Global-Workflow Rocoto Implementation

Strengths:

  1. Comprehensive implementation covering all workflow components
  2. Well-maintained with regular updates
  3. Robust dependency tracking
  4. Extensive documentation and usage examples
  5. Used in production environments

Weaknesses:

  1. More complex configuration process
  2. Higher learning curve
  3. More difficult to modify for specific use cases

AQM-Workflow Rocoto Implementation

Strengths:

  1. Simpler and more focused implementation
  2. Easier to understand for domain-specific users
  3. More flexible for AQM-specific modifications
  4. Lighter weight system overall

Weaknesses:

  1. Less comprehensive feature set
  2. Not as extensively tested in varied environments
  3. Fewer advanced Rocoto features utilized

Use Case Considerations

  • Global-Workflow's Rocoto implementation is better suited for comprehensive operational modeling systems requiring robust cycle definitions and complex dependencies.

  • AQM-Workflow's Rocoto implementation is more appropriate for focused air quality modeling applications with simpler dependency structures.

Both systems demonstrate the flexibility of Rocoto as a workflow manager, but with different design philosophies reflecting their specific operational requirements.
