AQM‐Workflow and Global Workflow (Community ‐ Operations version) for Generating EcFlow - TerrenceMcGuinness-NOAA/global-workflow GitHub Wiki
Workspace: # Global-Workflow ecFlow Implementation
The global-workflow project constructs ecFlow workflows using a combination of Python scripts and YAML configuration files. The implementation provides a flexible way to define complex workflow hierarchies with tasks, triggers, events, and other ecFlow components.
- YAML Configuration: The workflow is primarily defined in YAML files (like `ecflow_build.yml`) that specify:
  - Suites
  - Families
  - Tasks
  - Triggers
  - Events
  - Repeats
  - Edits (environment variables)
- Python Framework: Several key Python modules handle the configuration parsing and ecFlow definition generation:
  - `ecflow_setup.py` - Loads and processes the YAML configuration
  - `ecflow_definitions.py` - Defines classes that extend ecFlow objects
  - `setup_ecf.py` - Top-level script that parses arguments and initiates the workflow creation
The process follows these general steps:
- Parse command-line arguments to get paths to configuration files
- Load environment variables from config files
- Parse the YAML configuration file
- Create ecFlow suites, families, and tasks according to the YAML structure
- Add edits (environment variables) to nodes
- Add triggers, events, and other attributes to nodes
- Generate ecFlow definition files and script files
- Save the definition file
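The steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the code in `ecflow_setup.py`: the dictionary stands in for a parsed YAML file, the key names (`suites`, `families`, `tasks`, `edits`, `triggers`, `events`) and the node names are assumptions, and the output is written as plain ecFlow definition text rather than through the ecFlow Python API.

```python
def render_node(name, spec, kind, indent=0):
    """Return ecFlow definition-file lines for one node and its children."""
    spec = spec or {}
    pad = "  " * indent
    inner = pad + "  "
    lines = [f"{pad}{kind} {name}"]
    # Edits become ecFlow variables on the node.
    for var, value in spec.get("edits", {}).items():
        lines.append(f"{inner}edit {var} '{value}'")
    if "triggers" in spec:
        lines.append(f"{inner}trigger {spec['triggers']}")
    for event in spec.get("events", []):
        lines.append(f"{inner}event {event}")
    # Recurse into child families and tasks (hypothetical key names).
    for fam, fam_spec in spec.get("families", {}).items():
        lines.extend(render_node(fam, fam_spec, "family", indent + 1))
    for task, task_spec in spec.get("tasks", {}).items():
        lines.extend(render_node(task, task_spec, "task", indent + 1))
    if kind != "task":  # tasks have no closing keyword
        lines.append(f"{pad}end{kind}")
    return lines


def render_definition(config):
    """Render every suite in the configuration to definition text."""
    lines = []
    for suite, spec in config["suites"].items():
        lines.extend(render_node(suite, spec, "suite"))
    return "\n".join(lines) + "\n"


# Stand-in for the result of loading a YAML configuration file.
config = {
    "suites": {
        "prod00": {
            "edits": {"CYC": "00"},
            "families": {
                "gfs": {
                    "tasks": {
                        "jgfs_forecast": {},
                        "jgfs_atmos_post_manager": {
                            "triggers": "jgfs_forecast == complete",
                            "events": ["release_post"],
                        },
                    }
                }
            },
        }
    }
}

definition_text = render_definition(config)
print(definition_text)
```

Keeping the renderer recursive means the nesting of the configuration and the nesting of the generated suite stay in step, which is the property the configuration-driven approach relies on.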
The key classes are:
- `Ecflowsetup` - Main class that orchestrates the workflow generation
- `Ecflowsuite` - Manages suites and handles node organization
- `EcfNode` - Base class for handling ranges and lists in node names
- `EcfTask`, `EcfFamily` - Classes that extend ecFlow tasks and families
A powerful feature is the ability to use ranges and lists to create multiple similar nodes:

    jgfs_atmos_post_f(4):  # Creates 4 tasks: f000, f001, f002, f003
      template: jgfs_atmos_post_master

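As a rough sketch of how such a name could be expanded, the function below handles only the single-count form shown above. The function name, the regular expression, and the three-digit padding are assumptions made for illustration; other forms of the range and list syntax are not covered, and this is not the logic in `EcfNode`.

```python
import re

# Matches "prefix(count)", for example "jgfs_atmos_post_f(4)".
_COUNT_SUFFIX = re.compile(r"^(?P<prefix>.+)\((?P<count>\d+)\)$")


def expand_node_name(name, width=3):
    """Expand 'prefix(N)' into N zero-padded names; return other names unchanged."""
    match = _COUNT_SUFFIX.match(name)
    if match is None:
        return [name]
    prefix = match.group("prefix")
    count = int(match.group("count"))
    return [f"{prefix}{index:0{width}d}" for index in range(count)]


names = expand_node_name("jgfs_atmos_post_f(4)")
print(names)
```

A name without a parenthesised count passes through as a one-element list, so the caller can treat every node the same way.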
The framework can copy template scripts for tasks:
- It looks for scripts in the `scriptrepo` directory
- For tasks with templates, it copies the template script and renames it
- It generates the appropriate folder structure to match the suite hierarchy
To build and run an ecFlow workflow:
- Configure the workflow in a YAML file
- Run `setup_ecf.py` with appropriate arguments
- Load the generated definition into the ecFlow server: `ecflow_client --load <def_file>`
- Begin execution: `ecflow_client --begin <suite_name>`
This structured approach allows for creating complex, hierarchical workflows with minimal repetition in the configuration.
TerrenceMcGuinness-NOAA: Now compare and contrast how an ecFlow workflow is instantiated in the AQM-Workflow.
GitHub Copilot: # Comparison: ecFlow Workflow Implementation in AQM-Workflow vs Global-Workflow
The AQM-Workflow (Air Quality Model Workflow) and global-workflow projects both implement ecFlow workflows, but they have some key differences in their approaches.
- YAML/JSON Configuration: AQM-Workflow uses both YAML and JSON configuration files:
  - `parm/config/aqm_config.yaml` for general configuration
  - Task-specific configuration in JSON format
- Python Framework: The `ush/python/build_workflow.py` script is the core workflow generator
- Jinja2 Templates: AQM-Workflow heavily uses Jinja2 templates to generate ecFlow scripts in `ecf/jinja2`
- The workflow is built using a class-based Python approach defined in `build_workflow.py`
- Configuration is read from the central YAML file
- A hierarchical structure is built using Python objects that correspond to ecFlow components
- Task scripts are generated from Jinja2 templates
- The complete workflow definition is written to ecFlow format
- `EcflowTaskMgr` - The main class that orchestrates workflow creation
- `Suite`, `Family`, `Task` - Classes that represent ecFlow components
- `Template` - Handles Jinja2 template rendering
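To give a feel for template-driven script generation, here is a small sketch. It deliberately uses `string.Template` from the Python standard library as a stand-in for Jinja2 so that it runs without extra packages; in Jinja2 the placeholders would be written `{{ task_name }}` rather than `$task_name`. The template text, the function name, and the variable names are assumptions and are not taken from `ecf/jinja2`.

```python
from string import Template

# A toy ecFlow script template. The %include lines follow the usual
# ecFlow head/tail convention; the body is purely illustrative.
ECF_TEMPLATE = Template(
    "%include <head.h>\n"
    'echo "running $task_name for forecast hour $forecast_hour"\n'
    "%include <tail.h>\n"
)


def render_task_script(task_name, forecast_hour):
    """Fill the template with per-task values and return the script text."""
    return ECF_TEMPLATE.substitute(
        task_name=task_name,
        forecast_hour=f"{forecast_hour:03d}",
    )


script = render_task_script("prep_aqm", 6)
print(script)
```

`substitute` raises `KeyError` when a placeholder has no value, which gives a simple form of the validation that a template-based generator can offer.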
Feature | AQM-Workflow | Global-Workflow |
---|---|---|
Configuration Format | Uses both YAML and JSON | Primarily YAML-based |
Templating | Heavy use of Jinja2 templates | Script-based templates |
Task Generation | Task scripts generated from templates | Template scripts copied and renamed |
Dependency Definition | Uses Python objects and configuration | Directly specified in YAML |
Environment Variables | Processed through templates | Added as "edits" in YAML |
Code Organization | More modular, object-oriented design | Configuration-driven approach |
Script Generation | Dynamic generation from templates | Static scripts with substitutions |
Both approaches:
- Use YAML for configuration
- Create a hierarchical representation of ecFlow components
- Support environment variables and triggers
- Generate definitions that can be loaded into ecFlow
- Provide flexible task dependencies
The AQM-Workflow creates tasks using code like:

    # Creating a family
    prep_family = Family(name="prep")
    # Adding a task with template
    task = Task(name="prep_aqm", template="prep_aqm.ecf.j2")
    task.add_trigger("../prep_aqm_static == complete")
    task.add_variable("FORECAST_HOUR", "${FCST_HOUR}")
    prep_family.add_task(task)

This is more programmatic compared to Global-Workflow's more declarative approach in YAML.
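To make the programmatic style concrete, the sketch below defines minimal stand-ins for `Family` and `Task` so that the snippet above can run on its own. These classes are hypothetical: they only mirror the method names used in the snippet, and the `to_definition` helper is an added assumption that prints ecFlow definition text. They are not the classes in `build_workflow.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task object."""
    name: str
    template: str = ""
    triggers: list = field(default_factory=list)
    variables: dict = field(default_factory=dict)

    def add_trigger(self, expression):
        self.triggers.append(expression)

    def add_variable(self, key, value):
        self.variables[key] = value


@dataclass
class Family:
    """Hypothetical stand-in for a family that owns tasks."""
    name: str
    tasks: list = field(default_factory=list)

    def add_task(self, task):
        self.tasks.append(task)

    def to_definition(self):
        """Render the family and its tasks as ecFlow definition text."""
        lines = [f"family {self.name}"]
        for task in self.tasks:
            lines.append(f"  task {task.name}")
            for key, value in task.variables.items():
                lines.append(f"    edit {key} '{value}'")
            if task.triggers:
                lines.append("    trigger " + " and ".join(task.triggers))
        lines.append("endfamily")
        return "\n".join(lines)


prep_family = Family(name="prep")
task = Task(name="prep_aqm", template="prep_aqm.ecf.j2")
task.add_trigger("../prep_aqm_static == complete")
task.add_variable("FORECAST_HOUR", "${FCST_HOUR}")
prep_family.add_task(task)

family_text = prep_family.to_definition()
print(family_text)
```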
- Global-Workflow: More declarative, YAML-driven approach that maps closely to ecFlow structure
- AQM-Workflow: More programmatic, with object-oriented design and dynamic Jinja2 template rendering
Both approaches are effective but reflect different design philosophies and requirements for their respective domains.
TerrenceMcGuinness-NOAA: Please describe in detail what these different design philosophies are and what their strengths and weaknesses are.
GitHub Copilot: # Design Philosophies of ecFlow Workflow Implementation: A Comparative Analysis
The global-workflow project follows a declarative, configuration-first philosophy where the workflow structure is primarily defined in YAML configuration files. This approach prioritizes having a single source of truth in configuration files that closely mirror the resulting ecFlow structure.
The AQM-Workflow adopts a more programmatic, object-oriented philosophy where Python classes represent workflow components, and Jinja2 templates generate the actual task scripts. This approach emphasizes flexibility through code and separation of structure (Python) from content (templates).
Strengths of the Global-Workflow approach:
- Transparency: The YAML structure directly mirrors the resulting ecFlow hierarchy, making it immediately clear what the workflow will look like.
- Simplicity: For users familiar with ecFlow, the configuration is intuitive as it closely resembles ecFlow concepts.
- **Reduce
TerrenceMcGuinness-NOAA: Please continue.
GitHub Copilot: # Design Philosophies: AQM-Workflow vs Global-Workflow
The global-workflow project follows a configuration-first philosophy where the workflow structure is primarily described in declarative YAML files. This approach:
- Treats the workflow definition as a configuration problem
- Emphasizes readability of the workflow structure in one place
- Uses a relatively thin translation layer between configuration and ecFlow objects
- Keeps logic separate from workflow structure
This design philosophy aligns with infrastructure-as-code practices, where the desired state is declared and a system interprets that declaration.
The AQM-Workflow follows a more programmatic philosophy where the workflow structure is built through object interactions in Python. This approach:
- Treats workflow definition as a programming problem
- Leverages object-oriented principles like inheritance and encapsulation
- Uses templates for generating variable content
- Embeds business logic directly in the workflow construction
This design philosophy aligns with application development practices, where complex behavior is modeled through interacting objects.
Strengths of the Global-Workflow (declarative) approach:
- Readability: The entire workflow structure can be understood from the YAML files
- Simplicity: Minimal code needed to understand how workflows are defined
- Maintainability: Changes to workflow structure require minimal code changes
- Separation of Concerns: Clear distinction between workflow structure and implementation logic
- Low Learning Curve: YAML is accessible to non-programmers and domain scientists
Weaknesses of the Global-Workflow (declarative) approach:
- Limited Expressiveness: Complex logic is difficult to express in purely declarative formats
- Indirection: Mapping between configuration and actual execution can be harder to trace
- Error Handling: Less robust handling of edge cases and validation
- Extensibility: Adding new features often requires modifying the parser/interpreter
- Verbosity: Complex workflows can lead to very large YAML files
Strengths of the AQM-Workflow (programmatic) approach:
- Flexibility: Full power of a programming language for complex logic
- Encapsulation: Object-oriented design hides implementation details
- Reusability: Classes can be extended and reused across different parts of the workflow
- Template-Based: Jinja2 templates provide powerful content generation capabilities
- Validation: Better opportunities for runtime validation and error checking
Weaknesses of the AQM-Workflow (programmatic) approach:
- Complexity: Higher barrier to entry for understanding the workflow
- Maintainability: Changes may require understanding object relationships
- Readability: Workflow structure is distributed across code rather than centralized
- Debugging: More difficult to diagnose issues in the generated workflow
- Dependencies: Relies on more external libraries (e.g., Jinja2)
Global-Workflow Style is Better When:
- Workflow structure changes frequently
- Non-programmers need to modify workflows
- Simplicity and readability are paramount
- Workflows follow predictable patterns
- Precise control over ecFlow structure is needed
AQM-Workflow Style is Better When:
- Complex logic determines workflow structure
- Many similar tasks need template-based generation
- Significant customization is needed per task
- Workflow components need to be reused across projects
- Runtime decisions affect workflow structure
The choice of philosophy affects how teams work with the workflows:
- Global-Workflow: Changes are primarily configuration edits, reducing the need for programming expertise but potentially limiting flexibility
- AQM-Workflow: Changes often involve code modifications, requiring programming skills but offering more power and expressiveness
Both approaches represent valid design decisions based on different priorities and constraints in their respective domains.
TerrenceMcGuinness-NOAA: How does each system use Rocoto as a workflow manager?
GitHub Copilot: # Rocoto Usage in Global-Workflow vs AQM-Workflow
Both Global-Workflow and AQM-Workflow have implementations for Rocoto workflow management, but they use it in different ways. Let's examine how each system incorporates Rocoto.
Global-Workflow includes a comprehensive Rocoto implementation that serves as an alternative to the ecFlow-based workflow system.
- XML Template Generation:
  - Uses Python to generate Rocoto workflow XML from templates
  - Located in the `workflow/rocoto` directory
  - Primary script: `rocoto_workflow_tools.py`
- Configuration:
  - Uses the same configuration system as ecFlow workflows with YAML files
  - Additional Rocoto-specific settings in config files like `config.base`
  - Configuration parameters are passed through the Python generation scripts
- Workflow Structure:
  - Divides workflows into cycles with multiple components (GFS, GDAS, etc.)
  - Implements the same overall task structure as the ecFlow implementation
- Task Scripts:
  - Uses the same underlying task scripts as ecFlow
  - Environment variables are set differently to accommodate Rocoto
The workflow is typically generated using:

    ./workflow/rocoto/rocoto_workflow_cron.py

This creates a complete XML document with `<workflow>`, `<log>`, `<cycledef>`, and `<task>` elements defining the entire operational sequence.
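For orientation, a Rocoto document of this general shape can be assembled with the standard library alone. The sketch below is an assumption-laden illustration, not the generator in `workflow/rocoto`: the function name, task names, commands, and paths are hypothetical, and a workflow that Rocoto would actually accept needs further per-task elements (scheduler resources, job names, and so on) that are omitted here. Cycle definitions are written as `<cycledef>` elements, and task dependencies as `<taskdep>` inside `<dependency>`.

```python
import xml.etree.ElementTree as ET


def build_rocoto_workflow(log_path, cycledef, tasks):
    """Build a skeletal Rocoto workflow tree.

    tasks is a list of (name, command, depends_on) tuples, where
    depends_on is the name of an upstream task or None.
    """
    workflow = ET.Element("workflow", realtime="F", scheduler="slurm")
    ET.SubElement(workflow, "log").text = log_path

    # One cycle definition group: "start end step".
    cycle = ET.SubElement(workflow, "cycledef", group="gfs")
    cycle.text = cycledef

    for name, command, depends_on in tasks:
        task = ET.SubElement(
            workflow, "task", name=name, cycledefs="gfs", maxtries="2"
        )
        ET.SubElement(task, "command").text = command
        if depends_on:
            dependency = ET.SubElement(task, "dependency")
            ET.SubElement(dependency, "taskdep", task=depends_on)
    return workflow


workflow = build_rocoto_workflow(
    log_path="logs/workflow.log",
    cycledef="202401010000 202401020000 06:00:00",
    tasks=[
        ("prep", "jobs/prep.sh", None),
        ("fcst", "jobs/fcst.sh", "prep"),
    ],
)
xml_text = ET.tostring(workflow, encoding="unicode")
print(xml_text)
```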
AQM-Workflow offers a more limited Rocoto implementation focused specifically on the air quality modeling workflow needs.
- XML Generation:
  - Uses `ush/rocoto/create_aqm_rocoto_xml.py` to generate workflow XML
  - XML templates are much simpler compared to Global-Workflow
- Configuration:
  - Uses the same configuration YAML files as the ecFlow implementation
  - Converts specific configuration parameters to Rocoto format
- Workflow Structure:
  - More targeted to AQM-specific cycles and tasks
  - Flatter task structure with fewer nested dependencies
- Task Scripts:
  - Leverages common driver scripts with different environment settings
  - Similar approach to script generation as its ecFlow implementation
Aspect | Global-Workflow | AQM-Workflow |
---|---|---|
Completeness | Full-featured Rocoto implementation | More limited implementation |
XML Generation | Complex template system | Simpler Python-based generation |
Integration Level | Deep integration with configuration system | Lighter integration |
Task Definition | Detailed metatasks and dependencies | Simpler task relationships |
Maintenance | Maintained as primary alternative to ecFlow | Secondary to ecFlow implementation |
Development Focus | Extensive development resources | More focused implementation |
Strengths of Global-Workflow's Rocoto implementation:
- Comprehensive implementation covering all workflow components
- Well-maintained with regular updates
- Robust dependency tracking
- Extensive documentation and usage examples
- Used in production environments
Weaknesses of Global-Workflow's Rocoto implementation:
- More complex configuration process
- Higher learning curve
- More difficult to modify for specific use cases
Strengths of AQM-Workflow's Rocoto implementation:
- Simpler and more focused implementation
- Easier to understand for domain-specific users
- More flexible for AQM-specific modifications
- Lighter weight system overall
Weaknesses of AQM-Workflow's Rocoto implementation:
- Less comprehensive feature set
- Not as extensively tested in varied environments
- Fewer advanced Rocoto features utilized
- Global-Workflow's Rocoto implementation is better suited for comprehensive operational modeling systems requiring robust cycle definitions and complex dependencies.
- AQM-Workflow's Rocoto implementation is more appropriate for focused air quality modeling applications with simpler dependency structures.
Both systems demonstrate the flexibility of Rocoto as a workflow manager, but with different design philosophies reflecting their specific operational requirements.