MCP Tool Architecture: Enhancing AI-Assisted Development

Version: 3.0.0
Last Updated: November 4, 2025
Architecture: Week 2 Consolidated Design

Executive Summary

The Global Workflow MCP (Model Context Protocol) server provides 21 specialized tools organized into 5 functional categories, creating a comprehensive knowledge retrieval and code analysis platform. This architecture transforms GitHub Copilot from a code completion tool into a fully functional, integrated agentic software development platform with deep domain expertise in the NOAA Global Workflow system.


Tool Inventory by Category

1. WorkflowInfoTools (3 tools)

Static, filesystem-based tools providing instant access to workflow structure:

  • get_workflow_structure - System architecture and component overview
  • get_system_configs - HPC platform-specific configurations
  • describe_component - Quick filesystem-based description of a component

2. CodeAnalysisTools (4 tools)

Graph-based code relationship and dependency analysis:

  • analyze_code_structure - File/function/class structural analysis
  • find_dependencies - Dependency mapping (upstream/downstream/both)
  • trace_execution_path - Call chain traversal from starting function
  • find_callers_callees - Function relationship analysis

3. SemanticSearchTools (7 tools)

RAG-enhanced semantic search and compliance verification:

  • search_documentation - Hybrid semantic + graph search across all docs
  • search_ee2_standards - EE2 compliance standards search
  • explain_with_context - Contextual explanations using RAG knowledge base
  • analyze_ee2_compliance - EE2 compliance analysis for code/docs
  • generate_compliance_report - Comprehensive compliance reports
  • get_knowledge_base_status - System health check (vector + graph DB)
  • find_related_files - Vector-based file similarity and relationship discovery

4. OperationalTools (3 tools)

Deep operational context and HPC-specific guidance:

  • list_job_scripts - Complete inventory of workflow job scripts
  • explain_workflow_component - Deep component analysis with graph enrichment
  • get_operational_guidance - HPC operational procedures and best practices

5. GitHubTools (4 tools)

Cross-repository integration and project tracking:

  • analyze_workflow_dependencies - Graph-based workflow dependency analysis
  • search_issues - GitHub issues search for troubleshooting
  • get_pull_requests - Pull request information and changes
  • analyze_repository_structure - Multi-repository structure analysis

Total: 21 Tools


Configuration Modes

The MCP server supports three deployment modes, each tailored to specific use cases:

Mode   Tools                         Use Case
full   21 tools (all)                Complete development environment with full RAG and GitHub integration
core   7 tools (Workflow + Code)     Minimal footprint, no external dependencies, fast startup
rag    17 tools (excludes GitHub)    RAG-enhanced development without GitHub API requirements
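
As a sketch, the mode gating can be expressed as a simple category map. The mode and category names below come from the table above; the gating logic itself is illustrative, not the server's actual implementation.

```typescript
// Maps each deployment mode to the tool categories it enables.
type ToolCategory =
  | "WorkflowInfoTools"    // 3 tools
  | "CodeAnalysisTools"    // 4 tools
  | "SemanticSearchTools"  // 7 tools
  | "OperationalTools"     // 3 tools
  | "GitHubTools";         // 4 tools

const MODE_CATEGORIES: Record<string, ToolCategory[]> = {
  full: ["WorkflowInfoTools", "CodeAnalysisTools", "SemanticSearchTools",
         "OperationalTools", "GitHubTools"],                              // all 21 tools
  core: ["WorkflowInfoTools", "CodeAnalysisTools"],                       // 3 + 4 = 7 tools
  rag:  ["WorkflowInfoTools", "CodeAnalysisTools",
         "SemanticSearchTools", "OperationalTools"],                      // 17 tools
};

// Unknown modes fall back to the full tool set (an assumption for this sketch).
function categoriesForMode(mode: string): ToolCategory[] {
  return MODE_CATEGORIES[mode] ?? MODE_CATEGORIES.full;
}
```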

Category 1: WorkflowInfoTools - Foundation Layer

Purpose and Motivation

The WorkflowInfoTools module provides the foundation layer of the MCP architecture - fast, reliable, zero-dependency access to static workflow information. These tools require no database connections, no vector embeddings, and no external APIs. They are always available, always fast (<10ms response time), and provide the baseline context that higher-level tools build upon.

The Value Proposition

In traditional AI-assisted development, LLMs lack structured knowledge about complex systems like the Global Workflow. They can generate code, but they don't understand the operational context, platform constraints, or system architecture. WorkflowInfoTools solves this by giving the LLM instant structural awareness.

When Copilot asks "What's the structure of this system?", it gets a complete architectural overview in milliseconds. When it asks "What modules are available on Hera?", it receives accurate, up-to-date HPC configuration data. This transforms the LLM from a generic code generator into a domain-aware development partner.

Tool Deep Dive

get_workflow_structure

What it does: Returns a comprehensive JSON representation of the Global Workflow system architecture, including:

  • Directory structure (jobs/, scripts/, parm/, ush/, sorc/)
  • Component relationships (UFS Weather Model, GSI, GFS Utils)
  • Workflow orchestration patterns (Rocoto XML, wxflow Python)
  • Configuration hierarchy (base configs → platform configs → experiment configs)

Why it matters: This tool gives the LLM a mental map of the entire system. When asked to add a new job script, Copilot knows exactly where it should go. When debugging a configuration issue, it understands the config hierarchy. When explaining the workflow to a new developer, it can provide accurate architectural context.

Agentic Enhancement: The LLM can now autonomously navigate complex systems by understanding "where things go" and "how things connect." This is the difference between a tool that generates code and an agent that understands systems.
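
For concreteness, here is how a client might invoke this tool over stdio. The snippet uses the public MCP TypeScript SDK (@modelcontextprotocol/sdk); the server launch command, the --mode flag, and the response shape sketched in the comment are assumptions, not the server's documented contract.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Assumed launch command; substitute the server's actual entry point.
const transport = new StdioClientTransport({
  command: "node",
  args: ["dist/server.js", "--mode", "core"],
});
const client = new Client({ name: "gw-example-client", version: "0.1.0" });
await client.connect(transport);

// A static structure query should need no arguments (assumption).
const result = await client.callTool({
  name: "get_workflow_structure",
  arguments: {},
});

// Expected shape (illustrative): { directories: {...}, components: [...],
//   orchestration: {...}, configuration_hierarchy: [...] }
console.log(JSON.stringify(result, null, 2));
await client.close();
```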

get_system_configs

What it does: Provides platform-specific configurations for NOAA HPC systems:

  • Module paths and versions (Hera, Orion, WCOSS2, Jet, Gaea, S4)
  • Resource limits (compute nodes, memory, walltime)
  • Filesystem paths (/scratch, /work, /archive)
  • Scheduler configurations (Slurm, PBS)

Why it matters: HPC development is platform-specific. Code that works on Hera may fail on WCOSS2 due to different module versions or filesystem layouts. This tool ensures the LLM generates platform-appropriate code, scripts, and configurations.

Agentic Enhancement: When a developer says "create a batch script for Orion," the LLM automatically uses the correct Slurm parameters, module paths, and filesystem locations. It's not just generating boilerplate - it's generating deployable, platform-correct code.
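
A minimal sketch of that call, assuming a platform parameter and the response fields named in the comments; none of these names are the tool's documented schema.

```typescript
// Stub for illustration; a real call would go through an MCP client
// as in the get_workflow_structure example above.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

// Fetch Orion's configuration before generating a batch script.
const orion = await callTool("get_system_configs", { platform: "orion" });
// Illustrative response fields: scheduler ("slurm"), module paths and versions,
// filesystem roots (/work, /scratch), and resource limits: the values needed
// to fill in platform-correct #SBATCH directives.
```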

describe_component

What it does: Quick filesystem-based inspection of any workflow component:

  • File existence and location
  • Basic metadata (size, modification time, permissions)
  • Quick content preview (first/last N lines)
  • Directory listings for component folders

Why it matters: This is the "sanity check" tool. Before running expensive semantic searches or graph queries, the LLM can quickly verify "does this file exist?" or "what's in this directory?" It's the equivalent of a human developer running ls and head - basic reconnaissance before deeper investigation.

Agentic Enhancement: Enables exploratory behavior. The LLM can navigate the filesystem, discover components, and build context incrementally. This is crucial for autonomous problem-solving where the agent must find information before it can use it.


Category 2: CodeAnalysisTools - Relationship Intelligence

Purpose and Motivation

Modern weather forecasting systems like Global Workflow consist of thousands of interconnected files, functions, and modules. Understanding these relationships is crucial for:

  • Impact Analysis: "If I change this function, what breaks?"
  • Dependency Management: "What does this module depend on?"
  • Code Navigation: "How do I get from the job script to the actual computation?"
  • Refactoring Safety: "Can I safely remove this code?"

Traditional LLMs struggle with this because they work on individual files within limited context windows. They can't see the graph structure of the codebase. CodeAnalysisTools solves this by leveraging a Neo4j graph database to maintain a living map of code relationships.

The Value Proposition

These tools transform the LLM from a "file editor" into a "system architect." Instead of making isolated changes, Copilot can now:

  • Trace execution flows across multiple files
  • Identify all dependencies before suggesting a refactor
  • Find similar patterns in the codebase
  • Understand the ripple effects of changes

This is architectural intelligence - the ability to reason about systems, not just syntax.

Tool Deep Dive

analyze_code_structure

What it does: Deep structural analysis of Python, Bash, and other code files:

  • Function and class definitions
  • Import/dependency declarations
  • Call graphs (what calls what)
  • Complexity metrics (lines of code, cyclomatic complexity)
  • Documentation coverage

Why it matters: When an LLM needs to modify a file, it must understand its internal structure. This tool provides that understanding. It answers questions like:

  • "What functions are in this module?"
  • "What does this script import?"
  • "How complex is this code?"
  • "Is this function documented?"

Agentic Enhancement: Enables informed code modifications. The LLM knows which functions to preserve, which imports to maintain, and which patterns to follow. It can suggest refactorings that respect the existing structure rather than breaking it.

Real-world example: Developer asks "refactor exglobal_forecast.py to use async/await." The tool first analyzes the file structure, identifies synchronous patterns, and maps dependencies; the LLM then generates a refactoring plan that maintains all existing interfaces.
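
A sketch of the first step of that flow, with assumed argument and result names (the file path follows the example above):

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

// Analyze the file before planning the async/await refactor.
const report = await callTool("analyze_code_structure", {
  path: "scripts/exglobal_forecast.py", // the file from the example above
  include_metrics: true,                // assumed flag: request complexity metrics
});
// Illustrative result: { functions: [...], classes: [...], imports: [...],
//   call_graph: [...], metrics: { loc, cyclomatic_complexity }, doc_coverage }
```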

find_dependencies

What it does: Multi-directional dependency analysis:

  • Upstream: What does this file/module import or depend on?
  • Downstream: What depends on this file/module?
  • Both: Complete dependency graph in both directions
  • Configurable depth: Shallow (1 level) or deep (N levels) traversal

Why it matters: Dependencies are the skeleton of software systems. Before making any change, you must know what depends on what. This tool prevents breaking changes by making dependencies explicit and traceable.

Agentic Enhancement: The LLM can now perform safe refactoring. Before suggesting a change, it checks dependencies. Before removing a function, it verifies nothing calls it. Before updating an interface, it identifies all callers that need updates.

Real-world example: Developer asks "can we remove this old utility function?" The tool checks downstream dependencies, finds 12 callers across 5 modules, and reports "This function is still in use - here are all the places that call it."
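
A sketch of that safety check, assuming target/direction/depth parameter names:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

// "Can we remove this old utility?" Check downstream users before deleting.
const deps = await callTool("find_dependencies", {
  target: "ush/legacy_util.sh", // hypothetical file, for illustration
  direction: "downstream",      // upstream | downstream | both, per the list above
  depth: 3,                     // assumed: number of traversal levels
});
// A non-empty downstream list means the target is still in use: report the
// callers instead of suggesting removal.
```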

trace_execution_path

What it does: Follow the execution flow from a starting function:

  • Build call chain: function A → calls → function B → calls → function C
  • Configurable depth: trace 1 level or trace N levels deep
  • Cross-file tracing: follows calls across module boundaries
  • Optional reverse tracing: find all paths that lead TO a function

Why it matters: Understanding how code executes is fundamental to debugging, optimization, and feature development. This tool lets the LLM answer questions like:

  • "What happens when I run this job?"
  • "Where does this data come from?"
  • "Why is this function being called?"
  • "What's the full execution path from entry point to this bug?"

Agentic Enhancement: Enables execution reasoning. The LLM can trace bugs to their source, identify performance bottlenecks in the call chain, and explain system behavior by following code paths.

Real-world example: Developer reports "the forecast job is slow." The tool traces the execution path, identifies a nested loop in the middle of the call chain, and pinpoints the bottleneck function for optimization.
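
A sketch of the call for that scenario, with assumed parameter names:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const trace = await callTool("trace_execution_path", {
  start_function: "run_forecast", // hypothetical entry point
  depth: 5,                       // assumed: how many call levels to follow
  reverse: false,                 // assumed flag: true finds paths INTO the function
});
// Illustrative result: an ordered, cross-file chain such as
//   run_forecast -> stage_inputs -> interpolate_grid -> write_output
// which makes a hot inner function easy to spot.
```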

find_callers_callees

What it does: Bidirectional function relationship analysis:

  • Callers: Find all functions that call this function
  • Callees: Find all functions that this function calls
  • Include source: Optionally include code snippets showing the call sites

Why it matters: This is the micro-level complement to trace_execution_path (which provides macro-level flows). When you need to understand the immediate neighborhood of a function - who calls it and what it calls - this tool provides instant answers.

Agentic Enhancement: Enables local reasoning about global impact. The LLM can answer questions like:

  • "If I change this function's signature, what breaks?"
  • "What functions should I look at to understand how this works?"
  • "Is this function a leaf (no callees) or a hub (many callers)?"

Real-world example: Developer asks "how is the GSI analysis called?" The tool finds all callers of the GSI entry point, shows the call sites with context, and reveals the various pathways that trigger analysis.


Category 3: SemanticSearchTools - Knowledge Retrieval Intelligence

Purpose and Motivation

The Global Workflow system contains thousands of pages of documentation, code comments, configuration files, and institutional knowledge. Traditional keyword search fails because:

  • Terminology varies: "forecast" vs "prediction" vs "projection"
  • Concepts are implicit: documentation about "data assimilation" doesn't always contain those exact words
  • Context matters: "cycle" means different things in different contexts
  • Relationships are hidden: related concepts aren't always linked

SemanticSearchTools leverages vector embeddings and graph-enhanced retrieval to enable natural language understanding of documentation and code. This is RAG (Retrieval-Augmented Generation) - the LLM retrieves relevant knowledge before generating responses.

The Value Proposition

These tools transform the LLM from "trained on old data" to "connected to current knowledge." Every query is grounded in:

  • The actual codebase (not hallucinated examples)
  • Current documentation (not outdated training data)
  • EE2 compliance standards (verified and authoritative)
  • Related context (graph-enriched for deeper understanding)

This is knowledge-grounded generation - the gold standard for trustworthy AI assistance.

Tool Deep Dive

search_documentation

What it does: Hybrid semantic + graph search across all workflow documentation:

  • Vector search: Find semantically similar content using embeddings
  • Graph enrichment: Include related nodes from the knowledge graph
  • Multi-source: Search across code comments, markdown docs, configuration files
  • Ranked results: Return top N most relevant matches with similarity scores

Why it matters: Developers don't think in keywords - they think in concepts. "How do I run a coupled forecast?" doesn't contain the exact terms in the documentation. Semantic search understands the intent and finds relevant docs even when terminology differs.

Agentic Enhancement: The LLM can now research autonomously. When asked a question, it searches documentation, synthesizes multiple sources, and provides comprehensive answers grounded in authoritative text.

Real-world example: Developer asks "how do I configure ocean coupling?" Semantic search finds relevant sections across multiple docs (UFS configuration, coupling setup, ocean model parameters) even though none use that exact phrase. The LLM synthesizes a complete answer from 5+ sources.
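
A sketch of that query, assuming top_k and graph-enrichment parameters:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const hits = await callTool("search_documentation", {
  query: "how do I configure ocean coupling?", // natural-language intent, not keywords
  top_k: 5,                                    // assumed: number of ranked results
  include_graph_context: true,                 // assumed flag: graph enrichment
});
// Illustrative result: ranked matches with similarity scores and source paths,
// spanning UFS configuration docs, coupling guides, and code comments.
```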

search_ee2_standards

What it does: Specialized search for NOAA EE2 (Enterprise Evolution 2) compliance standards:

  • Environment variable conventions
  • Error handling requirements
  • File naming standards
  • Production utility usage
  • Code quality standards

Why it matters: EE2 compliance is mandatory for operational deployment. Code that doesn't follow standards gets rejected. This tool ensures the LLM generates compliant code from the start, rather than requiring expensive rework.

Agentic Enhancement: The LLM becomes a compliance guardian. When generating code, it automatically checks standards. When reviewing code, it flags violations. This shifts compliance from "manual review" to "automated verification."

Real-world example: Developer asks "create a new job script." The LLM searches EE2 standards, finds requirements for error handling, environment variables, and logging, and generates a compliant script that passes review on first submission.

explain_with_context

What it does: Provide comprehensive explanations using the full RAG knowledge base:

  • Retrieve relevant documentation
  • Include graph context (related concepts, dependencies)
  • Synthesize multi-source information
  • Tailor detail level (basic, intermediate, advanced)
  • Add operational context (HPC-specific guidance)

Why it matters: Understanding complex systems requires more than isolated facts - it requires contextual synthesis. This tool enables the LLM to explain not just "what" but "why" and "how" by pulling together multiple knowledge sources.

Agentic Enhancement: Transforms the LLM from "answering machine" to "expert teacher." It doesn't just retrieve facts - it explains concepts with appropriate context, examples, and operational guidance.

Real-world example: Developer asks "explain the GSI analysis cycle." The tool retrieves docs about GSI, related graph nodes for observation processing and background fields, and synthesizes a comprehensive explanation covering theory, implementation, and operational considerations.

analyze_ee2_compliance

What it does: Automated compliance analysis for code or documentation:

  • Check environment variable usage (RUN, DATA, COMROOT, etc.)
  • Verify error handling patterns (errexit, error trapping)
  • Validate file naming conventions
  • Check production utility usage (setpdy.sh, etc.)
  • Assess code standards (style, documentation)

Why it matters: Manual compliance checking is slow, error-prone, and happens too late (after code is written). Automated analysis enables shift-left compliance - catching issues during development rather than during review.

Agentic Enhancement: The LLM can now self-check generated code before presenting it to the developer. This dramatically reduces review cycles and ensures higher quality initial submissions.

Real-world example: Developer generates a new script. The tool automatically analyzes it, finds missing error handling and incorrect environment variable usage, and the LLM revises the code before showing it to the developer.
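
A sketch of that self-check loop; the script path and the violations shape are assumptions:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const findings = (await callTool("analyze_ee2_compliance", {
  path: "jobs/JGFS_NEW_TASK", // hypothetical script under review
})) as { violations?: { rule: string; line: number; fix: string }[] };

// Revise until clean before showing the script to the developer.
for (const v of findings.violations ?? []) {
  console.log(`EE2 violation (${v.rule}) at line ${v.line}: ${v.fix}`);
}
```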

generate_compliance_report

What it does: Comprehensive compliance assessment across multiple categories:

  • Environment variables: usage, conventions, consistency
  • Workflow structure: directory layout, file organization
  • Error handling: errexit, error traps, exit codes
  • File naming: conventions, patterns, conflicts
  • Production utilities: usage, versions, deprecation status
  • Code standards: style, documentation, complexity
  • Directory structure: layout compliance, naming conventions

Why it matters: Before code goes to production, it needs comprehensive compliance verification. This tool provides that verification, generating detailed reports that document compliance status and identify all gaps.

Agentic Enhancement: Enables compliance-driven development. The LLM can generate compliance reports proactively, track compliance debt, and guide developers toward compliant implementations.

Real-world example: Developer preparing for production deployment asks "generate compliance report for my branch." The tool analyzes all changes, produces a checklist of compliance items, flags 3 violations, and provides specific remediation guidance.

get_knowledge_base_status

What it does: System health check for the RAG infrastructure:

  • ChromaDB status: collections, document counts, API connectivity
  • Neo4j status: graph database connectivity, node counts, relationship counts
  • File system status: knowledge base files, ingestion status
  • Version tracking: architecture version, tool versions, data versions

Why it matters: RAG systems have external dependencies (vector database, graph database). Before relying on RAG tools, the LLM needs to verify the knowledge base is available and healthy. This tool provides that verification.

Agentic Enhancement: Enables self-aware AI. The LLM can check its own knowledge base status, report degraded capabilities, and fall back to non-RAG tools when databases are unavailable.

Real-world example: Developer asks a question requiring semantic search, but ChromaDB is down. The LLM checks status, detects the outage, informs the user, and falls back to filesystem-based tools instead of failing with cryptic errors.
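
A sketch of that health-check-then-fallback pattern; the status field names are assumptions:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const status = (await callTool("get_knowledge_base_status", {})) as {
  chromadb?: { connected: boolean };
  neo4j?: { connected: boolean };
};

if (status.chromadb?.connected) {
  await callTool("search_documentation", { query: "ocean coupling" });
} else {
  // Degrade gracefully: filesystem tools need no external databases.
  await callTool("describe_component", { component: "ush" });
}
```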

find_related_files

What it does: Vector-based file similarity and relationship discovery:

  • Find files with similar import patterns
  • Discover files with related functionality
  • Identify potential refactoring candidates (duplicate logic)
  • Locate documentation related to code files
  • Configurable similarity threshold and result count

Why it matters: Large codebases have hidden similarity patterns - files that do similar things, duplicate logic, or share conceptual relationships. This tool surfaces those patterns, enabling better code reuse, refactoring, and knowledge discovery.

Agentic Enhancement: The LLM can now discover patterns autonomously. When asked "are there other files like this?", it finds similar files and suggests opportunities for abstraction, reuse, or learning from existing patterns.

Real-world example: Developer writes a new Python script for ocean data processing. The tool finds 3 existing files with similar patterns, and the LLM suggests "consider reusing the existing OceanDataProcessor class rather than duplicating this logic."


Category 4: OperationalTools - Deep Domain Intelligence

Purpose and Motivation

The Global Workflow isn't just code - it's an operational system running in production at NOAA, generating forecasts that impact lives and livelihoods. Operational systems have requirements beyond code correctness:

  • Platform specificity: HPC systems have unique constraints and configurations
  • Operational procedures: Running in production requires specific workflows
  • Job orchestration: Complex job dependencies and scheduling requirements
  • Troubleshooting context: Understanding failures requires operational knowledge

OperationalTools bridges the gap between "code that works" and "code that works in production." These tools provide the LLM with deep operational context that isn't available in generic training data.

The Value Proposition

These tools transform the LLM from "code generator" to "operational engineer." Instead of just writing code, Copilot can now:

  • Generate HPC-appropriate job scripts
  • Provide platform-specific troubleshooting guidance
  • Explain operational procedures and best practices
  • Navigate the complex job dependency graph

This is operational intelligence - understanding not just how code works, but how systems run in production.

Tool Deep Dive

list_job_scripts

What it does: Complete inventory and categorization of workflow job scripts:

  • By category: Analysis, forecast, post-processing, verification, archiving
  • By format: J-jobs (wrappers), ex-scripts (executables), utilities
  • With metadata: Purpose, dependencies, platform support, resource requirements
  • Multiple output formats: Summary, detailed, JSON

Why it matters: The Global Workflow has hundreds of job scripts with complex interdependencies. Understanding which jobs exist, how they are categorized, and how they relate is crucial for:

  • Adding new jobs (where does it fit?)
  • Debugging workflows (which job failed?)
  • Understanding the system (what jobs run in a forecast cycle?)
  • Optimizing resources (what jobs are resource-intensive?)

Agentic Enhancement: The LLM can now navigate job space intelligently. When asked "show me all analysis jobs," it provides a curated list. When debugging a failure, it can identify the failed job and its dependencies. When adding a new job, it suggests where it should fit in the workflow.

Real-world example: Developer asks "what post-processing jobs run after the forecast?" The tool lists all post jobs, shows their dependencies on the forecast job, and explains what each post job produces. The LLM synthesizes this into a clear operational flow diagram.
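
A sketch of the query behind that answer, with assumed parameter names:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const postJobs = await callTool("list_job_scripts", {
  category: "post-processing", // categories follow the list above
  format: "detailed",          // assumed: summary | detailed | json
});
// Illustrative result: one entry per script with path, purpose, upstream
// dependency (e.g., the forecast job), platform support, and resource needs.
```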

explain_workflow_component

What it does: Deep, graph-enriched explanation of any workflow component:

  • Component description: Purpose, functionality, role in workflow
  • Dependencies: What this component depends on (upstream)
  • Dependents: What depends on this component (downstream)
  • Related components: Graph-based relationship discovery
  • Operational context: How this component runs in production
  • Configurable detail: Basic, detailed, or expert level explanations

Why it matters: Understanding a single component in isolation is insufficient - you must understand its role in the larger system. This tool provides that contextual understanding by combining filesystem analysis, graph relationships, and semantic context.

Agentic Enhancement: The LLM can now explain systems holistically. When asked about a component, it doesn't just describe the code - it explains the operational context, dependencies, and role in the forecast cycle.

Real-world example: Developer asks "explain the GDAS job." The tool retrieves the job script, finds its dependencies (dump, prep), identifies its downstream users (analysis, forecast), and synthesizes a comprehensive explanation: "GDAS processes observations for the Global Data Assimilation System. It depends on dump/prep jobs for observation data and feeds into the GSI analysis. Runs every 6 hours at 00z, 06z, 12z, 18z."

get_operational_guidance

What it does: HPC-specific operational procedures and best practices:

  • Platform-specific: Tailored guidance for Hera, Orion, WCOSS2, Jet, Gaea
  • Operational scenarios: Job submission, monitoring, debugging, optimization
  • Urgency-aware: Different guidance for routine, urgent, emergency situations
  • Procedure documentation: Step-by-step operational procedures
  • Best practices: HPC-specific tips, tricks, and common pitfalls

Why it matters: HPC systems are complex and platform-specific. What works on Hera may not work on WCOSS2. Operational procedures that are safe in development may be dangerous in production. This tool provides the institutional knowledge needed to operate safely and effectively.

Agentic Enhancement: The LLM becomes an operational advisor. When a developer encounters an HPC issue, the LLM provides platform-specific guidance. When submitting a job, it suggests appropriate parameters. When debugging, it recommends platform-appropriate troubleshooting steps.

Real-world example: Developer asks "how do I debug a stuck job on Orion?" The tool provides Orion-specific guidance: check the Slurm queue with squeue -u $USER, examine job logs in /scratch/$USER/logs, cancel the job with scancel, and check node status with sinfo. The LLM formats this into a clear troubleshooting workflow.
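
A sketch of the request behind that answer, with assumed parameter names:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const guidance = await callTool("get_operational_guidance", {
  platform: "orion",     // platforms follow the list above
  scenario: "debugging", // assumed: submission | monitoring | debugging | optimization
  urgency: "routine",    // assumed: routine | urgent | emergency
});
// Illustrative result: ordered, platform-specific steps (squeue -u $USER,
// inspect logs, scancel, sinfo) ready to format as a troubleshooting checklist.
```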


Category 5: GitHubTools - Repository Intelligence

Purpose and Motivation

The Global Workflow is developed across multiple repositories with active development, issue tracking, and pull request workflows. Understanding this ecosystem requires:

  • Multi-repository awareness: Global Workflow, UFS Weather Model, GSI, GFS Utils
  • Issue tracking: What problems have been reported? What solutions exist?
  • Pull request tracking: What changes are in flight? What's being reviewed?
  • Cross-repository dependencies: How do changes in one repo impact others?

GitHubTools integrates the development ecosystem into the MCP server, giving the LLM access to project tracking and repository intelligence.

The Value Proposition

These tools transform the LLM from "code assistant" to "project collaborator." Instead of working in isolation, Copilot can now:

  • Search for existing solutions to problems (issue search)
  • Track ongoing work and avoid conflicts (PR tracking)
  • Understand cross-repository impacts (dependency analysis)
  • Navigate multi-repository architectures (structure analysis)

This is collaborative intelligence - the ability to work within the context of an active development project.

Tool Deep Dive

analyze_workflow_dependencies

What it does: Graph-based dependency analysis across workflow components:

  • Upstream dependencies: What does this component depend on?
  • Downstream dependencies: What depends on this component?
  • Circular dependencies: Detect and report circular dependency chains
  • External dependencies: Include dependencies from external repos (UFS, GSI)
  • Visualization data: Generate dependency graphs for visualization

Why it matters: Workflow systems have complex dependency graphs. Before making changes, you must understand the ripple effects. This tool makes dependencies explicit and analyzable.

Agentic Enhancement: The LLM can perform impact analysis before suggesting changes. "If I modify this component, what breaks?" The tool provides the answer, enabling safer, more informed development decisions.

Real-world example: Developer wants to refactor the observation preprocessing code. The tool analyzes dependencies, finds that 12 downstream jobs depend on the current interface, and the LLM warns: "This change will impact 12 jobs - consider backwards compatibility or plan coordinated updates."

search_issues

What it does: Search GitHub issues across workflow repositories:

  • Cross-repository search: Search Global Workflow, UFS Weather Model, GSI, etc.
  • Label filtering: Filter by bug, enhancement, documentation, etc.
  • State filtering: Open, closed, or all issues
  • Semantic search: Find issues by concept, not just keywords

Why it matters: Before implementing a feature or fixing a bug, check if someone already did it. This tool prevents duplicate work by searching the issue tracker for existing discussions, solutions, and workarounds.

Agentic Enhancement: The LLM can learn from project history. When asked to implement a feature, it first searches issues to see if it's been discussed. When debugging, it searches for similar reported bugs. This grounds development in project knowledge.

Real-world example: Developer reports "forecasts fail on Gaea with MPI error." The LLM searches issues, finds a closed bug from 6 months ago with the same error, and provides the fix: "This was issue #1234 - resolved by updating the MPI module version in the Gaea config."
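
A sketch of that lookup, with assumed parameter names:

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

const issues = await callTool("search_issues", {
  query: "forecast fails on Gaea with MPI error", // concept-level, not exact keywords
  repositories: ["global-workflow", "ufs-weather-model"], // assumed parameter
  state: "all",                                           // open | closed | all
  labels: ["bug"],
});
// Illustrative result: matching issues with number, title, state, and
// resolution, letting the agent surface a prior fix instead of re-deriving it.
```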

get_pull_requests

What it does: Retrieve pull request information and changes:

  • PR metadata: Title, author, status, labels, reviewers
  • Change summary: Files changed, lines added/removed
  • Review status: Approved, changes requested, pending
  • State filtering: Open, closed, merged, or all PRs

Why it matters: Active development happens through PRs. Understanding what's in flight prevents merge conflicts and duplicated effort. This tool provides visibility into ongoing work.

Agentic Enhancement: The LLM becomes project-aware. Before starting work, it checks for existing PRs. Before suggesting a change, it verifies it won't conflict with in-flight work. This enables collaborative development.

Real-world example: Developer starts working on ocean coupling improvements. The LLM checks PRs, finds an open PR with overlapping changes, and alerts: "PR #789 by @scientist-joe is already working on ocean coupling - consider coordinating to avoid conflicts."

analyze_repository_structure

What it does: Multi-repository structure analysis:

  • Repository inventory: List all related repositories
  • Structure comparison: Compare directory layouts across repos
  • Component mapping: Map components to their repositories
  • Shallow vs deep analysis: Quick overview or detailed structure

Why it matters: The Global Workflow is an ecosystem, not a monolith. Understanding how components are distributed across repositories is crucial for navigation and development.

Agentic Enhancement: The LLM can navigate multi-repository projects. When asked "where is the ocean model?", it knows to look in the UFS Weather Model submodule. When asked "how are repositories organized?", it provides a structural overview.

Real-world example: New developer asks "where do I find the atmospheric physics?" The tool analyzes repository structure, identifies that physics codes are in the UFS Weather Model submodule under FV3/atmos_phys, and the LLM provides a clear navigation path.


Integration: The Synergistic Platform

How Categories Work Together

The true power of this architecture emerges from category synergy - tools from different categories working together:

Example 1: Feature Development

  1. Developer asks "add a new ocean analysis job"
  2. WorkflowInfoTools provides system structure → knows where job scripts go
  3. OperationalTools lists existing analysis jobs → understands patterns to follow
  4. SemanticSearchTools retrieves EE2 standards → knows compliance requirements
  5. CodeAnalysisTools analyzes similar jobs → identifies code patterns to reuse
  6. GitHubTools checks for related PRs → avoids conflicts with ongoing work

Example 2: Bug Investigation

  1. Developer reports "forecast job failing on WCOSS2"
  2. OperationalTools explains the forecast job → understands what it should do
  3. CodeAnalysisTools traces execution path → finds where failure occurs
  4. GitHubTools searches issues → finds similar reported bugs
  5. SemanticSearchTools retrieves platform docs → provides WCOSS2-specific context
  6. WorkflowInfoTools checks system config → verifies platform configuration

Example 3: Code Refactoring

  1. Developer wants to refactor data processing module
  2. CodeAnalysisTools analyzes structure → understands current design
  3. CodeAnalysisTools finds dependencies → identifies all usage sites
  4. SemanticSearchTools finds related files → discovers similar patterns
  5. OperationalTools explains operational context → ensures operational continuity
  6. GitHubTools analyzes repo dependencies → checks cross-repo impacts
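
The feature-development chain in Example 1 can be sketched as one agent routine. The tool names are real; the argument shapes and the peer job path are assumptions.

```typescript
// Stub for illustration; a real call would go through an MCP client.
async function callTool(name: string, args: unknown): Promise<unknown> {
  return { name, args };
}

async function planNewOceanAnalysisJob() {
  const layout = await callTool("get_workflow_structure", {});                // where job scripts go
  const peers = await callTool("list_job_scripts", { category: "analysis" }); // patterns to follow
  const rules = await callTool("search_ee2_standards", {
    query: "new job script",                                                  // compliance requirements
  });
  const shape = await callTool("analyze_code_structure", {
    path: "jobs/JGDAS_ATMOS_ANALYSIS", // hypothetical peer job to imitate
  });
  const inflight = await callTool("get_pull_requests", { state: "open" });    // avoid conflicts
  return { layout, peers, rules, shape, inflight };
}
```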

The Agentic Development Platform

These 21 tools, working in concert, create something greater than the sum of their parts: a fully functional, integrated agentic software development platform. The LLM is no longer just a code generator - it's an autonomous development agent with:

  1. Structural Awareness (WorkflowInfoTools) - understands system architecture
  2. Relationship Intelligence (CodeAnalysisTools) - understands code connections
  3. Knowledge Grounding (SemanticSearchTools) - understands domain knowledge
  4. Operational Context (OperationalTools) - understands production requirements
  5. Collaborative Intelligence (GitHubTools) - understands project dynamics

This transforms the developer experience from:

  • "Generate this code" → "Develop this feature"
  • "Fix this bug" → "Investigate and resolve this issue"
  • "Explain this function" → "Teach me this system"
  • "Review this code" → "Ensure this meets all standards"

Beyond Code Completion

Traditional AI coding assistants provide code completion - they autocomplete lines and functions based on context. The MCP RAG platform provides autonomous development assistance:

  • Research capability: Find information across docs, code, and issues
  • Analysis capability: Understand structures, dependencies, and impacts
  • Generation capability: Create compliant, contextual, operational code
  • Verification capability: Check compliance, correctness, and best practices
  • Explanation capability: Teach concepts with full context and examples

This is the future of software development: AI agents that don't just write code, but understand systems, follow standards, and collaborate effectively.


Performance and Architecture

Design Principles

  1. Modularity: Each category is independent - can be enabled/disabled separately
  2. Progressive Enhancement: Core tools work without databases; RAG tools add intelligence
  3. Fail Gracefully: Database outages don't crash the server - tools degrade gracefully
  4. Fast First: Static tools provide instant responses; semantic tools are worth the wait
  5. Data Separation: Filesystem, graph database, and vector database each serve distinct purposes

Performance Characteristics

Tool Category         Typical Response Time   Dependencies
WorkflowInfoTools     <10ms                   Filesystem only
CodeAnalysisTools     50-200ms                Neo4j graph database
SemanticSearchTools   200-500ms               ChromaDB vector database
OperationalTools      100-300ms               Filesystem + graph + vector
GitHubTools           500-2000ms              GitHub API (network dependent)

Scalability

The architecture scales through:

  • Caching: Frequent queries cached at multiple levels
  • Lazy Loading: Tools initialized only when first used
  • Batch Processing: Multiple queries batched when possible
  • Index Optimization: Graph and vector databases properly indexed
  • Connection Pooling: Database connections reused efficiently
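
Two of these patterns, lazy loading and result caching, can be sketched as follows; the names are illustrative, not the server's actual code.

```typescript
// Lazy initialization: a tool category pays its startup cost only on first use.
class LazyTool<T> {
  private instance?: T;
  constructor(private readonly init: () => T) {}
  get(): T {
    return (this.instance ??= this.init());
  }
}

// Result caching: repeated queries are served from memory.
const queryCache = new Map<string, unknown>();

async function cachedQuery(
  key: string,
  run: () => Promise<unknown>,
): Promise<unknown> {
  if (queryCache.has(key)) return queryCache.get(key);
  const result = await run();
  queryCache.set(key, result);
  return result;
}
```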

Future Directions

Planned Enhancements

  1. Multi-modal RAG: Extend semantic search to code execution traces and runtime logs
  2. Active Learning: Tools that learn from developer feedback to improve recommendations
  3. Proactive Suggestions: Tools that suggest improvements without being asked
  4. Integration Testing: Tools that verify changes against test suites
  5. Documentation Generation: Tools that generate docs from code and context

Research Opportunities

  1. Code-aware embeddings: Specialized embeddings that understand code semantics
  2. Graph neural networks: Use GNNs for better dependency and impact prediction
  3. Workflow optimization: ML models that optimize job scheduling and resource allocation
  4. Automated compliance repair: Tools that fix compliance violations automatically
  5. Conversational debugging: Multi-turn debugging conversations with context retention

Conclusion

The Global Workflow MCP server's 21-tool architecture represents a comprehensive approach to AI-assisted development. By organizing tools into five functional categories and enabling them to work synergistically, we've created a platform that transforms LLMs from code generators into autonomous development agents.

This is more than tooling - it's a new paradigm for software development:

  • From isolated coding to system-aware development
  • From manual compliance checking to automated verification
  • From keyword search to semantic understanding
  • From individual files to holistic architecture
  • From code generation to operational engineering

The result is faster development, higher quality code, better compliance, and more effective knowledge transfer - all while empowering developers to work at a higher level of abstraction.

The future of software development isn't just writing code faster - it's understanding systems better. These 21 tools make that future real.


Architecture Version: 3.0.0 (Week 2 Consolidated)
Tool Count: 21 (3 + 4 + 7 + 3 + 4)
Implementation: Node.js with MCP Protocol
Databases: Neo4j (graph), ChromaDB (vector)
Deployment: stdio transport for VS Code integration