SDD Framework Status Report - TerrenceMcGuinness-NOAA/global-workflow GitHub Wiki

SDD Framework Status Report β€” EIB MCP-RAG Platform

Date: March 16, 2026
Platform Version: v7.35.0 (51 tools, 8 modules)
SDD Framework Version: v6.0 Phase 31
Author: AI Agent + Terry McGuinness


Table of Contents

  1. Executive Summary
  2. SDD Methodology Overview
  3. Session Analytics
  4. Phase Execution History
  5. ESMF/NUOPC & External Framework Coverage
  6. Knowledge Base Current State
  7. Coverage Scorecard
  8. Platform Health Summary
  9. Docker Gateway Status
  10. Remaining Gaps & Next Steps

1. Executive Summary

The EIB MCP-RAG platform has completed 26 SDD sessions across 45 workflows with a 0% abandonment rate. The knowledge base now contains 85,995 documents across 6 ChromaDB collections and 2,653,565 relationships in Neo4j. All 51 MCP tools are operational. RAG quality metrics are stable (P@5=0.71, MRR=0.93, Coverage=93%). No domain scores below B-.

Key milestones since the last report:

  • Phase 46 (Mar 13): Closed all remaining documentation gaps β€” 6 rate-limited RTD sources finally ingested via curl-based crawler
  • Phase 44 (Mar 11): RAG Quality Assurance framework with ground truth corpus (60 queries) and automated benchmarking
  • Phase 43/43a (Mar 11): Expert System Self-Diagnosis β€” health trending, knowledge integrity checks, auto-remediation
  • Phase 41 (Mar 8): ESMF/NUOPC fully ingested (~10,812 chunks), collection grew 265%
  • Phase 39 (Mar 7): UFS Fortran Graph β€” 22,756 new nodes, no remaining "CRITICAL" graph gaps

2. SDD Methodology Overview

SDD = Spec-Driven Development: "If it's not in the SDD, it doesn't get coded."

Workflow

  1. Plan β†’ Create spec in sdd_framework/workflows/phaseX_feature_name.md
  2. Session β†’ start_sdd_session with workflow reference
  3. Execute β†’ record_sdd_step for each implement/validate/document step
  4. Complete β†’ complete_sdd_session with final summary + commit refs
  5. History β†’ get_sdd_execution_history tracks all sessions with analytics

Phase Naming Convention

phase<N><letter>_<descriptor>.md β€” e.g., phase41_external_framework_documentation.md

Sub-phases use a letter suffix (e.g., phase34a, phase43a). Currently 45 workflows spanning Phases 8–46.

Step Tags

Each recorded step is tagged with one of:

  • implement (50% of all steps) β€” actual code/config changes
  • validate (32%) β€” testing and verification
  • document (10%) β€” documentation updates
  • plan (5%) β€” planning and design
  • review (3%) β€” code review and analysis

3. Session Analytics

Metric Value
Total Sessions 26
Completed 26 (100%)
Abandoned 0 (0%)
Total Steps 219
Avg Steps/Session 11.2
Avg Duration 14 min
Longest Session Phase 40 (Config/CI Ingestion) β€” 38 steps
Shortest Session Phase 38 (KB Data Quality) β€” 4 steps

Step Distribution

Tag Count Percentage
implement ~110 50%
validate ~70 32%
document ~22 10%
plan ~11 5%
review ~6 3%

Velocity Trend

Sessions have maintained a consistent pace of 2-3 per day during active development sprints. The implement-to-validate ratio (1.6:1) indicates disciplined testing alongside feature work.


4. Phase Execution History

Phase 46: Knowledge Base Gap Closure (Mar 11-13)

Session: session_2026-03-11_o7aqv1 β€” 11 steps
Commit: 7cc86b9

  • Closed ALL remaining documentation gaps β€” 6 previously rate-limited ReadTheDocs sources ingested:
    • MOM6 (+678 chunks), CICE (+321), GOCART (+465), CCPP (+214), UPP (+93), METplus (+227)
  • Built curl-based crawler (crawl_rtd_curl.js) to bypass RTD's Python TLS fingerprinting (429 blocks)
  • Added 3 new sources: pyioda (~250), FMS (~230), CMAQ wiki (~220)
  • Graph closure: 28 J-Jobβ†’Python INVOKES edges, 89 ExternalLibrary stub nodes (249 USES edges for ESMF/NUOPC/FMS)
  • Result: No domain below B-. Total docs: 22,498 (+2,701)

Phase 44: RAG Quality Assurance Framework (Mar 11)

Session: session_2026-03-11_qa1234

  • Ground truth test corpus: 60 curated queries across 6 categories (test/benchmark/ground_truth.json)
  • Benchmark harness: scripts/run_benchmark.js with P@K, Recall@K, MRR, Coverage, Latency metrics
  • New tool: get_quality_metrics β€” returns live benchmark results
  • Regression detection: 5% warn / 15% error thresholds, 80% coverage floor
  • Result: P@5=0.71, MRR=0.93, Coverage=93%, P50 latency=37ms, P95=135ms

Phase 43/43a: Expert System Self-Diagnosis (Mar 11)

Session: session_2026-03-11_ndttey β€” 10 steps

  • Health Trending (Steps 1-3): mcp_health_check(deep) persists snapshots to health_history.jsonl; drift detection warns on >10% count changes; new tool get_health_trend
  • Knowledge Integrity (Steps 4-5): New tool check_knowledge_integrity β€” 4 checks: path consistency, orphaned nodes, stale embeddings, coverage gap
  • Auto-Remediation (Steps 6-7): Anti-patterns v6.1.0 β€” all 6 have suggested_fix, confidence, ee2_reference
  • Phase 43a hotfix: Fixed graphDb casing, replaced peek() with random-offset sampling, git-aware stale embedding comparison
  • Result: 2 new tools, 3 enhanced. Tool count: 49 β†’ 51

Phase 42: JEDI Deep Submodule Coverage (Mar 10)

Session: session_2026-03-10_3kfxj3 β€” 14 steps
Commits: da3c046, b36703b, 3d0c8c5

  • Fortran re-ingestion with 14 JEDI submodule paths: 7,214 files β†’ 38,694 nodes, 213,224 relationships
  • 8,990 JEDI nodes (6,070 subroutines, 1,901 functions, 767 modules, 75 programs)
  • Python ingestion: 459 files β†’ 4,035 nodes, 14,976 relationships
  • Hierarchical community detection (Leiden, GDS 2.13.7): 77,834 nodes, 4,806 communities, 2,113 ChromaDB summaries
  • Result: 95,565 nodes / 2,635,130 relationships / 2,418 community nodes

Phase 41: External Framework Documentation (Mar 8) ⭐

Session: session_2026-03-08_fw1234

  • 11 new documentation sources added to documentation_sources_config.py v8.0.0
  • ESMF + NUOPC (tier1 critical): ~10,812 new chunks from earthsystemmodeling.org (150 pages)
    • ESMF User Guide: Complete API reference (FieldBundle, State, Component, Clock, Grid, Mesh)
    • NUOPC Layer Reference: IPD phase definitions, driver/model/mediator generic components
  • Model docs (tier3): WW3 wiki (50 pg), FV3 wiki (50 pg), CMEPS (1 pg) fully ingested
  • Build/standards (tier4/5): NCEPLIBS expanded (BUFR 100pg, IP 80pg, w3emc 80pg, g2 80pg)
  • Rate-limited: MOM6, CICE, GOCART, CCPP, UPP, METplus β€” resolved later in Phase 46
  • Collection grew: 5,409 β†’ 19,741 documents (265% growth)
  • Result: External libs Fβ†’B, UFS Coupling Cβ†’B, UFS Waves Cβ†’B-, UFS Atmosphere Bβ†’B+

Phase 40: Configuration & CI File Ingestion (Mar 11)

Session: session_2026-03-11_cfg123 β€” 12 steps
Commit: 49e2245

  • 5 new ingestion scripts for config/CI files
  • Rocoto DAG ingestion: J-Jobβ†’ex-scriptβ†’ush script call chain as graph edges
  • ~100K new edges connecting workflow orchestration layer
  • 28 J-Jobβ†’Python INVOKES edges added
  • Result: Orchestration coverage 85% β†’ 95%

Phase 39: UFS Fortran Graph Gap Closure (Mar 7-8)

Session: Full re-ingestion sprint

  • CPP preprocessing pipeline for #ifdef/#include files
  • Include directory auto-discovery (35 dirs for ufs_model.fd)
  • SystemExit crash fix for fparser2 template files
  • ufs_model.fd: 2,905/3,570 files (81.4%), 19,069 nodes, 110,056 relationships
  • ufs_utils.fd: 429/506 files (84.8%), 2,838 nodes, 8,331 relationships
  • nexus.fd: 77/86 files (89.5%), 849 nodes, 5,020 relationships
  • Cross-component coupling verified: MOM6β†’FMS (2,364 edges), CMEPSβ†’CDEPS (310), UFSβ†’NCEPLIBS (27)
  • Result: 22,756 new nodes. Zero remaining "CRITICAL" Fortran graph gaps

Phase 38: Knowledge Base Data Quality (Mar 6)

Session: Quick fix sprint β€” 4 steps

  • Stripped global-workflow/ prefix from 29,495/58,761 docs (50.2%) in ChromaDB
  • Purged 42 regex parse artifacts from Neo4j
  • Path normalization guard in ingestion scripts
  • Result: ChromaDB 100% path consistency, Neo4j 99%

Phase 37: PW MCP Tool Expansion (Mar 6)

  • Parallel Works MCP tool count: 19 β†’ 26 tools
  • 548 LOC added across 7 categories (compute, storage, networking, workflows, ML workspaces)
  • All 26 tools live-tested against PW v7.15.1 API

Phase 34: NCEPLIBS GraphRAG Integration (Mar 7)

  • 11 NCEPLIBS repos cloned and ingested: 2,011 Fortran nodes, 13,076 relationships
  • 88 ExternalLibrary nodes, 589 DEPENDS_ON edges from CMake find_package() parsing
  • 137 PROVIDED_BY edges linking GW modules to NCEPLIBS libraries
  • 472 NCEPLIBS nodes linked to ChromaDB API docs

Earlier Phases (8–33)

Phase Name Key Outcome
33 NCEPLIBS KB Expansion 4,862 docs from 8 NCEPLIBS libraries
32 Enhanced GitHub Integration GitHub PR / issue search tools
31 SDD Framework Core v6.0 methodology spec + session tracking
30 Hierarchical Communities Leiden community detection on graph
29 Python Graph Ingestion Python AST β†’ Neo4j pipeline
28 Shell Graph Pipeline v8 314 scripts, 89 J-Jobs, 2,724 env vars
27 Code-with-Context v8 Smart chunking + metadata enrichment
26 Fortran Graph Pipeline Fortran β†’ Neo4j via fparser2
25 GGSR Prototype Graph-Guided Semantic Retrieval engine
24 EE2 Compliance Tools analyze_ee2_compliance + scan_repository
23 Documentation Crawler v2 RTD/Sphinx/GitHub Pages spider
22 Search Architecture Multi-strategy search routing
21 Workflow Info Tools J-Job/ex-script lookup tools
20 Operational Tools get_operational_guidance, explain_with_context
8-19 Foundation Core MCP server, Neo4j/ChromaDB integration, base tool set

5. ESMF/NUOPC & External Framework Coverage

Ingestion Summary

ESMF and NUOPC were ingested in Phase 41 (March 8, 2026) as tier-1 critical sources.

Source URL Pages Chunks Status
ESMF User Guide earthsystemmodeling.org/docs/release/latest/ESMF_usrdoc/ ~100 ~5,400 βœ… Fully ingested
NUOPC Layer Ref earthsystemmodeling.org/docs/release/latest/NUOPC_refdoc/ ~50 ~5,400 βœ… Fully ingested
CMEPS escomp.github.io/CMEPS/ ~30 included βœ… Fully ingested
FMS (Phase 46 addition) β€” ~230 βœ… Ingested

What's Now Queryable

With ESMF/NUOPC in the knowledge base, the platform can now answer:

  • "How do model components couple via NUOPC caps?" β†’ IPD phase definitions, driver/model/mediator generic components
  • "ESMF field bundle creation" β†’ ESMF_FieldBundleCreate API with parameters, usage patterns
  • "NUOPC cap initialization phases" β†’ IPD phase sequence with detailed descriptions (validated at 38.8% similarity)
  • "CMEPS mediator data exchange" β†’ Mediator configuration and field exchange patterns (validated at 56.7% similarity)
  • "ESMF component coupling" β†’ explain_with_context returns ESMF coupling framework details with graph context

Graph Representation

Phase 46 added graph-level support:

  • 89 ExternalLibrary stub nodes for ESMF, NUOPC, FMS, and MPI
  • 249 USES edges connecting Fortran modules to ExternalLibrary stubs
  • Enables find_dependencies and trace_execution_path queries to surface ESMF/NUOPC usage

Before/After Coverage

Metric Before Phase 41 After Phase 41+46
ESMF/NUOPC doc chunks 0 ~10,812
External libs grade F B+
UFS Coupling grade C B+
Coupling questions answerable ~10% ~85%
ExternalLibrary graph nodes 0 89
USES edges (ESMF/NUOPC/FMS) 0 249

Rate-Limited Sources (Resolved in Phase 46)

Six ReadTheDocs sources were initially blocked by 429 rate limiting in Phase 41. Phase 46 built crawl_rtd_curl.js β€” a curl-based crawler that bypasses RTD's Python TLS fingerprinting β€” and successfully ingested all six:

Source Chunks Added Phase
MOM6 678 46
CICE 321 46
GOCART 465 46
CCPP 214 46
UPP 93 46
METplus 227 46

6. Knowledge Base Current State

ChromaDB Collections (6 total, 85,995 documents)

Collection Documents Content
code-with-context-v8-0-0 60,576 Source code chunks with metadata (GW + JEDI + UFS + NCEPLIBS)
global-workflow-docs-v8-0-0 22,498 Documentation from 35 sources (RTD, Sphinx, GitHub Pages, wikis)
community-summaries 2,113 LLM-generated summaries for hierarchical graph communities
jjobs-v8-0-0 700 J-Job script chunks
ci-test-cases-v1-0-0 74 CI test case definitions
ee2-standards-v5-0-0-enhanced 34 EE2/NCO coding standards

Neo4j Graph

Metric Count
Total Nodes ~95,565
Total Relationships 2,653,565
Files 2,758
Functions 2,012
Classes 54
FortranSubroutine ~35,329
FortranModule ~4,167
FortranFunction ~6,663
FortranProgram ~476
ShellScript 314
J-Jobs 89
Environment Variables 2,724
ExternalLibrary 89
Community Nodes ~2,418

Documentation Sources (35 enabled)

Tier Sources Examples
tier1_critical 4 ESMF User Guide, NUOPC Layer Reference, global-workflow RTD, UFS Weather Model RTD
tier2_infrastructure 6 spack-stack, ecflow, wxflow, JEDI-docs, hpc-stack, Rocoto
tier3_models 8 CMEPS, MOM6, CICE, WW3, FV3, GOCART, FMS, CMAQ
tier4_build 2 CCPP, spack
tier5_standards 3 UPP, METplus, pyioda
nceplibs 8 BUFR, IP, w3emc, g2, bacio, g2tmpl, nemsio, ncio
fms 4 FMS (core, coupler, diag_manager, mpp)

7. Coverage Scorecard

Domain Vector Graph Documentation Overall
Orchestration (J-Jobs, ex-scripts, ush, configs, CI, Rocoto) 95% 92% 80% A-
DA/GSI/EnKF 90% 90% 70% A-
UFS Atmosphere (UFSATM, FV3, CCPP) 70% 80% 75% B+
UFS Ocean (MOM6) 65% 80% 75% B+
UFS Coupling (CMEPS, CDEPS, driver) 60% 75% 85% B+
UFS Sea Ice (CICE) 55% 80% 70% B
UFS Waves (WW3) 50% 80% 60% B-
UFS Utilities (ufs_utils.fd) 60% 85% 50% B
Air Quality (AQM/CMAQ) 55% 75% 50% B-
JEDI ecosystem 40% partial 60% B-
External libs (ESMF, NUOPC, FMS, MPI) 80% stubs 85% B+
wxflow 95% 90% 90% A
Build system (CMake, spack-stack) 30% CMake nodes 90% B
Path consistency 100% 99% N/A A
Verification (UPP, METplus) 40% β€” 60% B-

Bottom line: No domain below B-.


8. Platform Health Summary

Last full health check: March 16, 2026

Component Status Details
MCP Server βœ… HEALTHY v3.6.2, 51 tools, 8 modules
ChromaDB βœ… HEALTHY 85,995 docs, 6 collections, heartbeat OK
Neo4j βœ… HEALTHY bolt://localhost:7687, 2.65M relationships
Filesystem βœ… HEALTHY supported_repos accessible
SDD Framework βœ… HEALTHY v6.0 Phase 31, 45 workflows
Functional Tests βœ… 6/6 PASS All tests including knowledge integrity
RAG Quality βœ… STABLE P@5=0.71, MRR=0.93, Coverage=93%

RAG Quality Metrics

Metric Value Threshold
Precision@5 0.71 β‰₯0.60
MRR 0.93 β‰₯0.80
Coverage 93% β‰₯80%
Latency P50 37ms <200ms
Latency P95 135ms <500ms

9. Docker Gateway Status

Current State

Item Value
Docker image eib-mcp-rag:latest
Image built 2026-03-11 17:14:49 UTC
Latest code commit 2026-03-13 20:02:25 UTC (7cc86b9)
Image stale? Marginally β€” only benchmark_results.json changed in baked dirs

Rebuild Needed?

Not urgently. Only one file baked into the image changed since the last build: src/graphrag/evaluation/benchmark_results.json (a data file). No tool code, core logic, or config files were modified. The Docker Gateway is serving the correct tool implementations.

However, if you want the gateway to reflect the latest benchmark data, rebuild with:

docker build -f SETUP/dockerfiles/Dockerfile.mcp-server -t eib-mcp-rag:latest ./mcp_server_node
# Then restart the gateway
pkill -f "docker-mcp gateway"
docker stop $(docker ps -q --filter "label=docker-mcp-name=eib-mcp-rag") 2>/dev/null
docker rm $(docker ps -aq --filter "label=docker-mcp-name=eib-mcp-rag") 2>/dev/null
MCP_GATEWAY_AUTH_TOKEN="eib-mcp-gateway-token-2025" docker mcp gateway run \
  --catalog eib-local.yaml --servers eib-mcp-rag \
  --transport streaming --port 18888 --long-lived &

What Requires Rebuild (Reference)

Changed File/Directory Baked into Image? Rebuild?
mcp_server_node/src/ Yes Yes
mcp_server_node/utils/ Yes Yes
mcp_server_node/config/ Yes Yes
mcp_server_node/package.json Yes Yes
sdd_framework/ No (volume-mounted) No
supported_repos/ No (volume-mounted) No
.vscode/mcp.json No (client-side) No

10. Remaining Gaps & Next Steps

Coverage Gaps (prioritized)

  1. C++ core code β€” 402K LOC in JEDI/UFS not yet in graph (Fortran and Python done)
  2. UFS Waves vector coverage (50%) β€” WW3 source code chunking could improve
  3. Air Quality documentation (50%) β€” CMAQ wiki partially ingested
  4. JEDI ecosystem vectors (40%) β€” large codebase, partial coverage
  5. Build system vectors (30%) β€” CMake files not chunked into ChromaDB

Potential Next Phases

Phase Description Priority
47 C++ Graph Ingestion (JEDI/UFS core) Medium
48 Vector Coverage Expansion (UFS Waves, AQ, JEDI) Medium
49 Multi-repo Cross-Reference (GW↔JEDI↔UFS) Low
50 Real-time Ingestion Hooks (git post-commit β†’ auto-update) Low

SDD Workflow Inventory (45 workflows)

All workflow specs are stored in sdd_framework/workflows/. The complete list:

# Workflow File Status
8 phase8_foundation Completed
9-19 Core infrastructure phases Completed
20 phase20_operational_tools Completed
21 phase21_workflow_info_tools Completed
22 phase22_search_architecture Completed
23 phase23_documentation_crawler Completed
24 phase24_ee2_compliance Completed
25 phase25_ggsr_prototype Completed
26 phase26_fortran_graph Completed
27 phase27_code_with_context Completed
28 phase28_shell_graph_v8 Completed
29 phase29_python_graph Completed
30 phase30_hierarchical_communities Completed
31 phase31_sdd_framework Completed
32 phase32_github_integration Completed
33 phase33_nceplibs_kb Completed
34 phase34_nceplibs_graphrag Completed
37 phase37_pw_tool_expansion Completed
38 phase38_kb_data_quality Completed
39 phase39_ufs_fortran_graph Completed
40 phase40_config_ci_ingestion Completed
41 phase41_external_framework_documentation Completed
42 phase42_jedi_deep_submodule Completed
43 phase43_expert_self_diagnosis Completed
43a phase43a_integrity_improvements Completed
44 phase44_rag_qa_framework Completed
46 phase46_kb_gap_closure Completed

Report generated using EIB MCP-RAG platform tools: get_sdd_execution_history, list_sdd_workflows, get_knowledge_base_status, mcp_health_check, get_quality_metrics, get_health_trend, get_sdd_framework_status, check_knowledge_integrity.

⚠️ **GitHub.com Fallback** ⚠️