Parallel Works RDHPCS Platform Dashboard - TerrenceMcGuinness-NOAA/global-workflow GitHub Wiki

Parallel Works RDHPCS Hybrid Cloud: Platform Dashboard

Report Generated: March 6, 2026 via PW MCP Toolset (26 tools)
Data Source: Live API queries against noaa.parallel.works (v7.15.1)
User Context: Terry.McGuinness | Organization: noaademo (NOAA)


Executive Summary

Metric Value
Platform RDHPCS Hybrid Cloud (Parallel Works v7.15.1)
Organization NOAA (noaademo), 499 members
Total Clusters 35 (2 active, 33 off)
Terry's Clusters 7 (5 HPC proxied + 2 AWS cloud)
Active Sessions 1 (Desktop VNC on emcmcpawsrocky9functionalii)
Total Budget $945,412 across 8 groups
Total Spent $91,622.37 (9.7% consumed)
Remaining $853,789.63
Cloud Networks 4 (AWS×2, Google×1, Azure×1)
Storage Resources 33 (buckets, NFS, Lustre, disks)
Snapshots 51 (49 platform + 2 user)
Static IPs 1 (GitLab server: 3.13.140.91)

1. Cost & Budget Analysis

1.1 Organization-Wide Budget Summary

Group Members Budget Used Remaining % Used Status
ca-ufs-cpldcld 17 $939,912 $87,319.38 $852,592.62 9.3% 🟢 Healthy
ca-sfs-emc 22 $3,525 $2,328.87 $1,196.13 66.1% 🟡 Watch
cz-ufs-cpldcld 14 $1,825 $1,824.50 $0.50 100% 🔴 Exhausted
cg-ufs-cpldcld 19 $150 $149.62 $0.38 99.7% 🔴 Exhausted
cg-sfs-emc 15 $0 $0 $0 - ⚪ No allocation
cg-ufs-cmd 6 $0 $0 $0 - ⚪ No allocation
cz-sfs-emc 15 $0 $0 $0 - ⚪ No allocation
cz-ufs-cmd 6 $0 $0 $0 - ⚪ No allocation
TOTAL 114 $945,412 $91,622.37 $853,789.63 9.7%
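
The "% Used" and "Status" columns follow from the budget and usage figures. The sketch below reproduces that arithmetic from the rows above. The status thresholds (50% and 99%) are assumptions chosen to match the labels in the table; they are not confirmed platform settings, and the function names are illustrative.

```python
from decimal import Decimal

# (group, budget, used) copied from the table above.
GROUPS = [
    ("ca-ufs-cpldcld", Decimal("939912"), Decimal("87319.38")),
    ("ca-sfs-emc", Decimal("3525"), Decimal("2328.87")),
    ("cz-ufs-cpldcld", Decimal("1825"), Decimal("1824.50")),
    ("cg-ufs-cpldcld", Decimal("150"), Decimal("149.62")),
    ("cg-sfs-emc", Decimal("0"), Decimal("0")),
    ("cg-ufs-cmd", Decimal("0"), Decimal("0")),
    ("cz-sfs-emc", Decimal("0"), Decimal("0")),
    ("cz-ufs-cmd", Decimal("0"), Decimal("0")),
]


def pct_used(budget, used):
    """Percent of budget consumed, or None when no budget is allocated."""
    if budget == 0:
        return None
    return used / budget * 100


def status(budget, used, watch=Decimal("50"), exhausted=Decimal("99")):
    """Classify a group; the two thresholds are assumed, not platform values."""
    pct = pct_used(budget, used)
    if pct is None:
        return "No allocation"
    if pct >= exhausted:
        return "Exhausted"
    if pct >= watch:
        return "Watch"
    return "Healthy"


total_budget = sum(budget for _, budget, _ in GROUPS)
total_used = sum(used for _, _, used in GROUPS)
total_remaining = total_budget - total_used

for name, budget, used in GROUPS:
    print(name, status(budget, used))
```

Using Decimal rather than float keeps the dollar totals exact, so the summed figures can be compared directly against the TOTAL row.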

1.2 Budget Allocation by Cloud Provider

The group naming convention reveals the CSP allocation:

  • ca-* = Cloud AWS (2 groups: ca-ufs-cpldcld, ca-sfs-emc)
  • cg-* = Cloud Google (3 groups: cg-ufs-cpldcld, cg-sfs-emc, cg-ufs-cmd)
  • cz-* = Cloud Azure (3 groups: cz-ufs-cpldcld, cz-sfs-emc, cz-ufs-cmd)

Cloud Provider Groups Total Budget Used % Used
AWS 2 $943,437 $89,648.25 9.5%
Google 3 $150 $149.62 99.7%
Azure 3 $1,825 $1,824.50 100%

Key Insight: AWS accounts for 99.8% of total allocation ($943,437 of $945,412). The ca-ufs-cpldcld group alone holds $939,912, the primary research computing budget for coupled cloud development. The only funded Google and Azure groups (cg-ufs-cpldcld and cz-ufs-cpldcld) are essentially exhausted.
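
The per-provider figures in this section can be derived by reading the two-letter prefix of each group name. A minimal sketch, assuming the prefix mapping described above holds for every group; the helper names are illustrative and not part of any PW tool.

```python
from collections import defaultdict
from decimal import Decimal

# Prefix convention from this section: ca = AWS, cg = Google, cz = Azure.
PREFIX_TO_CSP = {"ca": "AWS", "cg": "Google", "cz": "Azure"}

# group -> (budget, used), copied from section 1.1.
BUDGETS = {
    "ca-ufs-cpldcld": (Decimal("939912"), Decimal("87319.38")),
    "ca-sfs-emc": (Decimal("3525"), Decimal("2328.87")),
    "cz-ufs-cpldcld": (Decimal("1825"), Decimal("1824.50")),
    "cg-ufs-cpldcld": (Decimal("150"), Decimal("149.62")),
    "cg-sfs-emc": (Decimal("0"), Decimal("0")),
    "cg-ufs-cmd": (Decimal("0"), Decimal("0")),
    "cz-sfs-emc": (Decimal("0"), Decimal("0")),
    "cz-ufs-cmd": (Decimal("0"), Decimal("0")),
}


def csp_for_group(name):
    """Map a group name such as 'ca-sfs-emc' to its cloud provider."""
    prefix = name.split("-", 1)[0]
    return PREFIX_TO_CSP[prefix]


def totals_by_csp(budgets):
    """Aggregate group count, budget, and usage per provider."""
    totals = defaultdict(
        lambda: {"groups": 0, "budget": Decimal("0"), "used": Decimal("0")}
    )
    for name, (budget, used) in budgets.items():
        row = totals[csp_for_group(name)]
        row["groups"] += 1
        row["budget"] += budget
        row["used"] += used
    return dict(totals)


by_csp = totals_by_csp(BUDGETS)
grand_total = sum(row["budget"] for row in by_csp.values())
aws_share = by_csp["AWS"]["budget"] / grand_total * 100
print(by_csp, round(aws_share, 1))
```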

1.3 Cost History & Observations

Observation Detail
Primary spending group ca-ufs-cpldcld: UFS Coupled Cloud Development on AWS
Highest utilization (budget remaining) ca-sfs-emc at 66.1%, flagged Watch in section 1.1
Fully consumed cz-ufs-cpldcld ($0.50 left) and cg-ufs-cpldcld ($0.38 left)
Zero-allocation groups 4 groups have $0 budgets (Google/Azure SFS & CMD)
Group creation timeline Oldest: ca-ufs-cpldcld (April 2021), Newest: cz-sfs-emc and cg-sfs-emc (Oct 2023)

Group Descriptions (NCEPDEV):

  • *-ufs-cpldcld: UFS Coupled Cloud Development (primary research workloads)
  • *-sfs-emc: SFS EMC (Seasonal Forecast System operations)
  • *-ufs-cmd: UFS Common Model Development

2. Compute Clusters

2.1 All Platform Clusters (35)

Active Clusters

Name Type CSP User Status Tags
emcmcpawsrocky9functionalii aws-slurm AWS Terry.McGuinness 🟢 active v2.0.0
awsmlxiao aws-slurm AWS Xiao.Luo 🟢 active ca-ufs-cpldcld, junwang

Terry McGuinness's Clusters (7)

HPC Proxied Clusters (On-Prem via PW Agent):

Name Display Name Type Created Agent Version Status
noaaheracluster Noaa Hera Cluster existing 2024-08-12 v5.173 🔴 off
noaajetcluster Noaa Jet Cluster existing 2024-08-12 unknown 🔴 off
noaagaeac5cluster Noaa Gaea-c5 Cluster existing 2024-08-12 unknown 🔴 off
noaagaeac6cluster NOAA Gaea C6 Cluster existing 2025-11-24 unknown 🔴 off
noaaursaclustertmcg NOAA Ursa Cluster TMcG existing 2025-11-24 unknown 🔴 off

Cloud Clusters (AWS):

Name Type CSP Status
emcmcpawsrocky9functionalii aws-slurm AWS 🟢 active
emcmcpawsrocky9copyfour aws-slurm AWS 🔴 off

Hera Cluster Login Nodes:

hfe01.fairmont.rdhpcs.noaa.gov  hfe02.fairmont.rdhpcs.noaa.gov
hfe03.fairmont.rdhpcs.noaa.gov  hfe04.fairmont.rdhpcs.noaa.gov
hfe05.fairmont.rdhpcs.noaa.gov  hfe06.fairmont.rdhpcs.noaa.gov
hfe07.fairmont.rdhpcs.noaa.gov  hfe08.fairmont.rdhpcs.noaa.gov
hfe09.fairmont.rdhpcs.noaa.gov  hfe10.fairmont.rdhpcs.noaa.gov
hfe11.fairmont.rdhpcs.noaa.gov

Active controller: hfe05 (InternalIP: 10.184.6.62), last reported 2025-12-15

All Other Clusters (26)

Name Type CSP User Status
azclusternoaav2use1v3 azure-slurm Azure Linlin.Cui off
azureh100nd96isrlinlinv3 azure-slurm Azure Linlin.Cui off
azurea100linlinv3 azure-slurm Azure Linlin.Cui off
ca8h100east2av3 aws-slurm AWS Linlin.Cui off
gcpmllinlinoperv3 google-slurm Google Linlin.Cui off
gcpmllinlinv3 google-slurm Google Linlin.Cui off
azureh100nd96isrv3 azure-slurm Azure Sadegh.Tabas off
azh100v3 azure-slurm Azure Sadegh.Tabas off
awsdemov3 aws-slurm AWS Walter.Kolczynski off
sgp5cbuseast2v3 aws-slurm AWS Sadegh.Tabas off
sgp5cbuseast1res2v3 aws-slurm AWS Sadegh.Tabas off
sgp5cbuseast1v3 aws-slurm AWS Sadegh.Tabas off
pclusternoaav2use1v3 aws-slurm AWS Sadegh.Tabas off
mlgefsoperationalv3 aws-slurm AWS Sadegh.Tabas off
highmemv3 aws-slurm AWS Sadegh.Tabas off
ca8h100sadeghv3 aws-slurm AWS Sadegh.Tabas off
jwcaufscpldv3 aws-slurm AWS Jun.Wang off
junmlgefsopnv3 aws-slurm AWS Jun.Wang off
ca8h100east2ajunv3 aws-slurm AWS Jun.Wang off
globalworkflowspackstackv3 aws-slurm AWS Henry.Winterbottom off
globalworkflowdevelopv3 aws-slurm AWS Henry.Winterbottom off
cacmdv3 aws-slurm AWS Henry.Winterbottom off
gcpmlsadeghv3 google-slurm Google Sadegh.Tabas off
jwcufscpldcldv3 google-slurm Google Jun.Wang off
graphcastgfsoper google-slurm Google Linlin.Cui off
ca8h100east2axiao aws-slurm AWS Xiao.Luo off

2.2 Cluster Distribution by CSP

Cloud Provider Count Active Off
AWS (aws-slurm) 20 2 18
Azure (azure-slurm) 5 0 5
Google (google-slurm) 5 0 5
On-Prem (existing) 5 0 5
Total 35 2 33
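
A distribution table like the one above can be tallied from the cluster list with a pair of counters. The sketch below runs on a five-row illustrative subset, not the full 35-cluster inventory, and the record field names ("type", "status") are assumptions about the shape of the data rather than the actual API schema.

```python
from collections import Counter

# Illustrative subset of the cluster list; field names are assumed.
SAMPLE_CLUSTERS = [
    {"name": "emcmcpawsrocky9functionalii", "type": "aws-slurm", "status": "active"},
    {"name": "awsmlxiao", "type": "aws-slurm", "status": "active"},
    {"name": "azh100v3", "type": "azure-slurm", "status": "off"},
    {"name": "gcpmllinlinv3", "type": "google-slurm", "status": "off"},
    {"name": "noaaheracluster", "type": "existing", "status": "off"},
]


def distribution(clusters):
    """Return {type: {"count", "active", "off"}} for a list of cluster records."""
    total = Counter(c["type"] for c in clusters)
    active = Counter(c["type"] for c in clusters if c["status"] == "active")
    return {
        ctype: {"count": n, "active": active[ctype], "off": n - active[ctype]}
        for ctype, n in total.items()
    }


for ctype, row in distribution(SAMPLE_CLUSTERS).items():
    print(ctype, row)
```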

2.3 Cluster Distribution by User

User Cloud Clusters HPC Clusters Total
Sadegh.Tabas 10 0 10
Linlin.Cui 7 0 7
Terry.McGuinness 2 5 7
Jun.Wang 5 0 5
Henry.Winterbottom 3 0 3
Walter.Kolczynski 1 0 1
Xiao.Luo 1 0 1
Ron.Millikan 1 0 1

3. Active Sessions

Field Value
Name marketplace.desktop.latest_1_session
Type Desktop VNC (noVNC)
Target Cluster emcmcpawsrocky9functionalii
Target CSP AWS
Status 🟢 running
Healthy ✅ true
Last Health Check 2026-03-05T15:56:11Z
Created 2026-03-05T15:26:36Z
Remote Host terrymcguinness-emcmcpawsrocky9functionalii-00002-mgmt
Remote Port 40135
Local Port 44163
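
The two timestamps in the table give the session age at its last health check. A small sketch of that calculation using only the standard library; the helper name parse_utc is illustrative.

```python
from datetime import datetime


def parse_utc(stamp):
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    # Replace 'Z' explicitly so this also works on Python versions before 3.11.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


created = parse_utc("2026-03-05T15:26:36Z")
last_health_check = parse_utc("2026-03-05T15:56:11Z")

elapsed = last_health_check - created
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)
print(f"Session age at last health check: {minutes} min {seconds} s")
```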

4. Networking

4.1 VPC Networks (4)

Name CSP Regions Provisioning Mode DNS Zone Status
aws-us-east-1 AWS us-east-1 public pw-noaa-us-east-1.pw.local 🟢 provisioned
aws-us-east-2 AWS us-east-2 public pw-noaa-us-east-2.pw.local 🟢 provisioned
google-controller-nat Google us-central1, us-east4, us-west1-4, us-east1 controller-nat - 🟢 provisioned
azure-controller-nat Azure eastus2, northcentralus, southcentralus, eastus controller-nat - 🟢 provisioned

4.2 Static IPs (1)

Name IP Address CSP Region Description Tags
gitlabserver 3.13.140.91 AWS us-east-2 IP to GitLab server running in Docker on EC2 host provisioned with Rocky 8 gitlab

5. Storage Inventory

5.1 Object Storage: Buckets (11)

Provisioned (6): Platform Buckets

Name Type CSP Region Owner
noaancepdevnonecaufscpldcld aws-bucket AWS us-east-1 noaamaster
noaancepdevnonecasfsemc aws-bucket AWS us-east-1 noaamaster
noaancepdevnonecgufscpldcld google-bucket Google us-central1 noaamaster
noaancepdevnonecgufscmd google-bucket Google us-central1 noaamaster
noaancepdevnonecgsfsemc google-bucket Google us-central1 noaamaster
ncepdevnoneczsfsemc azure-bucket Azure (default) noaamaster

Unprovisioned (5): User Buckets

Name Type CSP Region Owner
weisfsemcawss3bucket aws-bucket AWS us-east-1 Wei.Huang
awssfshuangbucket4training aws-bucket AWS us-east-1 Wei.Huang
weigooglebucket google-bucket Google us-east1 Wei.Huang
weiazureblob azure-bucket Azure eastus Wei.Huang
azblobeast2 azure-bucket Azure eastus2 Linlin.Cui

5.2 Network File Systems: NFS/EFS/Filestore (7)

Name Display Name Type CSP Size Status Shared With
ncepdevnonecaufscpldcldcontrib ncepdev-none-ca-ufs-cpldcld-contrib aws-efs AWS - 🟢 provisioned ca-ufs-cpldcld
ncepdevnonecasfsemccontrib ncepdev-none-ca-sfs-emc-contrib aws-efs AWS - 🟢 provisioned ca-sfs-emc
ncepdevnonecgufscpldcldcontrib ncepdev-none-cg-ufs-cpldcld-contrib google-filestore Google 2048 GB 🟢 provisioned cg-ufs-cpldcld
ncepdevnonecgufscmdcontrib ncepdev-none-cg-ufs-cmd-contrib google-filestore Google 2048 GB 🟢 provisioned cg-ufs-cmd
ncepdevnonecgsfsemccontrib ncepdev-none-cg-sfs-emc-contrib google-filestore Google 2048 GB 🟢 provisioned cg-sfs-emc
googlenfs - google-filestore Google 2048 GB ⚪ unprovisioned cg-epic, cg-sfs-emc
weiepicawsnfs Wei EPIC AWS NFS aws-efs AWS - ⚪ unprovisioned ca-epic, ca-sfs-emc

5.3 High-Performance Storage: Lustre (14)

AWS FSx Lustre (11)

Name User Region Ephemeral Tags
hwufscmdephmerallustre Henry.Winterbottom us-east-2 ✅ Yes ca-ufs-cmd
fsxnoaav2use1 Linlin.Cui us-east-2 ✅ Yes -
weiepicawslustre Wei.Huang us-east-1 ✅ Yes Wei-EPIC-AWS-lustre
weiemcawslutre4t Wei.Huang - ✅ Yes linux
awssfshuanglustre4training Wei.Huang us-east-1 ✅ Yes linux
gcgfssadegh Linlin.Cui us-east-1 ❌ No AWS
awseast1f Linlin.Cui us-east-1 ❌ No -
mlwplustre Jun.Wang us-east-1 ❌ No mlwp, global forecast
awsgcgfs1east2a Xiao.Luo us-east-2 ❌ No east2a
awseast2a Linlin.Cui us-east-2 ❌ No -
mlsfslustreuseast1f Linlin.Cui us-east-1 ❌ No linux, mlsfs

Azure Managed Lustre (3)

Name User Region Status
weisfsczmanaged Wei.Huang eastus ⚪ unprovisioned
azmleast2 Linlin.Cui eastus2 ⚪ unprovisioned
azureepicweilutremanaged Wei.Huang eastus ⚪ unprovisioned

5.4 Persistent Disks (1)

Name CSP Region Size Ephemeral Status Owner Created
emceibmcpgraphragpersistenttwo AWS us-east-1 500 GB ❌ No 🟢 provisioned Terry.McGuinness 2026-02-20

This disk hosts the ChromaDB + Neo4j databases for the EIB MCP-RAG platform.

5.5 Snapshots

Terry's Snapshots (2):

Name CSP Region Size Created
emceibmcpgraphragpersistentthree AWS us-east-1 500 GB 2026-02-26
mdceibmcpgraphragpersistentthree AWS us-east-1 500 GB 2026-03-04

Platform Snapshots (49): Primarily apps-rocky8-* and apps-rocky9-* snapshots distributed across all three CSPs and multiple regions, used as base images for cluster provisioning.

Generation CSP Distribution Size Count
rocky8-legacy AWS×2, Google×7, Azure×4 350 GB 13
rocky8-latest AWS×2, Google×7, Azure×4 500 GB 13
rocky9-latest AWS×2, Google×7, Azure×4 300 GB 13
pw-apps-rocky8-2025 AWS×2, Google×1, Azure×2 128 GB 5
pw-apps-rocky9-2025 AWS×2, Google×1, Azure×2 128 GB 5
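
The platform snapshot count and its nominal footprint can be checked from the rows above. A short sketch, assuming each snapshot occupies the full size listed for its generation; actual billed storage may be lower, since snapshots are often stored incrementally.

```python
from collections import Counter

# (generation, size in GB, snapshots per CSP) copied from the table above.
SNAPSHOT_GENERATIONS = [
    ("rocky8-legacy", 350, {"AWS": 2, "Google": 7, "Azure": 4}),
    ("rocky8-latest", 500, {"AWS": 2, "Google": 7, "Azure": 4}),
    ("rocky9-latest", 300, {"AWS": 2, "Google": 7, "Azure": 4}),
    ("pw-apps-rocky8-2025", 128, {"AWS": 2, "Google": 1, "Azure": 2}),
    ("pw-apps-rocky9-2025", 128, {"AWS": 2, "Google": 1, "Azure": 2}),
]

# Total number of platform snapshots across all generations.
total_count = sum(sum(per_csp.values()) for _, _, per_csp in SNAPSHOT_GENERATIONS)

# Nominal capacity, assuming every snapshot is at its full listed size.
total_gb = sum(
    size_gb * sum(per_csp.values()) for _, size_gb, per_csp in SNAPSHOT_GENERATIONS
)

# Snapshot count per cloud provider.
count_by_csp = Counter()
for _, _, per_csp in SNAPSHOT_GENERATIONS:
    count_by_csp.update(per_csp)

print(total_count, total_gb, dict(count_by_csp))
```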

5.6 Storage Summary by CSP

CSP Buckets NFS Lustre Disks Snapshots Total
AWS 3 2 11 1 15 32
Google 3 3 0 0 21 27
Azure 2 0 3 0 15 20
Total 8 5 14 1 51 79

6. Workflows & Marketplace

Name Display Name Type Source
marketplace.desktop.latest Desktop latest remote (GitHub) parallelworks/interactive_session (main)

The Desktop interactive session is the only workflow currently installed. It provides a noVNC remote desktop accessible through the PW platform.


7. ML Workspaces

Name CSP Region Network Status Created
mlresoursetest AWS us-east-1 aws-us-east-1 πŸ”΄ failed 2026-02-05

8. Platform Infrastructure

8.1 Platform Configuration

Setting Value
Platform Name RDHPCS Hybrid Cloud
Version v7.15.1
Status URL https://status-noaa.parallel.works/
Maintenance Mode ❌ No
Single Org Platform ✅ Yes (noaademo)
Auth Methods password
Features mlworkspace
Enforce Max TTL ❌ No
Terminal Theme dark (font 12)

8.2 User Features

Terry's account has the following platform features enabled:

  • mlworkspace: ML workspace provisioning
  • newProvisioner: v3 provisioner (newer cluster types)
  • newNotification: Updated notification system
  • v3: Platform v3 features
  • cspAccounts: Multi-CSP account management

8.3 Available Sidebar Modules (22)

explorer, runs, sessions, workflows, clusters, instances, ipAddresses, kubernetes, machineLearning, aiChat, lustre, nfs, disks, buckets, snapshots, cost, reports, allocations, monitorInstances, jobs, marketplace, terminal


9. User Inventory: Terry.McGuinness

9.1 Compute

Resource Type CSP Status Purpose
emcmcpawsrocky9functionalii aws-slurm AWS 🟢 active MCP-RAG development VM
emcmcpawsrocky9copyfour aws-slurm AWS 🔴 off Backup/staging
noaaheracluster existing (HPC) - 🔴 off NOAA Hera (Fairmont, WV)
noaajetcluster existing (HPC) - 🔴 off NOAA Jet (Boulder, CO)
noaagaeac5cluster existing (HPC) - 🔴 off NOAA Gaea C5 (Oak Ridge)
noaagaeac6cluster existing (HPC) - 🔴 off NOAA Gaea C6 (Oak Ridge)
noaaursaclustertmcg existing (HPC) - 🔴 off NOAA Ursa

9.2 Storage

Resource Type Size Purpose
emceibmcpgraphragpersistenttwo aws-disk 500 GB ChromaDB + Neo4j databases
emceibmcpgraphragpersistentthree aws-snapshot 500 GB Database backup (Feb 26)
mdceibmcpgraphragpersistentthree aws-snapshot 500 GB Database backup (Mar 4)

9.3 Networking

Resource Type IP
gitlabserver static IP 3.13.140.91 (us-east-2)

9.4 Sessions

Session Target Status
Desktop VNC emcmcpawsrocky9functionalii 🟢 running (healthy)

9.5 Estimated Monthly Costs (Terry's Resources)

Resource Type Estimated Cost/Month Notes
emcmcpawsrocky9functionalii EC2 + Slurm ~$200-400/mo Active when running
emceibmcpgraphragpersistenttwo EBS 500GB ~$50/mo Persistent
gitlabserver IP Elastic IP ~$3.65/mo Static
Snapshots (2×500GB) EBS Snapshot ~$10/mo Incremental
Total (estimated) ~$264-464/mo When active
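
The total row is the sum of the per-resource estimates, taken at the low and high ends of each range. A minimal sketch of that arithmetic using the approximate figures from the table; these are the report's estimates, not billing data.

```python
from decimal import Decimal

# (resource, low estimate, high estimate) in USD per month, from the table above.
MONTHLY_ESTIMATES = [
    ("emcmcpawsrocky9functionalii", Decimal("200"), Decimal("400")),
    ("emceibmcpgraphragpersistenttwo", Decimal("50"), Decimal("50")),
    ("gitlabserver IP", Decimal("3.65"), Decimal("3.65")),
    ("Snapshots (2x500GB)", Decimal("10"), Decimal("10")),
]

low_total = sum(low for _, low, _ in MONTHLY_ESTIMATES)
high_total = sum(high for _, _, high in MONTHLY_ESTIMATES)

print(f"Estimated range: ~${round(low_total)}-{round(high_total)}/mo")
```

Only the cluster contributes a range; the other three items are fixed estimates, so the spread between the low and high totals equals the spread of the cluster line.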

Appendix A: Group Creation Timeline

Date Group CSP Description
2021-04-23 ca-ufs-cpldcld AWS UFS Coupled Cloud Dev
2023-08-10 cg-ufs-cmd Google UFS Common Model Dev
2023-08-10 cg-ufs-cpldcld Google UFS Coupled Cloud Dev
2023-08-10 cz-ufs-cmd Azure UFS Common Model Dev
2023-08-10 cz-ufs-cpldcld Azure UFS Coupled Cloud Dev
2023-10-16 ca-sfs-emc AWS SFS EMC Operations
2023-10-17 cz-sfs-emc Azure SFS EMC Operations
2023-10-17 cg-sfs-emc Google SFS EMC Operations
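
The oldest and newest groups can be read off this timeline programmatically. A small sketch using the dates listed above; it accounts for ties, since two groups share the latest creation date.

```python
from datetime import date

# Group creation dates copied from the timeline above.
GROUP_CREATED = {
    "ca-ufs-cpldcld": date(2021, 4, 23),
    "cg-ufs-cmd": date(2023, 8, 10),
    "cg-ufs-cpldcld": date(2023, 8, 10),
    "cz-ufs-cmd": date(2023, 8, 10),
    "cz-ufs-cpldcld": date(2023, 8, 10),
    "ca-sfs-emc": date(2023, 10, 16),
    "cz-sfs-emc": date(2023, 10, 17),
    "cg-sfs-emc": date(2023, 10, 17),
}

oldest_date = min(GROUP_CREATED.values())
newest_date = max(GROUP_CREATED.values())

# Collect every group that shares the extreme date, sorted for stable output.
oldest = sorted(g for g, d in GROUP_CREATED.items() if d == oldest_date)
newest = sorted(g for g, d in GROUP_CREATED.items() if d == newest_date)

print("Oldest:", oldest, oldest_date.isoformat())
print("Newest:", newest, newest_date.isoformat())
```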

Appendix B: Cluster Creation Timeline

Date Cluster Type User
2023-02-01 pclusternoaav2use1 pclusterv2 Sadegh.Tabas
2023-08-04 globalworkflowdevelop pclusterv2 Henry.Winterbottom
2023-08-15 jwcufscpldcld gclusterv2 Jun.Wang
2023-08-17 cacmd pclusterv2 Henry.Winterbottom
2024-01-02 globalworkflowspackstack pclusterv2 Henry.Winterbottom
2024-02-02 gcpmlsadegh gclusterv2 Sadegh.Tabas
2024-03-25 ca8h100sadegh pclusterv2 Sadegh.Tabas
2024-04-15 sgp5cbuseast2 pclusterv2 Sadegh.Tabas
2024-04-15 sgp5cbuseast1 pclusterv2 Sadegh.Tabas
2024-05-02 highmem pclusterv2 Sadegh.Tabas
2024-05-08 sgp5cbuseast1res2 pclusterv2 Sadegh.Tabas
2024-06-13 azh100 azclusterv2 Sadegh.Tabas
2024-06-21 azureh100nd96isr azclusterv2 Sadegh.Tabas
2024-07-11 mlgefsoperational pclusterv2 Sadegh.Tabas
2024-08-12 noaajetcluster existing Terry.McGuinness
2024-08-12 noaaheracluster existing Terry.McGuinness
2024-08-12 noaagaeac5cluster existing Terry.McGuinness
2024-08-14 jwcaufscpld pclusterv2 Jun.Wang
2024-08-22 junmlgefsopn pclusterv2 Jun.Wang
2024-12-11 ca8h100east2ajun pclusterv2 Jun.Wang
2025-03-13 awsmlxiao pclusterv2 Xiao.Luo
2025-11-24 noaagaeac6cluster existing Terry.McGuinness
2025-11-24 noaaursaclustertmcg existing Terry.McGuinness
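
Grouping the creation dates above by year shows when most of the listed clusters were set up. A brief sketch over the 23 rows of this timeline only; clusters without a row here are not counted.

```python
from collections import Counter

# Creation dates copied from the timeline above, one entry per row.
CLUSTER_CREATED = [
    "2023-02-01", "2023-08-04", "2023-08-15", "2023-08-17",
    "2024-01-02", "2024-02-02", "2024-03-25", "2024-04-15", "2024-04-15",
    "2024-05-02", "2024-05-08", "2024-06-13", "2024-06-21", "2024-07-11",
    "2024-08-12", "2024-08-12", "2024-08-12", "2024-08-14", "2024-08-22",
    "2024-12-11",
    "2025-03-13", "2025-11-24", "2025-11-24",
]

# The first four characters of an ISO date are the year.
created_per_year = Counter(stamp[:4] for stamp in CLUSTER_CREATED)

for year in sorted(created_per_year):
    print(year, created_per_year[year])
```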

Appendix C: Data Collection Method

This report was generated automatically by querying the Parallel Works MCP Server (26 tools available) against the live PW REST API. The following tools were invoked:

Tool Data Collected
get_platform_settings Platform version, features, maintenance status
get_auth_session User identity, roles, features, sidebar modules
get_organizations Organization name, member count
get_groups + budget_summary Group budgets, member counts, allocations
get_cost_summary Aggregated budget analysis
list_clusters All 35 clusters with status, type, CSP
get_cluster_status (×6) Detailed status for Terry's clusters
get_cluster_nodes Hera login node inventory
list_resources Unified resource view (23 items)
list_sessions Active sessions (1 running)
list_storage All storage types unified (33 items)
list_buckets Object storage (11 buckets)
list_nfs Network file systems (7 NFS)
list_lustre High-performance storage (14 Lustre)
list_snapshots VM snapshots (51 items)
list_ips Static IPs (1 item)
list_networks VPC networks (4 items)
list_workflows Installed workflows (1 item)
list_allocations Resource allocations (0 items)
list_kubernetes_clusters Kubernetes (0 items)
list_ml_workspaces ML workspaces (1 item)
get_notifications Notifications (0 items)
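
The collection step amounts to calling each tool once and recording what it returns. The sketch below illustrates that loop in general terms only. It does not use the real MCP client: call_tool is a hypothetical callable supplied by the caller, the stub and its canned responses are invented for the example, and "unknown_tool" is a made-up name used to show error handling.

```python
from typing import Any, Callable, Dict, Iterable


def collect_report(
    call_tool: Callable[[str], Any], tool_names: Iterable[str]
) -> Dict[str, Dict[str, Any]]:
    """Invoke each tool once and store its result, or its error, by tool name."""
    results: Dict[str, Dict[str, Any]] = {}
    for name in tool_names:
        try:
            results[name] = {"ok": True, "data": call_tool(name)}
        except Exception as exc:  # keep going so one failure does not lose the rest
            results[name] = {"ok": False, "error": repr(exc)}
    return results


def fake_call_tool(name: str) -> Any:
    """Hypothetical stand-in for a real client; returns canned data only."""
    canned = {
        "list_sessions": [{"status": "running"}],
        "list_ips": [{"name": "gitlabserver", "ip": "3.13.140.91"}],
        "list_kubernetes_clusters": [],
    }
    if name not in canned:
        raise KeyError(name)
    return canned[name]


report = collect_report(
    fake_call_tool,
    ["list_sessions", "list_ips", "list_kubernetes_clusters", "unknown_tool"],
)
for tool, outcome in report.items():
    print(tool, "ok" if outcome["ok"] else "failed")
```

Passing the client in as a callable keeps the loop independent of any particular transport, so the same function can be exercised with a stub, as here, or with a real client wrapper.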

Report generated March 6, 2026. All data is point-in-time and reflects the state of the RDHPCS Hybrid Cloud platform at the moment of query. Cost estimates are approximate and based on standard AWS pricing.