ET ACDM 2022 5 WG Topic 4 - wmo-im/et-acdm GitHub Wiki

Topic 4

  • Consider other GAW ACDM vs topics like
    • near-real-time data
    • ‘cloud’ infrastructure
    • Integration of mass data sources (IoT-based, non-traditional sources)
    • Model data
    • Big public (e.g. ECMWF, EEA, …) and private (e.g., IBM, Google) service providers?
    • Possible role of large national infrastructures (China, India, US) in GAW ACDM

Team B Notes

- near-real-time data
       * delivery should not be from individual sites, but from established data centers
       * WIS2 offers a new option for delivery, easier and more accessible than WIS1
       * NRT delivery should allow for multiple options (existing, WIS2, etc.)

- ‘cloud’ infrastructure
       * pressure is growing to utilize cloud services, at minimum for cloud backups. 
       * overhead involved in learning cloud-speak, workflows, etc. (upfront cost and time)
             * fundamentally different workflows and business model, requires a lot of at-cost engineering help
       * motivation is unclear: do smaller efforts really need to deploy data to the cloud, versus relying on existing access?
       * positives: scalability (very simple)
       * cloud backups provide safety from on-prem loss
       * somewhat free from org IT security 

- Integration of mass data sources (IoT-based, non-traditional sources)
       * SAG and GAW leadership should determine if such data sources are acceptable in GAW first

- Model data
       * responses concerned non-operational models (e.g., custom WRF for field campaigns)
       * there is a need for capturing smaller atmospheric composition modeling efforts (archive, DOI, etc.)

- Big public (e.g. ECMWF, EEA, …) and private (e.g., IBM, Google) service providers?
       * CDS at public providers: need for data licensing, data mirror issues? Is CDS storing data locally or pointing back to the data center? Either way, this could be opaque to CDS users.

- Possible role of large national infrastructures (China, India, US) in GAW ACDM
       * SAG and GAW leadership issue?

Team A discussion about Topic 4

NDACC has rapid delivery of metadata but does not want the data to be copied. NDACC data would be pushed to WIS 2.0, giving data users a single point for pulling data. Which metrics and metadata attributes must WIS 2.0 have? WIS 2.0 has yet to develop these metrics. NDACC NRT data was required by ESA. NRT data would be important for the GHG monitoring initiative. This would need measurement protocols or regional processing centres; ICOS can serve as an example. The WDCs have a different focus and no capacity for NRT. At WDCGG, NRT data receive no QC/QA and are not archived, just streamed to users. ICOS could do NRT for GHG, but it would be difficult to change its way of working. What would be added to the NRT data: a version number? Start with the existing facilities (NOAA, ICOS or WDCGG) and engage with local communities in regions where capacity building is needed.

'Cloud' infrastructure is unavoidable; everything is moving into the cloud. Public recommended. Security issues and data control need to be properly addressed. The architecture becomes more flexible, more modular, and more easily transferable if a given center has to give up a service. There are reliability issues with non-commercial clouds, but mitigation mechanisms exist. Perhaps a WMO cloud would be an idea, with the WDCs dealing with their own portions; it can be outsourced. Recommendation: be cloud-ready (e.g., Docker). WDCGG does not use the cloud; it is planning to change, or changed 2 years ago, its system, and that system is not cloud-based. Whether becoming cloud-ready brings advantages for contributing networks needs to be discussed: what is the gain for them? Contributing networks should decide independently. JMA is developing a parallel computer to assimilate more satellite data in 2 years. Service continuity is a motivation for efforts to become cloud-ready, with cloud readiness as part of a contingency plan.

Recommended that all WDCs be open source. NASA supports user registration. Users should be able to look at data or metadata without needing to log in. Without visibility, potential users are lost.

Draft for IP:

WDCs to document their service continuity contingency plans and carry out a peer review process by 2027. Recommendation for all WDCs to become at least cloud-ready; recommendation to DCs of contributing networks to evaluate the benefit to their operations of becoming cloud-ready.

ET-ACDM to work with SC-IMT and RB SSC EPAC to establish requirements for including crowd-sourced and/or IoT-based atmospheric composition data in the portfolio of data management and archiving of WDCs.

ET-ACDM to work with SC-IMT and RB SSC EPAC to establish cooperation with national centers of large WMO Members (specifically, China, India, Indonesia) and/or under-represented Regions (specifically, Africa) to improve access and exchange of atmospheric composition data.

ET-ACDM to promote and advise on the use of WIS2.0 for rapid delivery of atmospheric composition data, in particular GHG, ozone and aerosol in support of service providers such as the WMO GHG Initiative, CAMS, GURME, NCEP re-analysis, JMA, GMAO, etc.