Meeting Notes 2016 Science - ESCOMP/CTSM GitHub Wiki

November 17, 2016

Discussion

Note that Noah-MP has urban, carbon and crop models. For the first year, we won't try to tackle them. But we could envision trying to bring in the full Noah-MP eventually.

Gordon: Ideally, rather than just having multiple options for photosynthesis, etc., we really want to decide what we want photosynthesis to look like. (But that's a longer-term effort.)

There is general agreement on the path forward: bring Noah-MP functionality into CLM, rather than splitting biogeophysics out into a shared package used by both CLM and Noah-MP (some feel that extracting all of biogeophysics into a shared library would not be feasible in a year).

Encouragement to think at least a bit about APIs for coupling with other land surface components (biogeochemistry, etc.) while we're doing the 1st year work, to facilitate work that might be needed in the future.

Name

Name: Community Terrestrial Systems Model

Design

Martyn: Fundamentally, break things down in terms of fluxes - first at a coarse level of granularity, then can break it down further.

Martyn: Would like to separate physics and numerics: e.g., have a routine that calculates fluxes through the soil, then have a separate piece of code that does the tridiagonal solve.
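
The separation Martyn describes could look like the following sketch: one routine that encodes the physics (building tridiagonal coefficients for an implicit diffusion step through soil layers) and a separate, generic Thomas-algorithm solver that knows nothing about soil. All function and variable names here are illustrative, not from any actual codebase.

```python
import numpy as np

def soil_flux_coefficients(k, dz, dt):
    """Physics only: tridiagonal coefficients (a, b, c) for an implicit
    diffusion step through soil layers.

    k  : conductivity at the n-1 layer interfaces
    dz : thicknesses of the n layers
    dt : time step
    """
    n = len(dz)
    a = np.zeros(n)  # sub-diagonal
    b = np.ones(n)   # diagonal
    c = np.zeros(n)  # super-diagonal
    for i in range(n):
        lower = k[i - 1] / dz[i] if i > 0 else 0.0
        upper = k[i] / dz[i] if i < n - 1 else 0.0
        a[i] = -dt * lower
        c[i] = -dt * upper
        b[i] = 1.0 + dt * (lower + upper)
    return a, b, c

def thomas_solve(a, b, c, d):
    """Numerics only: generic tridiagonal solve, reusable by any process."""
    n = len(d)
    cp = np.zeros(n)
    dp = np.zeros(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

The point of the split is that the solver can be swapped or reused without touching the flux physics, and vice versa.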

Rosie: In the FATES world, Brad Christopherson(?) has implemented something general in terms of a general state representation.

Tension between having something general (general and flexible state, expressed as a vector / matrix) vs. something understandable, with names that help someone understand the code....

  • Rosie: the original ED managed this by having functions to compute fluxes using named variables, but then these were packed into generic vectors for the Runge-Kutta solve... then later unpacked into named variables
    • But she abandoned that because, when the solver failed, it was very hard to figure out why
      • The point was made that partly this depends on having a lexical mapping for the various pieces of the vector / matrix - e.g., a clear attachment of names to each element
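
A minimal sketch of the pack/unpack pattern Rosie describes, with the "lexical mapping" suggested above: each slot in the solver's generic state vector keeps an attached name, so a failure can be reported in terms a scientist recognizes. The state names and bounds are hypothetical.

```python
# Hypothetical named states and their fixed positions in the solver vector.
STATE_LAYOUT = ["canopy_temp", "soil_water", "snow_depth"]

def pack(named_state):
    """Named variables -> generic vector handed to the solver."""
    return [named_state[name] for name in STATE_LAYOUT]

def unpack(vector):
    """Generic vector -> named variables after the solve."""
    return dict(zip(STATE_LAYOUT, vector))

def report_bad_elements(vector, lower, upper):
    """On solver failure, map offending vector indices back to names,
    so the error message says *which* state went bad."""
    return [STATE_LAYOUT[i] for i, v in enumerate(vector)
            if not (lower[i] <= v <= upper[i])]
```

With the mapping in place, a failed solve can report "soil_water out of range" rather than "element 1 out of range", which addresses the debuggability problem that led ED to abandon the generic-vector approach.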

More discussion of a big all-at-once solve

Gordon echoes the difficulty (noted above) in tracking down the problem when you have a big state vector / all-at-once solve.

Martyn: First, can try reducing the time step.

Martyn: But also, he has built in some automated backing off: If the full solution doesn't converge, then it automatically backs off to split solutions, where you solve pieces of the solution rather than all at once. If these still give problems, then it backs off to explicit Euler with very small time steps.
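
The back-off cascade Martyn describes could be sketched as follows. Only the control flow comes from the discussion; the three solver callables are placeholders for the real coupled solve, split solves, and explicit Euler stepper.

```python
class ConvergenceError(Exception):
    """Raised when an implicit solve fails to converge."""

def step_with_fallback(state, dt, coupled_solve, split_solve, explicit_euler,
                       n_substeps=100):
    """Try the full all-at-once solve; back off to split solves; as a last
    resort, take many small explicit Euler steps."""
    try:
        return coupled_solve(state, dt)   # full all-at-once solve
    except ConvergenceError:
        pass
    try:
        return split_solve(state, dt)     # solve pieces separately
    except ConvergenceError:
        pass
    # Last resort: explicit Euler with very small time steps.
    sub_dt = dt / n_substeps
    for _ in range(n_substeps):
        state = explicit_euler(state, sub_dt)
    return state
```

As noted below, if a genuine bug forces every step down to the explicit-Euler branch, the run still completes but takes forever, so instrumenting how often each branch fires is useful diagnostics.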

But what if there's truly a bug? Martyn: then it takes forever to run.

Rosie & Gordon: the key thing here is to point you to what part of the model has gone bad.

Dave: do you have some sort of checks of time to convergence? Because it seems like, if you weren't paying attention, you might not notice it. Martyn: can track number of iterations to convergence.

Another idea is checking for things getting outside of plausible ranges, so you can produce a sensible error message before actually crashing the code or reaching non-convergence
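
That idea might look like the sketch below: validate states against loose physical bounds and fail with a readable message before the numerics crash. The variable names and ranges are illustrative only.

```python
# Hypothetical plausible ranges for a couple of model states.
PLAUSIBLE_RANGES = {
    "soil_temperature_K": (150.0, 400.0),
    "soil_moisture_frac": (0.0, 1.0),
}

def check_plausible(states):
    """Raise a sensible, named error if any state leaves its plausible
    range, rather than letting the solver crash or fail to converge."""
    for name, value in states.items():
        lo, hi = PLAUSIBLE_RANGES[name]
        if not (lo <= value <= hi):
            raise ValueError(
                f"{name} = {value} outside plausible range [{lo}, {hi}]")
```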

Starting point

Gordon agrees that a good starting point is soil hydrology / Richards equation

Gordon: a standard problem in land surface modeling is that the fluxes depend on the Obukhov length, but the Obukhov length depends on the fluxes... this same problem is in the multi-layer model but there it's bigger because it happens for every layer. There are similar circular dependencies between temperature and conductance, etc. Again, the multi-layer canopy just accentuates the problem.
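
Circular dependencies like the flux/Obukhov-length one are typically closed with a fixed-point iteration. Below is an abstract sketch of that pattern; `flux_from_length` and `length_from_flux` stand in for the real surface-layer physics, and the return value includes the iteration count so callers can track time-to-convergence, as Martyn suggests above.

```python
def iterate_to_convergence(flux_from_length, length_from_flux,
                           L0=-50.0, tol=1e-6, max_iter=50):
    """Alternate flux and Obukhov-length updates until self-consistent.

    Returns (flux, length, n_iterations); raises if the circular
    dependency does not settle within max_iter iterations.
    """
    L = L0
    for it in range(1, max_iter + 1):
        flux = flux_from_length(L)   # fluxes depend on Obukhov length...
        L_new = length_from_flux(flux)  # ...which depends on the fluxes
        if abs(L_new - L) < tol:
            return flux, L_new, it
        L = L_new
    raise RuntimeError(f"no convergence after {max_iter} iterations")
```

In a multi-layer canopy this loop runs per layer (or as one coupled system across layers), which is why the multi-layer model accentuates the cost of the problem.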

git

There was general agreement about moving CLM / CTSM to git

Requirements (from document)

  1. Backwards compatibility: Reproduce Noah-MP and CLM capabilities (i.e., do not lose key science capabilities). Only require backwards compatibility "in spirit" - focus on capabilities rather than bit-for-bit matching;

    High priority for Martyn; others agree

  2. Enable and enhance active engagement with external collaborators (e.g., non-obtuse model design);
  3. Support capabilities to comprehensively experiment with different representations of physical processes (parameters and process parameterizations) and different spatial configurations (similarity concepts, ecosystem demography);

    Different spatial configurations: One example is the hillslopes. Another would be having the flexibility to go to one column per pft.

  4. Support model instantiations of varying complexity (e.g., with different state configurations, application of similarity concepts, etc.). Similarly, meet the requirement of short run times for some application areas (e.g., 5 seconds per model element per year), to increase model adoption/use, reduce forecast latency, and enable the use of large ensembles and extensive model analysis;

    Performance: In the CLM context, a "model element" is a PFT.

    We need to get more information on performance to pin down some concrete numbers.

    One useful comparison would be the CTSM instantiation of Noah-MP vs. the Noah-MP timings themselves, asking why they differ.

    People feel performance is a pretty important requirement - maybe met partly by being able to turn off various big parameterizations.

  5. Support process coupling at multiple levels of granularity, "tight" coupling of hydrology and thermodynamics; "loose" coupling with external models simulating ecosystem demography, groundwater/rivers, crops, etc.

    The coupling with other components will not be a priority for year 1.

    However, there could be some interaction with FATES in year 1. FATES complicates things because of its vertical structure. Because of that complexity, FATES is evolving toward computing most patch-level quantities itself.

    • In principle, FATES could pass a description of the canopy structure to the host model, and the host model could calculate the fluxes. But that's not really feasible in practice, since FATES needs to operate within various hosts, like ACME-Land, and it's not feasible for them to handle the FATES structure.
    • Gordon: Let's not get too hung up on the FATES issues, because that could be a quagmire

    Mike: We'll need to think about this in terms of integrating new models like a snow model or crop model.

    • Dave & Rosie feel those models would need to be split apart, and integrated piecemeal - rather than bolting on the whole thing.
  6. Use robust and efficient numerical solution methods, including solution methods of different complexity and granularity;

    We'll start with local solvers - so pieces of this are not high priorities for year 1

  7. Improve the model driver, including simplifying porting to multiple platforms, to enable widespread use across multiple application areas.

    This is a year 2 thing.

    This is partly related to the LILAC effort - making CTSM be callable directly from an atmosphere model.

    Martyn: But this also relates to making it easier to configure and run the model offline for some regional domain.

September 29, 2016

Martyn's initial presentation

Model design

Modular structure; separate physics and numerics

  • Flexibility in process parameterizations
    • parameters & parameterizations
    • capability to use different state configurations / subsets
  • Flexibility in spatial configurations
    • spatial variability
    • multi-scale lateral connectivity
  • Numerical solution methods

Model requirements

  • Lightweight driver / easy to port, to enable widespread use in multiple applications
  • Capability to configure model instantiations with short run times (5 sec / model element / year)
  • Comprehensive testing from unit to system test level
  • Capability for ensembles
  • Facilitate active engagement with external collaborators
    • modular design
    • governance

Some discussion

Need to figure out priorities of the different pieces. e.g., lateral connectivity and separating numerics could add a lot of time to this.

Reproduction of Noah-MP

  • Don't need all possible options in Noah-MP
  • Don't need to actually include Noah-MP code itself... more about including parameterizations that are in Noah-MP

Question of whether it's really practical to expect current Noah users to pick up the new model immediately....

  • Dave L: initial goal would be to show that we at NCAR are using this model

Mariana: Would this be developed within the CESM/CLM development effort, or separately?

Proliferation of options....

  • Within WRF, there is a "physics panel", which decides whether a new parameterization is different enough from existing parameterizations.
  • Noah-MP has been more open to new things coming in
  • There's also the backwards compatibility issue: WRF tries to maintain backwards compatibility

Dave L: Backwards compatibility issue in CLM: It can be hard to maintain full backwards compatibility, down to tiny little changes. People generally feel that it's not feasible to maintain complete backwards compatibility.

Rosie: Is there a happy medium, where you reproduce one version back, but not forever back?

Computational cost

  • General sense is that, if we could truly configure CLM to reproduce Noah-MP physics, we could likely get similar computational cost. We'd have to do some things like decrease the number of vertical layers, etc.
  • Rosie: the expensive thing in CLM is the photosynthesis solve - and that's due to the iteration around temperature, etc.

Dave L: The other big thing we could do is limit the number of PFTs per tile. Actually, it sounds like they typically run Noah-MP at super-high resolution (~ 1 km), in which case there is only one tile per cell.

Mariana: Would be good to up-front identify infrastructure work that's needed

Mariana: Need to think about what's on the surface dataset, and how this connects to internal data structures.

  • Fei: Agrees this is important: In NWP, typically specify many more things via datasets

NWP Perspective (Mike Barlage and Fei Chen)

Time scale flexibility: 1 hour to 1 second range for current applications

  • Some issues with super high frequency
    • WRF uses single precision, so evolution of lower soil layers can flat-line
    • Urban modeling: when you turn on the sun, the evolution of building temp (?) can go haywire

Coupling to atmosphere: rather than one-layer flux coupling, consider a blending height and vegetation/building protrusion

  • e.g., allowing buildings to protrude into the boundary layer led to a big improvement; required changes in WRF atmosphere model
  • This is something to think about longer-term

Somewhat related to this, Mariana points out that NOAA is interested in adopting cime, which will make plugging the unified model into NOAA systems easier. Could implicit coupling through NUOPC help with this?

Further discussion

Rosie: From her experience with the UK met office: Having the met office & climate applications linked, it was really hard to do anything novel. We have to navigate that somehow - e.g., by having different options, some more conservative than others.

Path forward

It sounds like there's a lot to do to get the Noah-MP configuration options into CLM (including everything needed to specify boundary conditions). Maybe the way to go ahead, then, is to put the primary focus on getting Noah-MP options into CLM, and just doing refactoring as appropriate as we go. (As opposed to doing a bunch of up-front refactoring.)

  • But this refactoring should keep in mind longer-term vision / goals, like moving towards greater modularity, separating fluxes from state updates, etc.

Probably include Gordon's multi-layer canopy in this initial effort.

Can we defer lateral connectivity?

Dave L: First piece may be to pull apart the photosynthesis scheme, making that more modular.

Mariana: In principle, we could pull out photosynthesis as something really modular, like FATES

  • Though Bill S disagrees with this: thinks this requires large additional effort, which is probably only justified for big things where there is a true community desire to have this pluggable into other models

Coding standards

Is it a problem to use lots of language features that people may not understand?

Mariana and Bill S's response is: We can continue to try to keep the required knowledge of these features low, not intruding into the science code.

Dave L: We can take this opportunity to lead - not go for lowest common denominator

Discussion of modularity

Mike points out: Within Noah-MP, there is a group that wants to bring a new crop model in. But it has its own photosynthesis, etc. Noah-MP people would prefer that photosynthesis be more modular, shared between parameterizations.

With respect to FATES: It now has its own photosynthesis (which is what raised Mike's question). This is partly because of the different structure in FATES (complex multi-layer canopy?), and partly because they have very little control over what happens in ACME.

Rosie: FATES

There may be a need to have tighter coupling between soil and plants.

Inputs & outputs from a parameterization

Mike points out: A lot of determining what can be modularized is looking at the required inputs & outputs from each parameterization. From that, you can determine which parameterizations can and cannot work together. He suggests having something like a pre-processor that says, "you can't use that parameterization, because the required inputs aren't available".
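
Mike's suggestion might be sketched as a small compatibility check: each parameterization declares its required inputs and provided outputs, and a configure-time pass reports any inputs that nothing provides. The parameterization names and variable names below are purely illustrative.

```python
# Hypothetical registry: what each parameterization needs and produces.
PARAMETERIZATIONS = {
    "photosynthesis_A": {"inputs": {"leaf_temp", "par"}, "outputs": {"gpp"}},
    "canopy_flux_B":    {"inputs": {"wind"},             "outputs": {"leaf_temp"}},
}

def check_configuration(selected, forcing):
    """Return the set of required inputs that neither the forcing data nor
    any selected parameterization provides; empty means compatible."""
    provided = set(forcing)
    for name in selected:
        provided |= PARAMETERIZATIONS[name]["outputs"]
    missing = set()
    for name in selected:
        missing |= PARAMETERIZATIONS[name]["inputs"] - provided
    return missing
```

A pre-processor built on this could refuse a configuration with a message like "you can't use photosynthesis_A: leaf_temp is not available", which is exactly the behavior Mike describes.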
