Coupling and modularity - fabm-model/fabm GitHub Wiki

Introduction

While FABM provides interfaces to specify variables, parameters and source terms, it is the modeller, not FABM, who decides which processes or variables are represented, what units they use, and how the underlying code is structured. Nevertheless, some recurrent, useful patterns for structuring ("modularizing") models have emerged over time. This guide describes these patterns and the underlying FABM functionality that they use.

Common ingredients of FABM-based models

Models, instances and modules

FABM emphasizes runtime configurability of biogeochemical/ecological models. Its runtime configuration file, fabm.yaml, not only specifies the values of parameters, but also which model components are to be activated. That list may be short, for example NPZD and carbonate system, or, in more modular models, very long: 4 phytoplankton types, 2 microzooplankton types, 1 mesozooplankton type, ... The user specifies the list of components to activate as well as the connections between them. In FABM, each active component is called an instance. In fabm.yaml, each instance is given a name, a pointer to the code that contains its actual implementation (i.e., performs all calculations), and any options that the code declares as user-configurable, such as parameter values. For example, the section in ERSEM's fabm.yaml that adds an instance "P1" begins with:

P1:
  long_name: diatoms
  model: ersem/primary_producer
  parameters:
    sum: 1.375

Here, model: ersem/primary_producer points FABM to the code (specifically, the Fortran derived type) that implements the behaviour appropriate for the diatom instance. In this case, this is a generic primary producer code, flexible enough to describe any type of phytoplankton. Accordingly, additional phytoplankton instances (P2, P3, P4) introduced later in this configuration file use the same ersem/primary_producer implementation. This illustrates that multiple instances can use the same code, though they will typically set different parameter values. In the sections below, we use the term module to refer to the implementation of model instances in source code.

Variables

The central components of a FABM module are its variables: a model’s own state variables and diagnostics, and any external dependencies that describe the model environment:

  • State variables, sometimes called "prognostic variables", are initialized at the start of the simulation and are changed by providing sources and sinks (instantaneous rates of change). In the case of pelagic (3D) state variables, they additionally change due to transport (advection, diffusion) calculated by the hydrodynamic model that FABM is embedded in. The current value of a state variable is determined by its initial value and its entire history of sources, sinks, and transport processes.
  • Diagnostic variables can be calculated at any time from the model state and environment.
  • Dependencies are variables that need to be provided by an outside component, e.g., temperature from the hydrodynamic model, or pH from a carbonate system module.

A variable is linked to a specific domain upon its declaration: the “interior” (pelagic), the water surface, or the bottom. More details about the types of variables and their handling are available here.

Routines

The routines that calculate source terms and diagnostics in a FABM-based model are specific to a single domain: do operates over the pelagic, do_bottom over the bottom, do_surface over the water surface. Each of these routines may operate on slices of the spatial domain, since they contain placeholders (preprocessor macros) for spatial loops. However, they are not aware of the dimensionality of the host model (0D, 1D, 2D, 3D), the interpretation of different dimensions (e.g. depth vs. horizontal), their ordering (e.g., surface-to-bottom or bottom-to-surface), or land-sea masking. This is intentional: biogeochemical logic can then be written as local (grid-cell-specific) operations, and the host model and FABM can internally optimize spatial looping.

Where grid-aware operations are needed, they typically involve vertical loops, for example, to move stepwise through the water column from the surface to the bottom to calculate the light field. For these, FABM supports one additional routine: do_column. This processes a single pelagic column at a time, with explicit control over the direction (surface-to-bottom or bottom-to-surface). It should be noted that the do_column routine is often more computationally expensive than do; its use should therefore be limited to cases where iteration over the vertical is truly essential.
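To make this concrete, the kind of calculation that belongs in do_column can be sketched in a few lines of plain Python (purely illustrative; this is not FABM's Fortran API, and all names are made up): stepping downward through one column while attenuating light.

```python
import math

def light_in_column(par_surface, layer_thickness, attenuation):
    """Return light at each layer centre, working from the surface down.

    par_surface: light just below the surface
    layer_thickness: thickness of each layer, ordered surface to bottom
    attenuation: attenuation coefficient of each layer
    """
    par = []
    above = par_surface
    for dz, k in zip(layer_thickness, attenuation):
        par.append(above * math.exp(-0.5 * k * dz))  # value at layer centre
        above *= math.exp(-k * dz)                   # value at layer bottom
    return par
```

Because each layer needs the light leaving the layer above it, the loop direction matters; this is exactly the surface-to-bottom control that do_column provides.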

Source terms and surface/bottom fluxes

FABM-based models change their state variables by providing sources: the instantaneous rate of change of the variable due to the biogeochemical/ecological processes described by the model. In FABM, these source terms can be positive (the variable increases) or negative (the variable decreases), so in practice they are equivalent to the sources-minus-sinks tracked in e.g. NEMO.

For interior (pelagic) state variables, FABM-based models can additionally prescribe surface and bottom fluxes: the instantaneous flux of the variable across the surface and bottom interfaces of the water column. These are positive for inward fluxes (i.e., fluxes that increase the pelagic variable).
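As a sketch of this sign convention (plain Python; update_top_layer is a hypothetical helper, not part of FABM or any host model), a host could fold an interior source and a surface flux into the update of the top pelagic layer like this:

```python
def update_top_layer(c, source, surface_flux, dz, dt):
    """One explicit Euler step for a tracer in the surface layer.

    c: concentration; source: interior rate of change (per volume);
    surface_flux: flux across the air-sea interface, positive into
    the water; dz: layer thickness; dt: time step.
    """
    return c + dt * (source + surface_flux / dz)
```

A positive surface_flux (e.g., oxygen invading from the atmosphere) thus increases the pelagic concentration, in line with FABM's convention.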

Aggregate variables

FABM-based models can register any variable they like, in whatever unit they like. However, to facilitate coupling between models, and to allow FABM to verify conservation of mass (e.g., totals of chemical elements), FABM asks models to declare how their variables contribute to standard variables, e.g., total carbon, total nitrogen. These contributions are registered during model initialization by calling add_to_aggregate_variable. For example, the nitrogen-based NPZD example would use

call self%add_to_aggregate_variable(standard_variables%total_nitrogen, self%id_p)

in its phytoplankton module.

In models that track multiple chemical elements, this routine would be called multiple times for a biomass pool with constant stoichiometry. For example,

call self%add_to_aggregate_variable(standard_variables%total_nitrogen, self%id_p)
call self%add_to_aggregate_variable(standard_variables%total_phosphorus, self%id_p, scale_factor=1.0_rk/16.0_rk)
call self%add_to_aggregate_variable(standard_variables%total_carbon, self%id_p, scale_factor=106.0_rk/16.0_rk)

This would be appropriate for a phytoplankton pool represented in mmol N m⁻³, and a carbon : nitrogen : phosphorus elemental ratio of 106 : 16 : 1. Note that the unit of standard variables is defined as part of the standard variable and therefore fixed (typically mmol m⁻³); models that internally use different units must use the scale_factor argument to convert to standard variable units.
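The bookkeeping that these registrations set up can be sketched in plain Python (the dictionary layout and the aggregate function are illustrative only, not FABM internals): each contribution pairs a state variable with a scale factor that converts its native unit to the standard unit.

```python
# Contributions of the phytoplankton pool "p" (mmol N m-3) to the
# standard totals, using the 106:16:1 C:N:P ratio from the text:
contributions = {
    "total_nitrogen":   [("p", 1.0)],
    "total_phosphorus": [("p", 1.0 / 16.0)],
    "total_carbon":     [("p", 106.0 / 16.0)],
}

def aggregate(state, contributions):
    """Sum each conserved quantity (in standard units, mmol m-3)."""
    return {name: sum(state[var] * factor for var, factor in terms)
            for name, terms in contributions.items()}
```

For a pool of 16 mmol N m⁻³ this yields 16 mmol N, 1 mmol P and 106 mmol C per m³.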

A modularisation example

Every FABM module contains an initialize routine that registers its state variables, diagnostics and dependencies. In the simple Nutrient-Phytoplankton-Zooplankton-Detritus (NPZD) model shown in Figure 2, this routine calls FABM’s register_state_variable routine four times:

call self%register_state_variable(self%id_n, 'n', 'mmol m-3', 'nutrients')
call self%register_state_variable(self%id_p, 'p', 'mmol m-3', 'phytoplankton')
call self%register_state_variable(self%id_z, 'z', 'mmol m-3', 'zooplankton')
call self%register_state_variable(self%id_d, 'd', 'mmol m-3', 'detritus')

Figure 2. The state variables and source terms of a simple NPZD model. In practice, such a model will also contain environmental dependencies, diagnostics and parameters. These are not shown here.

The source terms for each of these four state variables must be provided by the developer in an accompanying do subroutine, which processes the pelagic point by point. While doing so, it has read access to all registered state variables and dependencies, increment access to the state variables' source terms, and write access to any registered diagnostics. Note that in FABM, source terms are accumulated: they are reset to 0 by FABM when the host (hydrodynamic) model asks for the source terms and then incremented by all model instances that are active.
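The accumulation convention can be illustrated with a minimal Python sketch (the instance "modules" are plain functions with made-up rate constants; this is not FABM's API): the host zeroes the sources, then every active instance increments them.

```python
def phytoplankton_sources(state, sources):
    growth = 0.5 * state["p"]      # hypothetical nutrient uptake rate
    sources["n"] -= growth
    sources["p"] += growth

def zooplankton_sources(state, sources):
    grazing = 0.1 * state["p"]     # hypothetical grazing rate
    sources["p"] -= grazing
    sources["z"] += grazing

def collect_sources(state, instances):
    sources = {name: 0.0 for name in state}   # host resets to zero...
    for instance in instances:
        instance(state, sources)              # ...instances increment
    return sources
```

Because both instances increment sources["p"] rather than setting it, the net source of phytoplankton is growth minus grazing.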

If we were to modularise the NPZD model, that is, to split it into multiple components coded in different source files, one intuitive approach would be to separate its four constituents: nutrients, phytoplankton, zooplankton, detritus. Each then gets its own source file, its own initialize routine, and its own do routine to specify pelagic source terms. The result could look like Figure 3. Source calculations are now distributed over three of the four modules (p, z, d). This also shows why in FABM, each module increments its source terms instead of setting them; this ensures the total sources for each of the four state variables accumulate to ultimately equal the equations shown in Figure 2.


Figure 3. The first iteration of a modular NPZD model. Each coloured box represents a separate module, coded in a stand-alone source file. Cylinders represent state variables. Those with solid borders and bold font are “owned” by the module itself, those with dashed borders are state dependencies. Arrows represent the coupling links made at runtime based on the specification in fabm.yaml. For example, the nutrient state dependency of the "phytoplankton" instance is fulfilled by coupling to the nutrient state variable in the "nutrient" instance.

Each module now makes just one call to register_state_variable: the nutrient module registers "n", the phytoplankton module registers "p", the zooplankton module registers "z" and the detritus module registers "d". Thus, the final combined model still has four state variables (tracers). However, the modules compute source terms that affect not just their own state variable, but several others as well. For example, in the phytoplankton module, the difference between photosynthesis and respiration comes at the expense of the nutrients; the loss of phytoplankton due to mortality increases detritus. To represent this, FABM has the concept of state variable dependencies: state variables that are not “owned” by the module itself but must come from another active model instance. For example, in addition to its own state variable, the phytoplankton module registers two state dependencies by calling register_state_dependency:

call self%register_state_variable(self%id_p, 'p', 'mmol m-3', 'phytoplankton')
call self%register_state_dependency(self%id_n, 'n', 'mmol m-3', 'nutrients')
call self%register_state_dependency(self%id_d, 'd', 'mmol m-3', 'detritus')

It is then free to increment source terms for each of these three state variables.

There is no direct link (e.g., a Fortran use statement) from the phytoplankton source code to the nutrient and detritus source codes. Instead, the phytoplankton module is a stand-alone code that registers its dependencies on nutrients and detritus, but then leaves these to be resolved at runtime. In FABM, the specification of which modules to activate, and how they are to be coupled, is deferred until runtime, when the fabm.yaml configuration file is read; there is no global compile-time register of active modules and available variables. One benefit of this is that each of the four modules is self-contained and can be compiled independently (if desired, all four can be compiled in parallel). It also means that the conversion of the original NPZD code to N, P, Z, and D components can be done in a straightforward manner:

  1. create four copies of the original code
  2. replace register_state_variable by register_state_dependency for all but the “owned” state variable
  3. drop redundant variables and source terms
  4. rename externally visible (public) Fortran modules and types (e.g., npzd -> p).

How should we allocate the individual source terms to the new stand-alone modules? For example, why is the change in detritus due to plankton mortality (mortP and mortZ) specified in the phytoplankton and zooplankton modules, rather than in the detritus module? Part of the answer is that we want to compute each process just once, as the underlying calculation may be complex and computationally expensive; we then use the result of the calculation in as few places (modules) as possible to minimize code complexity. However, this rule of thumb would still permit mortalities mortP and mortZ to be calculated in the detritus module, and their impact on p, z and d to be represented there. The argument against doing this is that it would make the detritus module dependent on the presence of the phytoplankton and zooplankton modules. Similarly, if we were to move the calculation of the grazing rate to the phytoplankton module, it would become dependent on the presence of the zooplankton module. The current partitioning of source terms has the appealing feature that it is possible to build mini-ecosystems with a subset of modules without any code changes: it is perfectly feasible to compose an NPD model where zooplankton has been removed from fabm.yaml, or even an ND model that describes remineralization of detritus only (although in the absence of a detritus source, its behaviour will not be very interesting) – all without changes to the source code. In practice, we have found that creating one module per “integral physical entity” (a free chemical compound such as dissolved molecular oxygen or nitrate, a generalized compound such as a class of particulate and dissolved organic matter, or a plankton functional type) provides a degree of modularity that is both useful and intuitive. Nevertheless, it remains perfectly possible to write monolithic modules that describe larger parts of an ecosystem (e.g., the complete NPZD example) or mini-modules that describe just an individual process.

The above is sufficient to deliver a modular NPZD model. However, the individual components in Fig. 3 still clearly expect to be embedded within the original NPZD context. This is most obvious in their naming of dependencies. For example, zooplankton still refers to "phytoplankton" as its prey, even though it could in principle consume any type of prey: it might be coupled to detritus as "prey" to make it a detritivore. Another loose end is the naming of the "owned" state variables. Since the instance name already describes the component (nutrient, phytoplankton, etc.), there is no need for the internal state variable to repeat that name; in output, variables are named <instance>_<variable>, so repetition would lead to redundant names, e.g., short names such as "p_p" and long names such as "phytoplankton_phytoplankton". Therefore, the final stage in modularizing a FABM-based model is often the renaming of variables and dependencies to (1) indicate the newly acquired flexibility, and (2) avoid repetition within variable names. The result of this step is shown for the NPZD model in Fig. 4. The changed variable names reflect that the model has in essence become a collection of four generic building blocks, which can be put together in completely new configurations. For example, it is now feasible to add a second zooplankton instance that feeds on detritus instead of phytoplankton.


Figure 4. The final iteration of the modularized NPZD model, in which variables and dependencies have been renamed to emphasize the flexibility of the four building blocks.

Conservation checks

One of the few overarching principles in the development of biogeochemical models is that they should conserve mass. This makes mass conservation checks one of the most valuable tools in the toolbox of biogeochemical model developers. These checks become even more powerful when models are modularised. Mass conservation is often verified by summing the total of each represented chemical element (e.g., nitrogen) across all biogeochemical state variables, and ascertaining whether this total stays the same (or if allowing for numerical inaccuracies and drift: “nearly the same”) over time within a closed model domain.

An equivalent mass conservation check can be made at the level of source terms in the biogeochemical model: if we sum the contributions of all source terms to the total change of a chemical element, this sum should be numerically indistinguishable from 0 (in practice: less than 10⁻¹⁵ times the largest source term, if using double precision). For example, summing all source terms in the nitrogen-based NPZD model (Fig 2) shows that the change in total nitrogen (n+p+z+d) equals 0. Such checks on the rate of change are generally more precise, and therefore more informative, than checks on the model state, because unlike the latter:

  1. they are not influenced by external sinks and sources (e.g., open boundaries, rivers, precipitation)
  2. they are not influenced by inaccuracies or conservation issues in numerical schemes for transport and time integration, or any clipping that these apply
  3. any errors do not accumulate over time
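As a minimal numerical illustration of such a check (plain Python; the NPZD-style rates are made up), the nitrogen source terms of a closed model should sum to a value indistinguishable from zero:

```python
# Made-up instantaneous rates, all in the same nitrogen unit:
uptake, grazing, mort_p, mort_z, remin = 0.4, 0.25, 0.05, 0.03, 0.1
sources = {
    "n": -uptake + remin,
    "p": uptake - grazing - mort_p,
    "z": grazing - mort_z,
    "d": mort_p + mort_z - remin,
}
total_change = sum(sources.values())
# A generous double-precision tolerance relative to the largest term:
assert abs(total_change) < 1e-14 * max(abs(v) for v in sources.values())
```

Every process appears once as a gain and once as a loss, so the total change in nitrogen cancels term by term.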

Mass conservation applies to all sources combined (e.g., Fig. 2), but also at the level of individual model instances: the sum of all sources of a chemical element within a single instance should also equal 0. It is easily verified in Figs 3 and 4 that the sum of all source terms is indeed zero within each of the four instances. Such checks are very useful in complex models with tens to hundreds of source terms. If these models show a gap in the mass balance, it is often tedious and time-consuming to determine which of the many source terms is at fault. However, if such a model is split over tens of model instances, mass conservation of sources can be checked on each of these instances individually. The problem then reduces to first identifying the offending module (e.g., "it is due to the pelagic source terms computed in the diatom module"), and then, within that, the offending term. This is a much quicker procedure.

Fortunately, FABM makes such granular mass conservation checking easy: if you add a single line with check_conservation: true at the top of fabm.yaml, FABM will calculate the total change in all known conserved quantities for every active module, e.g., the change in total nitrogen due to phytoplankton, the change in total nitrogen due to zooplankton, etc. This is done separately for the pelagic (i.e., sources computed from the do routine), the bottom (do_bottom routine) and the surface (do_surface routine). Each of the resulting sums is available as a diagnostic for output, with names of the form <instance_name>_change_in_<standard_variable_name><domain>, with <domain> being empty (for the pelagic), “_at_surface”, or “_at_bottom”. This greatly increases the number of metrics available for checking model validity.

For FABM to generate appropriate conservation diagnostics, it needs to be aware of the link between model variables (e.g., "phytoplankton") and conserved quantities (e.g., "total nitrogen"). These links are registered as part of model initialization. Finally, it should be noted that the generation of source conservation diagnostics is computationally costly. First, it requires FABM to trap each source term before it is added to the accumulated per-variable sources. Second, it results in the calculation of large numbers of sums (number of model instances × number of conserved quantities × 3 for interior, bottom and surface). Thus, check_conservation may be set to true during testing, but it should be false (or absent) in production/operational simulations.
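For example, the top of the fabm.yaml from the NPZD example would then begin with (only the first instance shown):

```yaml
check_conservation: true     # enable per-instance conservation diagnostics
instances:
  n:
    model: examples/npzd/n
```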

Coupling specification

As mentioned above, the coupling between modules is deferred until runtime. Specifically, the links between modules are set in the "coupling" sections of FABM's runtime configuration file, fabm.yaml. For the NPZD example from Fig. 4, that would look like:

instances:
  n:
    model: examples/npzd/n
  p:
    model: examples/npzd/p
    coupling:
      nutrient_source: n/c
      respiration_target: n/c
      mortality_target: d/c
  z:
    model: examples/npzd/z
    coupling:
      prey: p/c
      respiration_target: n/c
      mortality_target: d/c
  d:
    model: examples/npzd/d
    coupling:
      mineralization_target: n/c

FABM also supports an implicit form of coupling through the use of “standard variable” identities. If one module specifies that it depends on a standard variable named "x" and another module registers one of its state or diagnostic variables with the same identity “x”, FABM will couple them automatically. In this case, no entries in fabm.yaml are necessary.

Particles: grouped couplings and generalized access

The coupling functionality described above is suitable for many purposes, but it suffers from two problems:

  1. All variable links between models have to be specified explicitly in the FABM configuration file, fabm.yaml. For more complex models, e.g., those with many functional types and multiple elemental cycles (e.g., Fig 5), this can quickly result in coupling statements dominating the configuration file altogether. For example, ERSEM’s mesozooplankton couples to 9 prey types, and for each it needs access to 5 constituents (C, N, P, Si, CaCO3). Thus, its coupling specification in fabm.yaml would require 9×5=45 coupling statements, each on its own line. This becomes difficult to manage and error-prone.
  2. Variable links can only be made if the supplying and receiving model both use the same units. This is fine when components are all designed and implemented by the same institute/author, but problematic when they come from different institutes/authors. That is because different authors often prefer different units (e.g., ERSEM uses mg C, mmol N, mmol P where PISCES uses mol C throughout). This would hinder interoperability of model components.

Figure 5. Coupling in more complex models with multiple functional types and multiple chemical elements. Rectangles with rounded corners represent model instances, cylinders represent state variables, and plain rectangles represent read-only dependencies. Variables with solid borders are “owned” by the instance that contains them; those with dashed borders are dependencies. Simple variable dependencies (e.g., phosphate in diatoms, temperature in diatoms and mesozooplankton) are resolved as before. Other dependencies are grouped. For example, the diatom module registers dependencies on the individual components of detritus (where waste is to be deposited). These components are grouped together by a single dependency on a model instance named “detritus”. As a result, the user only needs to specify a target for “detritus” (e.g., to point it to the medium-size particulate organic matter pool). Connections to its individual C, N, P, Si, Fe constituents are made automatically. Similarly, mesozooplankton accepts multiple prey, each quantified by several constituents (C, N, etc.). For each prey, these are grouped via a dependency on a single model instance “prey1”, “prey2”, etc. These high-level couplings (prey1: diatoms) are the only ones the user needs to specify in fabm.yaml.

To address these issues, FABM introduces the concept of a particle model, representing a coherent physical entity. It is implemented in type_particle_model, defined in the fabm_particle module. Models that inherit from this type acquire the ability to couple to entire model instances by name, and to automatically set up links to variables from such a coupled instance. For example, ERSEM’s mesozooplankton couples to 9 prey instances (set in fabm.yaml), and from each of those requests (in code) its total carbon, nitrogen, phosphorus, silicon and calcite. To make this possible, a new section was added at the end of the mesozooplankton initialize routine:

do iprey = 1, self%nprey
   write (index,'(i0)') iprey
   call self%register_model_dependency(self%id_prey(iprey), 'prey' // trim(index))
   call self%request_coupling_to_model(self%id_preyc(iprey), self%id_prey(iprey), standard_variables%total_carbon)
   call self%request_coupling_to_model(self%id_preyn(iprey), self%id_prey(iprey), standard_variables%total_nitrogen)
   call self%request_coupling_to_model(self%id_preyp(iprey), self%id_prey(iprey), standard_variables%total_phosphorus)
   call self%request_coupling_to_model(self%id_preys(iprey), self%id_prey(iprey), standard_variables%total_silicate)
   call self%request_coupling_to_model(self%id_preyl(iprey), self%id_prey(iprey), total_calcite_in_biota)
end do

The APIs register_model_dependency and request_coupling_to_model were introduced with type_particle_model. register_model_dependency registers a dependency on an entire model instance, identified by name and to be coupled at runtime in fabm.yaml. For instance, in the above, the mesozooplankton module registers a dependency on "prey1", "prey2", etc. Each of these is a model instance (e.g. "diatoms"), i.e., not a single variable but a collection of variables. In the next step, request_coupling_to_model is called to request specific variables from the coupled instance (e.g., “diatom carbon”). These requests use standard variables (e.g., total_carbon), which allows them to work independently of implementation details (e.g., variable names and units) of the coupled instance. In the above example, totals of different chemical elements (total_carbon, etc.) from each prey instance are linked to previously registered dependencies (self%id_preyc, etc.). Thus, the above lines supplement but do not replace the variable-specific registration commands in mesozooplankton’s initialize routine. The lines are added to group couplings together (the user can couple prey1 in one go in fabm.yaml, instead of micromanaging links to prey1c, prey1n, etc.) and to link them based on standard identities (total_carbon, etc.) with known units.

In the example above, total contained carbon, nitrogen, phosphorus, silicon and calcite are obtained for every prey type. The standard variables used for this purpose, e.g., total_carbon, have fixed units, typically mmol m⁻³. FABM guarantees that the requesting model (mesozooplankton) will receive the requested variable in these fixed units. This is non-trivial, as a few examples show:

  • Many prey types do not have silicon or calcite. This is known to FABM, as the prey instance (e.g., microzooplankton) then does not call add_to_aggregate_variable for total_silicate or total_calcite_in_biota. In response, FABM couples mesozooplankton’s corresponding id_preys(iprey) and/or id_preyl(iprey) to a field filled with zeros.
  • ERSEM's functional types express carbon in mg C m⁻³, not mmol C m⁻³ implied by the total_carbon standard variable. This is known to FABM, as the prey instance (e.g., microzooplankton) calls add_to_aggregate_variable for total_carbon with a scale factor of 1/12.011 (12.011 representing the atomic mass of carbon). In response, FABM creates a temporary variable for prey in mmol C m⁻³ in the background. FABM calculates the value of this new variable on demand by dividing prey carbon in mg C m⁻³ by 12.011 and providing the result to mesozooplankton's id_preyc(iprey).
  • If a prey contains multiple state variables that contribute to total carbon, FABM calculates the sum of these variables in mmol C m⁻³ on demand and provides that to mesozooplankton.

Thanks to such automatic unit conversions, the predator does not need to be aware which elements the prey does or does not contain. Correct links between predator and prey are made even if they internally operate in different units.
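The resolution logic can be sketched in plain Python (names and data layout are illustrative, not FABM internals): each prey lists its contributions to each standard quantity as (value, scale factor) pairs, and quantities a prey never registered resolve to zero.

```python
def resolve_total(prey, quantity):
    """Total of a standard quantity in standard units (mmol m-3);
    zero if the prey registered no contribution at all."""
    return sum(value * factor for value, factor in prey.get(quantity, []))

# ERSEM-style prey carrying carbon in mg C m-3, hence the 1/12.011 factor:
diatoms = {"total_carbon": [(120.11, 1.0 / 12.011)],
           "total_silicate": [(2.0, 1.0)]}
microzoo = {"total_carbon": [(24.022, 1.0 / 12.011)]}  # no silicon
```

The predator always receives mmol m⁻³, and a silicon request to the silicon-free prey simply yields zero.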

The particle-specific coupling described above allows the requesting model to always see (i.e., read) all desired prey properties in predictable units. However, this functionality is not sufficient to apply grazing losses to prey. Predators generally calculate a specific loss rate (e.g., in d⁻¹) that applies to prey biomass as a whole, i.e., to all its constituents. If that prey is defined in carbon units only, with fixed C:N and C:P, that specific loss rate should apply only to its carbon state variable. If the prey has variable stoichiometry with explicit state variables for C, N and P, the specific loss rate should apply to all three state variables. In both cases, the predator sees non-zero values for prey carbon, nitrogen and phosphorus – it cannot tell the prey types apart. How can it apply the correct loss rates without knowing the internal structure of the coupled prey instance? The current solution to this is to have the predator loop over all state variables of the prey, without knowing their identity, and to then apply the same specific loss rate to each. For example,

do iprey = 1, self%nprey
   do istate = 1, size(self%id_prey(iprey)%state)
      _GET_(self%id_prey(iprey)%state(istate), preyP)
      _ADD_SOURCE_(self%id_prey(iprey)%state(istate), -sprey(iprey)*preyP)
   end do
end do

Here, sprey(iprey) is the previously calculated specific loss rate, different for each prey type. An added bonus of this approach is that any non-tracked state variables of the prey are destroyed along with all others, since they are included in the state array in the example. In ERSEM, that means that chlorophyll of phytoplankton functional types is (correctly) destroyed by predation, even though the predator code does not handle chlorophyll explicitly.
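The logic of this loop can be mirrored in plain Python (illustrative only; apply_grazing is a made-up helper): one specific rate is applied to every constituent of the prey, whatever those constituents represent.

```python
def apply_grazing(prey_state, specific_rate, sources):
    """Decrement the source of every prey constituent by rate * value."""
    for name, value in prey_state.items():
        sources[name] = sources.get(name, 0.0) - specific_rate * value
```

A carbon-only prey thus loses only carbon, while a prey with explicit C, N, P and chlorophyll state variables loses all of them at the same relative rate.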

Child models and mapping across domains

Even for a module for a single physical particle (e.g., a phytoplankton or zooplankton functional type), it can be convenient to split calculations over different routines. In some cases, it is preferable or even unavoidable to place these routines into separate modules. For example: ERSEM’s mesozooplankton switches between active and overwintering behaviour based on the quantity of depth-integrated prey. All its source terms and diagnostics are pelagic and thus implemented in a do routine. For the overwintering behaviour, calculations in this routine have a dependency on the depth integral of the sum of all its 9 prey types. This dependency is registered at module (mesozooplankton) level. In FABM, calculation of depth integrals must be done in a do_column routine; summation of pelagic variables is best done in a do routine. It is worth recalling that the do routine cannot do the job on its own, as it performs local operations in the pelagic without being aware of different spatial dimensions and their interpretation; moreover, it typically only sees a fraction of the pelagic at a time, and this fraction generally does not encompass the whole vertical. Thus, a combination of do and do_column is unavoidable. The ideal processing order would be:

  1. a do routine calculates the sum over all prey;
  2. a do_column routine depth-integrates this sum;
  3. a do routine uses the depth integral to determine whether to activate overwintering; based on that, it calculates all source terms and diagnostics.
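The three steps can be sketched in plain Python (all names and the overwintering threshold are made up; in FABM, steps 1 and 3 would live in do routines and step 2 in do_column):

```python
def total_prey(prey_fields):
    """Step 1 (do): per-cell sum over all prey fields."""
    return [sum(values) for values in zip(*prey_fields)]

def depth_integral(field, thickness):
    """Step 2 (do_column): integrate one column over depth."""
    return sum(v * dz for v, dz in zip(field, thickness))

def overwintering(integrated_prey, threshold=50.0):
    """Step 3 (do): switch behaviour on the depth-integrated total."""
    return integrated_prey < threshold
```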

This three-step logic cannot be implemented in a single mesozooplankton type, since that only has a single do routine. The solution to this issue is to place the calculation of the sum and of the depth integral in separate modules, created as children of the mesozooplankton module (i.e., they are created and configured in code, not by the user through fabm.yaml). To support this, FABM allows modules to allocate and configure model instances of any type and to add them by calling FABM’s built-in add_child subroutine. FABM already has built-in modules for summation and depth integration. Therefore, a simple implementation in mesozooplankton’s initialize could look like:

use fabm_builtin_sum, only: type_weighted_sum
use fabm_builtin_depth_integral, only: type_depth_integral

class (type_weighted_sum), pointer :: totprey_summation
class (type_depth_integral), pointer :: totprey_integrator

! Set up the sum over all prey types
allocate(totprey_summation)
do iprey = 1, self%nprey
   write (index,'(i0)') iprey
   call totprey_summation%add_component('../prey' // trim(index) // 'c')
end do
call self%add_child(totprey_summation, 'totprey_summation')

! Set up the depth integral
allocate(totprey_integrator)
call self%add_child(totprey_integrator, 'totprey_integrator')
call totprey_integrator%request_coupling('source', '../totprey_summation/result')

! Couple our dependency on depth-integrated prey to the child model calculating it
call self%register_dependency(self%id_intprey, 'intprey', 'mg C m-2', 'depth-integrated prey')
call self%request_coupling(self%id_intprey, '../totprey_integrator/result')

This creates two children of mesozooplankton: totprey_summation calculates the sum of all prey throughout the pelagic, and totprey_integrator subsequently calculates its depth integral. The terms to include in the summation are set by calling add_component, a routine specific to type_weighted_sum: each call sets up a dependency on one term and couples it to the named variable passed as argument (here '../prey' // trim(index) // 'c'). The field to integrate is set by coupling the "source" of the integrator to the "result" of the sum with a call to request_coupling. A final call to request_coupling links the dependency on depth-integrated prey to the result of the depth integration.

These operations happen under the hood: they are neither visible to the user of your model nor configurable in fabm.yaml, since the responsible model instances are created and added (with add_child) and coupled (with request_coupling) automatically.

You may note that the above design only works if FABM respects the intended call order: summation, then depth integration, then mesozooplankton source terms. It does: FABM infers from the dependencies between the three modules which must be called first, and guarantees the correct order.
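The order-inference step can be illustrated with a topological sort over declared dependencies. This is a conceptual sketch of the mechanism, not FABM's actual scheduler; the module names follow the example above.

```python
from graphlib import TopologicalSorter

# Each module lists the modules whose outputs it reads. A driver can
# derive a safe call order from these couplings alone.
dependencies = {
    "totprey_summation": set(),                   # reads only prey state variables
    "totprey_integrator": {"totprey_summation"},  # reads the summation result
    "mesozooplankton": {"totprey_integrator"},    # reads the depth integral
}

# static_order() places every module after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```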

Your diagnostic, my state variable

The standard rules for coupling variables between modules are straightforward: models can either register a plain dependency (access: read-only) that is fulfilled by coupling to a state variable or diagnostic from another model, or they can register a state variable dependency (access: read value, increment sources) that must be fulfilled by coupling to a state variable from another model (e.g., Fig 3). FABM enforces this: you cannot couple a state variable dependency to a diagnostic variable, as there is then no destination for the source terms associated with the state variable dependency.

However, in some scenarios it is helpful to relax these rules. For example, many biogeochemical models calculate the change in alkalinity that results from their uptake or exudation of dissolved compounds; they therefore register a state variable dependency on total alkalinity. Still, someone may want to use these model components with a simple carbonate system in which alkalinity is parameterized as a function of salinity: a diagnostic rather than a state variable. The simple carbonate module can prepare for this scenario by explicitly stating that other modules may try to provide source terms for its alkalinity variable. This is done by registering the variable with argument act_as_state_variable=.true. FABM will then allow another module (e.g., phytoplankton) with a state variable dependency on alkalinity to couple to this diagnostic. Any source terms (or surface/bottom fluxes) that it provides for alkalinity will simply be discarded.
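A toy registry can illustrate the relaxed rule: a state-variable dependency may couple to a diagnostic only when that diagnostic is flagged as acting like a state variable, and any source terms sent to such a diagnostic are discarded. All class and function names below are hypothetical, not part of the FABM API.

```python
class Variable:
    """Toy stand-in for a registered FABM variable (hypothetical)."""
    def __init__(self, name, is_state, act_as_state_variable=False):
        self.name = name
        self.is_state = is_state
        self.act_as_state_variable = act_as_state_variable
        self.sources = 0.0

    def add_source(self, value):
        if self.is_state:
            self.sources += value  # real state variable: accumulate
        # diagnostic flagged act_as_state_variable: discard silently

def couple_state_dependency(target):
    """Allow the coupling only for state variables or flagged diagnostics."""
    if not (target.is_state or target.act_as_state_variable):
        raise ValueError(f"cannot couple state-variable dependency to diagnostic {target.name}")
    return target

# Diagnostic alkalinity that declares it may act as a state variable:
alk_diag = Variable("TA", is_state=False, act_as_state_variable=True)
alk = couple_state_dependency(alk_diag)  # allowed
alk.add_source(-0.5)                     # discarded
print(alk.sources)
```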

A more interesting scenario is when the source terms (or surface/bottom fluxes) for such a diagnostic-masquerading-as-state-variable should not be discarded but collected and processed in some way. For example, when ERSEM is configured to use parameterized alkalinity, it calculates an abiotic reference value of alkalinity from salinity, but it simultaneously tracks the change in alkalinity due to biological processes in a new state variable called "bioalkalinity". The (diagnostic) total alkalinity is computed as the sum of the salinity-based reference value and the current bioalkalinity. In this way, the impact of biogeochemical processes on alkalinity is still represented. To achieve this in FABM, the diagnostic total alkalinity is registered with act_as_state_variable=.true. ERSEM’s modules will couple to this variable and calculate source terms for it. These terms are now no longer discarded, but applied instead to the bioalkalinity state variable. That is achieved in a few lines:

call self%register_diagnostic_variable(self%id_TA_diag, 'TA', 'mmol/m^3', 'total alkalinity', &
   act_as_state_variable=.true., standard_variable=standard_variables%alkalinity_expressed_as_mole_equivalent)
call self%register_state_variable(self%id_bioalk, 'bioalk', 'mmol/m^3', 'bioalkalinity')
call copy_fluxes(self, self%id_TA_diag, self%id_bioalk)

Here, the copy_fluxes routine takes the total sources for the diagnostic total alkalinity and redirects them to state variable bioalkalinity. The benefit of this implementation is that all other modules (phytoplankton etc.) do not need to know whether alkalinity is purely diagnostic, diagnostic with representation of bioalkalinity, or fully prognostic (a normal state variable). Their code, configuration and coupling can stay exactly the same.
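Conceptually, the bioalkalinity pattern amounts to the following arithmetic, sketched here in Python with an illustrative salinity-based reference (not ERSEM's actual parameterization) and made-up source terms:

```python
def reference_alkalinity(salinity):
    # Hypothetical linear fit of alkalinity to salinity (mmol m-3).
    return 66.96 * salinity

bioalk = 0.0   # state variable: cumulative biologically driven change
salinity = 35.0
dt = 1.0       # time step, illustrative units

# Other modules compute source terms "for alkalinity"; the copy_fluxes
# step redirects them to the bioalkalinity state variable instead.
alk_sources = [-2.0, 0.5]        # e.g. uptake and exudation terms
bioalk += sum(alk_sources) * dt

# The diagnostic total alkalinity combines reference and biological change.
total_alkalinity = reference_alkalinity(salinity) + bioalk
print(total_alkalinity)
```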

The ability to make diagnostics masquerade as state variables may seem exotic, but it provides vital functionality that would be difficult to implement in any other way. Among other things, it underpins ERSEM’s implicit representation of the vertical structure of the sediment. For example, ERSEM represents a class of organic matter in the sediment with just two state variables: the depth-integrated density (mmol C m-2, mmol N m-2, etc.) and a mean penetration depth (m). The concentration of organic matter is assumed to decrease exponentially with depth (highest at the sediment surface, asymptotically approaching 0 at depth); this makes it possible to diagnose the quantity of organic matter over a given depth interval from the density and penetration depth. Each type of benthic fauna has a habitat in the sediment defined by a depth range; this determines, among other things, the amount of particulate organic matter available as food. Thus, its food is a diagnostic calculated from the column-integrated organic matter, the penetration depth, and the depth range that the fauna inhabits. From the perspective of the fauna, however, the food is a state variable. Accordingly, organic matter over a specified depth interval is registered with act_as_state_variable=.true. The change in food calculated by benthic fauna is not discarded but used twice: the (negative) source term is applied directly to the state variable for column-integrated organic matter, and additionally converted into a rate of change of the penetration depth.
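With the exponential profile, the depth-interval diagnostic reduces to a one-line formula. Assuming a concentration profile c(z) = (C_int/d) · exp(−z/d), where C_int is the depth-integrated density and d the penetration depth, the amount between depths z1 and z2 below the sediment surface is C_int · (exp(−z1/d) − exp(−z2/d)). A quick numerical sketch, with all values illustrative:

```python
import math

def matter_in_interval(c_int, d, z1, z2):
    """Organic matter between depths z1 and z2, assuming an exponential
    profile with depth-integrated density c_int and penetration depth d."""
    return c_int * (math.exp(-z1 / d) - math.exp(-z2 / d))

c_int = 1000.0  # mg C m-2, illustrative depth-integrated density
d = 0.05        # m, illustrative penetration depth

# A shallow-living faunal type feeding between the surface and 0.05 m:
food = matter_in_interval(c_int, d, 0.0, 0.05)

# Sanity check: extending the interval to infinite depth recovers c_int.
print(food, matter_in_interval(c_int, d, 0.0, float("inf")))
```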
