Overhauling memory allocation and filters - ESCOMP/CTSM GitHub Wiki

This wiki page contains some big-picture thoughts connected to the project at https://github.com/ESCOMP/CTSM/projects/32.

Overview

We currently allocate memory for all sub-grid elements that may need to come into existence at any point in a simulation, and use filters to exclude zero-weight, inactive elements. This architecture may lead to performance issues due to both cache inefficiencies and filters preventing vectorization.

An alternative architecture would allocate memory only for currently active points and replace our filter-based loops with loops over a contiguous range of array elements (all elements for one or more contiguous landunits). Memory would be reallocated as necessary, migrating all state variables to the newly allocated arrays. Because this is an expensive operation, it would happen no more than once a year.
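To make the contrast concrete, here is a minimal sketch of the two loop styles. It is written in Python purely for illustration (CTSM itself is Fortran), and every name and value in it (`temperature`, `active`, `build_filter`, `compact`, and so on) is hypothetical rather than taken from the CTSM code base.

```python
# Hypothetical example data: 8 allocated columns, only 4 of which are active.
temperature = [270.0, 0.0, 275.0, 0.0, 0.0, 280.0, 282.0, 0.0]
active = [True, False, True, False, False, True, True, False]


def build_filter(active_flags):
    """Return the indices of the active elements (the current filter approach)."""
    return [i for i, is_active in enumerate(active_flags) if is_active]


def warm_with_filter(values, filter_indices, delta):
    """Filter-based loop: indirect addressing through a list of indices."""
    for i in filter_indices:
        values[i] += delta


def compact(values, active_flags):
    """Keep only the active elements, so the array holds no inactive points."""
    return [v for v, is_active in zip(values, active_flags) if is_active]


def warm_range(values, begin, end, delta):
    """Range-based loop: direct addressing over the contiguous range [begin, end)."""
    for i in range(begin, end):
        values[i] += delta


# Current approach: full-size array plus a filter.
filtered = list(temperature)
warm_with_filter(filtered, build_filter(active), 1.0)

# Alternative approach: compact array, plain contiguous loop.
packed = compact(temperature, active)
warm_range(packed, 0, len(packed), 1.0)
```

Both versions update the same four active values; the second touches only memory that holds active points and has a loop body with no indirect indexing, which is the property we hope would help with cache use and vectorization in the Fortran code.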

Bill's vision August 2020

My current vision is that we can either (1) have arrays that apply to only one landunit or a select subset of landunits, or (2) keep arrays that apply over all landunits. I'm currently leaning towards (2), at least in the first iteration. In either case, arrays would contain only currently active subgrid elements, and we would do away with most or all filters. If we go with idea (2), each loop would instead specify the landunit(s) over which it should operate; from this set of landunits we could determine the beginning and ending indices, and we would then loop over all array elements in this range. When we are operating over all points in a single landunit type or in contiguous landunit types (e.g., natural vegetation and crop landunits), the relevant subgrid elements should already be contiguous in memory, because of the landunit-centric way in which memory is laid out in CTSM. In the (unusual, I think) case that a block of code needs to operate on multiple non-contiguous sets of subgrid elements, we could call the given subroutine multiple times, so that it operates on one contiguous set of elements in each call.
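The index bookkeeping for idea (2) could look roughly like the following sketch. It is Python for illustration only, and it assumes a made-up layout order and made-up helper names (`LAYOUT_ORDER`, `build_ranges`, `contiguous_blocks`, `apply_over`); none of these exist in CTSM.

```python
# Hypothetical landunit types, listed in the order they are laid out in memory.
LAYOUT_ORDER = ["natural_veg", "crop", "urban", "lake", "glacier"]


def build_ranges(active_counts):
    """Map each landunit type to its half-open index range [begin, end)."""
    ranges = {}
    begin = 0
    for name in LAYOUT_ORDER:
        end = begin + active_counts.get(name, 0)
        ranges[name] = (begin, end)
        begin = end
    return ranges


def contiguous_blocks(ranges, selected):
    """Merge the ranges of the selected landunit types into contiguous blocks."""
    blocks = []
    for name in LAYOUT_ORDER:
        if name not in selected:
            continue
        begin, end = ranges[name]
        if begin == end:
            continue  # no active elements of this type
        if blocks and blocks[-1][1] == begin:
            blocks[-1] = (blocks[-1][0], end)  # adjacent in memory: extend the block
        else:
            blocks.append((begin, end))
    return blocks


def apply_over(values, blocks, routine):
    """Call the routine once per contiguous block of elements."""
    for begin, end in blocks:
        routine(values, begin, end)


def double_range(values, begin, end):
    """Example routine that operates on the contiguous range [begin, end)."""
    for i in range(begin, end):
        values[i] *= 2


counts = {"natural_veg": 3, "crop": 2, "urban": 1, "lake": 2, "glacier": 0}
ranges = build_ranges(counts)

veg_and_crop = contiguous_blocks(ranges, {"natural_veg", "crop"})  # one block
veg_and_lake = contiguous_blocks(ranges, {"natural_veg", "lake"})  # two blocks

state = [1] * 8
apply_over(state, veg_and_lake, double_range)
```

Natural vegetation and crop sit next to each other in this layout, so they collapse into a single range and the routine is called once. Natural vegetation and lake are separated by other landunits, so the routine is called twice, once per contiguous set.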

A challenging part of this design will be reallocating memory when active points come into or go out of existence. To facilitate this, we will probably want structure(s) that iterate over all state variables (https://github.com/ESCOMP/CTSM/issues/282). Note that it should be sufficient to handle (1) variables needed on restart files, and (2) diagnostic variables for history streams that span both sides of a reallocation time. Case (2) may be messy, and we might just want to disallow it. Because this reallocation is likely to be expensive, and because of the complexities of case (2), we will probably want to allow reallocation only at the year boundary.
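One way to picture a structure that iterates over all state variables is a small registry that owns every array and can rebuild them all in one pass. The sketch below is Python for illustration only; `StateRegistry`, the element identifiers, and the choice to give newly active elements a fill value are all assumptions made for the example, not a description of how CTSM does or will initialize new points.

```python
class StateRegistry:
    """Hypothetical registry that lets one routine visit every state variable."""

    def __init__(self):
        self._arrays = {}

    def register(self, name, values):
        """Add a state variable; its values are stored in memory order."""
        self._arrays[name] = list(values)

    def get(self, name):
        return self._arrays[name]

    def reallocate(self, old_ids, new_ids, fill_value=0.0):
        """Rebuild every registered array for the new set of active elements.

        old_ids and new_ids give the identifiers of the active elements before
        and after the reallocation, in memory order. Elements present in both
        keep their values; newly active elements get fill_value (an assumption
        of this sketch); elements that became inactive are dropped.
        """
        old_position = {elem_id: i for i, elem_id in enumerate(old_ids)}
        for name, old_values in list(self._arrays.items()):
            self._arrays[name] = [
                old_values[old_position[elem_id]]
                if elem_id in old_position
                else fill_value
                for elem_id in new_ids
            ]


registry = StateRegistry()
registry.register("soil_temperature", [270.0, 275.0, 280.0])
registry.register("snow_depth", [0.5, 0.1, 1.2])

old_ids = ["veg_a", "crop_a", "lake_a"]
new_ids = ["veg_a", "veg_b", "lake_a"]  # crop_a becomes inactive, veg_b appears

registry.reallocate(old_ids, new_ids)
```

Because every state variable goes through the same registry, a single call migrates all of them consistently, which is the main reason to want such a structure before attempting reallocation.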

A side benefit of this rework is that it will lend itself well to decreasing restart file sizes and to removing the need to run init_interp for a certain class of model changes (https://github.com/ESCOMP/CTSM/issues/18).

See https://github.com/ESCOMP/CTSM/issues/829 for some other big-picture thoughts related to this.