3 Simulation - merrillrudd/LIME GitHub Wiki

Simulation

The LIME package can be used to generate the expected size composition, other age-structured outputs (e.g. abundance-at-age, catch-at-age, annual recruitment), SPR, and MSY for a given set of biological parameters and fishing mortality and recruitment patterns.

Specify biological inputs and starting values

The LIME package simulation and estimation features require the biological inputs and starting values for selectivity parameters in a list. Using create_lh_list will make sure all required elements are included within the list and in the required format. The create_lh_list function requires inputs for some parameters and includes default values for others.

Minimum inputs for `create_lh_list`

The user is required to input the following parameters to create_lh_list for simulation and estimation:

Minium inputs: Biology

von Bertalanffy asymptotic length (linf)
von Bertalanffy growth coefficient (vbk)
Annual natural mortality (M)
Length-weight scaling parameter (lwa)
Length-weight allometric parameter (lwb)
Length or age at 50% maturity (M50)
Whether M50 input is in "length" or "age" (maturity_input)

Minimum inputs: Exploitation

Length at 50% selectivity (S50)
Whether S50 input is in "length" or "age" (selex_input)

Assumptions using the default settings

Width of the length classes (binwidth) equal to 1
von Bertalanffy length at age=0 (t0) equal to -0.01
Selectivity is assumed to follow a logistic function.
If M95 and S95 are unspecified, their defaults are NULL and we assume one-parameter logistic maturity and/or selectivity. This results in close to knife-edge maturity or selectivity.

Other inputs for `create_lh_list`

Other inputs: Biology

Length or age at 95% maturity (M95): Default=NULL for one-parameter logistic maturity; specifying M95 will use two-parameter logistic maturity curve. It is assumed M95 should be input as length if M50 is length, same for age.
Length at age=0 (t0): Default=-0.01, should be adjusted based on local studies or meta-analysis.
Equilibrium recruitment (R0): Default=1.0. Changing R0 will change the scale of the population size. It is recommended to set an initial value of R0 more appropriate to the expected population size if using catch data to estimate population size with LIME.
Steepness (h): Default=1.0 such that the expected recruitment calculated in the Beverton-Holt stock-recruit curve is not affected by the level of spawning biomass. Setting h below 1.0 will lead to the expected annual recruitment as a function of the spawning biomass. LIME will still calculate recruitment deviates around this expected value of annual recruitment.
Maximum age (AgeMax): Default= the age at which 1% of the population is still alive based on the specified natural mortality rate M. This age can be adjusted based on local evidence of maximum age.
Recruitment autocorrelation (rho): Used in simulation only. Default=0 (no recruitment autocorrelation). Changing the value of rho will add autocorrelation to the generated recruitment time series for a simulation study.

Other inputs: Exploitation

Length or age at 95% selectivity (S95): Default=NULL for one-parameter logistic selectivity; specifying S95 will use two-parameter logistic selectivity curve. It is assumed S95 should be input as length if S50 is length, same for age.
Selectivity function (selex_type): Default=logistic using a one-parameter logistic selectivity curve if S95 is not specified, or a two-parameter logistic curve if S95 is specified. Alternate option=dome to generate a population exploited based on a dome-shaped selectivity curve. Dome-shaped selectivity cannot currently be estimated in LIME, but specifying dome-shaped selectivity in the create_lh_list function can generate an assumed dome-shaped curve to be fixed when using LIME for estimation.
Dome-shaped selectivity right-hand standard deviation (dome_sd): Default=NULL for logistic selectivity, must be specified if selex_type=dome. The standard deviation of the normal distribution for the right side of the selectivity curve (after the logistic curve specified reaches 100% selectivity).
Catchability coefficient (qcoef): Estimated when using an abundance index, default=1e-5, starting value should be adjusted to a reasonable number based on the abundance index and starting value for the scaling parameter R0.
Dirichlet-multinomial parameter (theta): Default=10, a relatively large number coinciding with no variance inflation where the effective sample size will approach the input sample size (Thorson et al. 2017)
Number of seasons in a year (nseasons): Default=1 for annual length composition data and annual growth and mortality parameters. For short-lived species, it is helpful to use a shorter-than-annual time step to account for the growth of short-lived fish during one year, if length data is collected on time steps less than one year (e.g. month, nseasons=12).

Other inputs: Variation

Coefficient of variation around the growth curve (CVlen): Default=0.1; Should be adjusted based on the expected variability in the age-length curve.
Recruitment standard deviation (SigmaR): Default=0.737, the median across all fish species (Thorson et al. 2014); Used for generating recruitment deviates in the simulation or as a starting value for its estimation using LIME.
Fishing mortality standard deviation (SigmaF): Default=0.2; Used for generating fishing mortality deviates in the simulation or as the standard deviation of the fishing mortality penalty when estimating annual fishing mortality using LIME.
Catch standard deviation (SigmaC): Default=0.2; Used as the fixed standard deviation in the lognormal likelihood when fitting LIME to catch data.
Index standard deviation (SigmaI): Default=0.2; Used as the fixed standard deviation in the lognormal likelihood when fitting LIME to index data.

Now let's populate the create_lh_list function with example values.

lh <- create_lh_list(vbk=0.21,
                     linf=65,
                     t0=-0.01,
                     lwa=0.0245,
                     lwb=2.79,
                     M=0.27,
                     M50=34,
                     M95=NULL,
                     maturity_input="length",
                     S50=20,
                     S95=26,
                     selex_input="length",
                     binwidth=1,
                     CVlen=0.1,
                     SigmaR=0.737,
                     SigmaF=0.2,
                     R0=1,
                     rho=0.43,
                     nseasons=1)

Plot the biological and selectivity inputs

And then check out the biological parameters and selectivity we've created. Note that even though we input maturity and selectivity by length, create_lh_list converts to age and outputs both age (lh$S_a) and length (lh$S_l).

par(mfrow=c(2,2), mar=c(4,4,3,1))
plot(lh$L_a, type="l", lwd=4, xlab="Age", ylab="Length (cm)")
plot(lh$W_a, type="l", lwd=4, xlab="Age", ylab="Weight (g)")
plot(lh$Mat_l, type="l", lwd=4, xlab="Length (cm)", ylab="Proportion mature")
plot(lh$S_l, type="l", lwd=4, xlab="Length (cm)", ylab="Proportion selected to gear")

Generating data from the simulation model

Using the life history list output from create_lh_list, we can simulate a population and generate data.

The generate_data function has several settings for which the user is required to input a value:

modpath: model path for saving simulated populations; can set as NULL to run within the R environment only without saving locally.
itervec: vector of iterations of simulated data; can be 1 to run only one iteration, or 1:100 to run 100 iterations of simulated populations.
lh: life history list, output from create_lh_list.
Fdynamics: Pattern for fishing mortality dynamics; options include Constant, Ramp, Increasing, None, or Endogenous.
Rdynamics: Pattern for recruitment dynamics; options include Constant, AR (for autocorrelated recruitment), Pulsed, Pulsed_up, or BH (for Beverton-Holt stock-recruit relationship).
Nyears: number of years for your simulated population.
Nyears_comp: number of years to generate length data; e.g. if 10 years and Nyears is 20 years, will be the last 10 years in the 20 year time series.
comp_sample: nominal sample size of length data annually; e.g. 200 length measurements collected annually.
init_depl: initial depletion, or the proportion of the unfished population biomass in the first year of the population to be modeled. Specifying a single value indicates the initial depletion (e.g. 0.5) or two values indicates the lower and upper bounds of a uniform distribution for which the population simulation will randomly draw a depletion value (e.g. c(0.1,0.9)).

There are also several other settings not required but may be useful:

rewrite: default=TRUE to always re-simulated data. rewrite=FALSE may be useful to only simulate a new population if the True.rds output file already exists in the folder.
derive_quants: default=FALSE to save time, changing this argument to TRUE will calculate MSY-based reference points.
pool: default=TRUE, meaning that if the number of seasons per year as specified in the create_lh_list function is more than 1, the length composition data collected in each time set is pooled together annually. pool=FALSE would mean the generated length composition data is on a time step shorter than 1 year (e.g. monthly if nseasons=12). The total sample size for the year will still be equal to comp_sample.
mismatch: default=FALSE, indicating that the length data, catch, and abundance index are all generated from the same years. mismatch=FALSE forces the length data to be collected in separate years from the catch and abundance index, overlapping only 1 year.

Now let's use generate_data to simulate 1 example population (itervec=1) to generate length, catch, and an abundance index. We won't specify a modpath so the simulated population will just be used locally in this vignette.

set.seed(143)

true <- generate_data(modpath=NULL,
                      itervec=1,
                      lh=lh,
                      Fdynamics="Ramp",
                      Rdynamics="AR",
                      Nyears=20,
                      Nyears_comp=20,
                      comp_sample=200,
                      init_depl=0.8)

Plot the simulated population

par(mfrow=c(2,3))
plot(true$F_t, type="l", lwd=4, xlab="Year", ylab="Fishing mortality", ylim=c(0,max(true$F_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$R_t, type="l", lwd=4, xlab="Year", ylab="Recruitment", ylim=c(0,max(true$R_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$SPR_t, type="l", lwd=4, xlab="Year", ylab="SPR", ylim=c(0,1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$ML_t, type="l", lwd=4, xlab="Year", ylab="Mean length", ylim=c(0,max(true$ML_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$SB_t, type="l", lwd=4, xlab="Year", ylab="Spawning biomass", ylim=c(0,max(true$SB_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$D_t, type="l", lwd=4, xlab="Year", ylab="Relative spawning biomass", ylim=c(0,max(true$D_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)

Plot the generated data

Each panel of length data is labeled with the year (1-20).

plot_LCfits(Inputs=list("LF"=true$LF))

par(mfrow=c(1,2))
plot(true$Cw_t, type="l", lwd=4, xlab="Year", ylab="Catch (g)", ylim=c(0,max(true$Cw_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
plot(true$I_t, type="l", lwd=4, xlab="Year", ylab="Abundance index", ylim=c(0,max(true$I_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)