Writing and Debugging EpiModel Code - EpiModel/EpiModeling GitHub Wiki

This page details one suggested strategy for interactively writing and/or debugging EpiModel code. In practice, it is often helpful to write new EpiModel code while inside an EpiModel function, so you can view and test the data structures interactively. R's debugging tools facilitate this, and RStudio's debugging browser makes it even easier. Here is a general tutorial on various debugging approaches with R/RStudio; below we present a more specific set of suggestions and instructions for debugging when using EpiModel and related packages.

Reloading Updated EpiModel Code

In our typical approach, we have our EpiModel extension software repository (EpiModelHIV or EpiModelCOVID) and a project repository. When writing code, it is helpful to have two RStudio windows open, with an RStudio project for each window (this is the XXX.Rproj file in the root directory). After making any updates to the software repo, you need to pull them into the project repo for them to be "seen" by your project repo scripts. There are two ways of doing this:

Longer-Term Approach

Update the software repo (e.g., add code to a module in EpiModelHIV)
Commit those changes and push them to your branch of the software repo on GitHub
Switch over to your project repo
Restart the R session in the project repo (Cmd+Shift+0 in RStudio on a Mac)
Use renv::update(), which will pull the new version of your software repo
Restart the R session in the project repo again
Run your code from the top of the script
If everything is running well, make sure to update your renv.lock file with the current version of the software package (e.g., EpiModelHIV) with renv::snapshot()
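
The renv steps above can be sketched as two small helpers. This is a minimal sketch, assuming renv is already initialized in the project repo and that the software package was installed from GitHub. The helper names pull_software_update and record_software_version are hypothetical; only renv::update() and renv::snapshot() are actual renv functions.

```r
# Hypothetical helper: run in the project repo after restarting R.
# renv::update() checks the package's recorded source (e.g., GitHub)
# and installs a newer version if one is available.
pull_software_update <- function(pkg = "EpiModelHIV") {
  renv::update(packages = pkg)
}

# Hypothetical helper: run once your scripts work with the new version.
# renv::snapshot() writes the currently installed versions to renv.lock.
record_software_version <- function() {
  renv::snapshot()
}
```

Restart R between the two calls and rerun your script from the top, as listed in the steps above, before recording the new version in renv.lock.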

Shorter-Term Approach

Update the software repo (e.g., add code to a module in EpiModelHIV)
Switch over to your project repo
“Soft reload” the updated version of the software with pkgload::load_all("local/path/to/EpiModelHIV"). load_all is a handy function that reloads an updated R package in place, without the need to restart R. One downside is that the updated version must be reloaded in this way again whenever you restart R.
Run the script from the control settings onward. Instead of rerunning your entire script, you only need to rerun the control settings (e.g., control.net or control_msm), as this ensures that any updated module code gets pulled into the netsim simulations.

The benefit of the shorter-term approach is that it is much quicker and more flexible if you are making lots of minor edits or updates and want to test them interactively. The downside is that the version of the R package loaded with load_all will not carry over if you restart R or switch computers (e.g., from your local computer to the HPC). For that, you will need to use the longer-term approach, which ensures that the specific version of the software gets stored in a way that is accessible to you on other computers or in the future.
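
The shorter-term steps can be sketched as follows. This is a minimal sketch, not a prescribed workflow: the helper name soft_reload and its pkg_path argument are hypothetical, and the path check is an added convenience so that a mistyped path fails early.

```r
# Hypothetical helper wrapping the soft reload; `pkg_path` is the local
# path to your clone of the software repo (an assumption of this sketch).
soft_reload <- function(pkg_path) {
  stopifnot(dir.exists(pkg_path))  # fail early on a mistyped path
  pkgload::load_all(pkg_path)      # reload the package in place
}

# Typical use (not run here):
# soft_reload("~/git/EpiModelCOVID")
# ...rerun your control settings so the updated module code is picked up...
# sim <- netsim(est, param, init, control)
```

Because the reload does not survive an R restart, this call would need to be rerun at the start of each new session.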

Writing Code in Debug/Browser Mode in R

It is helpful to walk through this approach to writing code with a specific example. Let's say we are working on an EpiModelCOVID project and we want to update the disease progression module, which is currently within a function called progress_covid. Here is a listing of our control settings that shows all the modules and associated functions:

pkgload::load_all("~/git/EpiModelCOVID")
control <- control.net(nsteps = 100,
                       nsims = 1,
                       ncores = 1,
                       initialize.FUN = init_covid_corporate,
                       aging.FUN = aging_covid,
                       departures.FUN = deaths_covid_corporate,
                       arrivals.FUN = arrival_covid_corporate,
                       resim_nets.FUN = resim_nets_covid_corporate,
                       infection.FUN = infect_covid_corporate,
                       recovery.FUN = progress_covid,
                       dx.FUN = dx_covid,
                       vax.FUN = vax_covid,
                       prevalence.FUN = prevalence_covid_corporate,
                       module.order = c("aging.FUN",
                                        "departures.FUN",
                                        "arrivals.FUN",
                                        "resim_nets.FUN",
                                        "infection.FUN",
                                        "recovery.FUN",
                                        "dx.FUN",
                                        "vax.FUN",
                                        "prevalence.FUN"),
                       resimulate.network = TRUE,
                       tergmLite = TRUE)
sim <- netsim(est, param, init, control)

Here you can see the local path to my copy of the EpiModelCOVID package in the load_all function call. We use a locally loaded version of the package because we will be updating it as we code. To start, we run this model as is, and it runs without error out to 100 time steps, as requested and expected.

The debug Function

Next, to "see inside" the progression module, we can "debug" that specific function. Before running netsim again, we run:

debug(progress_covid)
sim <- netsim(est, param, init, control)

Now, when the model gets to the progression module, we enter debug mode, and the script for the function pops up.

In the console, we have some navigation controls for debug mode.

By clicking the Next button, we can progress down through the code, line by line, to see how it is being evaluated.

Notice that as this happens, the data objects are defined in the Environment window.

You can also examine specific objects in the console, for example by printing out a vector.

You can now edit the code in place in the function script window and then continue running the function. Note that once the lines have changed, the Next button may no longer work, but you can still execute each subsequent line in the function with Cmd+Enter (Mac) or Ctrl+Enter (Windows).

When you are ready to exit this view into the function, click the Stop button on the debug mode browser. You can manually stop the debugging of any function with undebug(function_name).
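
The debug flag can also be checked and managed from the console. The following is a minimal base R sketch; progress_toy is a hypothetical stand-in for a module function such as progress_covid, not EpiModel code.

```r
# Toy stand-in for a module function (hypothetical, not part of EpiModel)
progress_toy <- function(dat, at) {
  dat$last_step <- at
  dat
}

debug(progress_toy)                   # flag the function: the next call opens the debugger
flagged <- isdebugged(progress_toy)   # TRUE while the flag is set

undebug(progress_toy)                 # remove the flag
cleared <- !isdebugged(progress_toy)  # TRUE once the flag is removed

# With the flag removed, the function runs normally again.
# (debugonce(progress_toy) would instead flag it for a single call only.)
dat <- progress_toy(list(), at = 2)
```

At the Browse prompt in the console, the keyboard equivalents of the navigation buttons are n (next line), s (step into a call), c (continue), f (finish the current loop or function), and Q (quit the debugger).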

If you run the model again, it will put you back into debugging that older version of the function. If you want to run the code you have updated, load_all the package, run the control settings, then netsim.

The browser Function

An alternative to the debug function, with some advantages, is the browser function. However, using browser requires editing the function's code before you run it. Here is an example.

Let's say we want to start the debugging/editing of our function not at the top, but somewhere in the middle of the function. We can write browser() at the specific point where we want the debug entry to start.

Next, save that file, reload the package with load_all, rerun the control settings, then netsim. This will drop you into debug mode right at that point in the code, and you can test, write, and edit from there. When you are done editing, remove that browser() function call from the code, reload the package, rerun the control settings, and then netsim.

There are a couple of advantages to browser over debug:

You can enter into a function at any point, rather than just at the top. This is helpful for long functions so you don't have to click Next a lot of times.
You can use conditional logic to enter into the browser function only when certain conditions apply. For example, perhaps we only want to test our code at time step 25 because the initial conditions at time step 1 or 2 are not ideal for model testing. You can do that with:

if (at == 25) browser()

where at is the time step counter. You can also use any other conditional logic here (e.g., only starting the browser if i.num >= 10, or any other condition).
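
A conditional browser call might look like the following inside a module. This is a toy sketch in base R: progress_toy and the plain-list dat are hypothetical stand-ins for a real module function and the EpiModel dat object, and the interactive() guard is an extra assumption added here, not part of the approach described above.

```r
# Toy stand-in for a module function (hypothetical, not EpiModel code)
progress_toy <- function(dat, at) {
  i.num <- sum(dat$status == "i")

  # Enter the browser only at time step 25 and only with at least 10 infected.
  # The interactive() guard keeps batch runs (e.g., on an HPC) from pausing
  # if this call is left in the code by mistake.
  if (interactive() && at == 25 && i.num >= 10) {
    browser()
  }

  dat$i.num <- i.num
  dat
}

# Toy population: 80 susceptible and 20 infected
dat <- list(status = rep(c("s", "i"), times = c(80, 20)))
for (at in 2:30) {
  dat <- progress_toy(dat, at)
}
```

The browser function also has an expr argument, so browser(expr = at == 25) is an equivalent way to write the simple condition.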

The recover Option

In many cases with our models, errors are random and unpredictable because the models are stochastic by nature. One time running a model, we might get an error at time step 10 and another time at time step 57. It is time consuming to use debug or browser in these cases because we do not want to keep cycling through the Next button over and over to run the model multiple time steps to see when an error will occur. Fortunately, this is not necessary.

Instead, you can specify what R should do when it encounters an error. For example, here is a random error that I built in. It triggers at a random time step and generates an error because there is no parameter named new.param.

Accordingly, it generates this error message:

This is a helpful error message in that it tells what the error is and where it is located. But if we wanted to do something about it, it may be difficult to recreate the conditions that generated the error. In this case, we can tell R that instead of stopping and reporting the error message, it should put us in debug mode at the point of the error. We do this by specifying that option with:

options(error = recover)

That needs to be run before you run netsim. With that, we now get this menu:

	A ERROR occured in module 'recovery.FUN' at step 9
Error in get_param(dat, "new.param") : 
  There is no parameter called `new.param` in the parameter list of the main list object (dat)

Enter a frame number, or 0 to exit   

1: netsim(est, param, init, control)
2: lapply(seq_len(control$nsims), function(s) {
    netsim_loop(x, param, init, control, s)
})
3: FUN(X[[i]], ...)
4: netsim_loop(x, param, init, control, s)
5: withCallingHandlers(expr = {
    if (!is.null(control[["initialize.FUN"]])) {
        current_mod <- "initialize.FUN"
        at <- 
6: do.call(mod.FUN, list(dat, at))
7: (function (dat, at) 
{
    active <- get_attr(dat, "active")
    status <- get_attr(dat, "status")
    statusTime <- get_attr(dat, "s
8: mod-progress.R#27: get_param(dat, "new.param")

Selection:

This menu asks at which layer of the call stack to start debugging: netsim calls a function, which calls a function, which calls another function, and so on. We want to enter debugging in the layer closest to the error, which is usually, but not always, the second-to-last layer. In this case, that means entering 7 as our selection, which puts us in debug mode right at the point of our error.

From there, we can inspect and modify the code as needed.
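
Because options(error = recover) stays in effect for the rest of the R session, it can be useful to restore the previous setting once you are done. A minimal sketch, assuming you want to return to whatever error behavior was set before:

```r
# Switch to recover mode; options() returns the previous setting
old_opts <- options(error = recover)

# Run the model here (not run in this sketch):
# sim <- netsim(est, param, init, control)

# Restore the previous error behavior when finished
options(old_opts)
```

After the last line, errors are handled as they were before recover was turned on.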