Getting Started with EpiModelHIV - EpiModel/EpiModeling GitHub Wiki
This tutorial describes the general steps for getting started with a new applied HIV modeling project that uses EpiModelHIV.
Project Repository
Our modeling projects always involve two repositories: a project repo and a software repo. The software repository for all HIV modeling projects will be EpiModelHIV-p. This repo is organized with a main branch that contains all the current base model code, and many other branches that contain code for applied projects. Note that the -p in the project name refers to the fact that this is a private repository; we keep the repository private prior to publication of scientific products that use this software tool. Even though the repo name is EpiModelHIV-p, the name of the R package will be just EpiModelHIV.
The project repo contains all the R and other scripts for your project; many of these scripts will call EpiModelHIV in the software repo. An example of a project repo can be found here.
Your first step in creating a new applied HIV modeling project will be to create a project repo and test out running EpiModelHIV with it. Follow these steps:
- Create a new project repo based on the EpiModelHIV-Template. Click the "Use this template" button, then select "Create a new repository". For the repo name, use something simple and descriptive; good examples are SexualDistancing or MoodHIV. Make sure to select "EpiModel" as the owner (this will put the repo in our organization list), add a description for the project (one brief line stating what it is), and select the Private option.
- Clone this repo by clicking the green Code button and selecting "Open with Github Desktop". Make sure to clone the repository in a location on your computer that is not auto-synced (e.g., don't clone in a Dropbox or OneDrive folder location).
- Go to this folder on your computer. Find the EpiModelHIV-Template.Rproj file, and rename it to match the name of your repository (e.g., SexualDistancing.Rproj). Double-click this .Rproj file, and a new Rstudio project will be launched.
- Install all the needed packages for the project using the renv package manager. This first requires installing renv itself if you have never used renv (i.e., call install.packages("renv")). Next, run renv::init() in the console window, and select the first option: "Restore the project from lockfile". Hopefully this will all install correctly on your computer. Three common reasons why this set of packages will not install are:
  - Your version of R does not match the minor version number of R defined by renv. If a version of R is 4.2.1, then 4 is the major, 2 is the minor, and 1 is the patch version. If we are using 4.2.2 and you are using 4.2.1, there is no problem because we share the same minor version (4.2); if you are using 4.1.3, there is a problem because your minor version is lower. In general, try to keep up to date with the current version of R, trying to match what you see here.
  - You do not have the necessary compilation tools installed. For example, for Windows it is necessary to install the version of Rtools that corresponds to your version of R (see details here). For Macs, you need to install Xcode and GNU Fortran as detailed here.
  - You do not yet have access to the ARTnetData package; if this is the case you will likely see an error message about ARTnet or ARTnetData, with a general message "The requested URL returned error: 401". This means you have not yet followed the instructions here; do that before proceeding.

  With those issues resolved, try rerunning renv::init(). Once all of the packages are installed correctly, restart Rstudio (Session menu --> Restart R). renv should then report "The project is already synchronized with the lockfile."
- Test out running EpiModelHIV by running R scripts 01, 02, 03, and 04a in the R folder of your project repo. This will estimate the network models (netest), diagnose them (netdx), and then run a basic HIV transmission model (netsim). Note that the bottom of the 04a file provides an example of how to run EpiModelHIV in debug/browser mode; don't run that step quite yet.
Congrats, your project repository is now set up! Keep your Rstudio window with the project repo open for now.
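The R version rule from the renv step above can be checked with a few lines of base R. This is only a sketch: the helper names major_minor and same_minor_version are made up for illustration, and the target version "4.2.2" is just the example from the text, not necessarily the version recorded in your lockfile.

```r
# Keep only the "major.minor" part of a version string, e.g. "4.2.1" -> "4.2".
major_minor <- function(v) {
  parts <- strsplit(v, ".", fixed = TRUE)[[1]]
  paste(parts[1:2], collapse = ".")
}

# TRUE when two R versions share the same major and minor numbers.
same_minor_version <- function(local, target) {
  major_minor(local) == major_minor(target)
}

same_minor_version("4.2.1", "4.2.2")  # TRUE: same minor version (4.2)
same_minor_version("4.1.3", "4.2.2")  # FALSE: minor version differs

# Compare your running R against a target version (assumed here to be 4.2.2).
same_minor_version(as.character(getRversion()), "4.2.2")
```

If the last line returns FALSE, updating R before rerunning renv::init() is likely the quickest fix.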
Software Repository
Here we will clone the EpiModelHIV-p repository needed to test and edit your model.
- Clone EpiModelHIV-p. Go to the software repository here, click Code and Open with Github Desktop. Save this in a similar location but a different subfolder than your project repo (e.g., all your github repositories might go in a home directory called git).
- Find this cloned repository locally, and open the .Rproj file to open up the new EpiModelHIV project. Now you should have two parallel Rstudio windows, one with EpiModelHIV and one with your project repo. This is how you should plan to work on your code, with these two projects open in parallel.
- We won't use renv in your software repo (only in your project repo). The default libraries for running EpiModelHIV in its own project will follow your system libraries as defined in .libPaths(). To make sure you have all the latest R packages needed to run EpiModelHIV outside of renv, we recommend you use the pak package to quickly set this up. First run install.packages("pak") and then call pak::pak() inside your software repo console. This will ask you to install all the necessary versions of the packages. You can test this out by running (Build --> Install Packages).
- Continue your testing by going back to the 04a file in your project repo and running the lines at the bottom that load the local version of EpiModelHIV (where you just cloned it) and run the model in debug/browser mode. This uses the pkgload::load_all function to "soft load" EpiModelHIV from where you have it installed locally; note you will need to update the file path in that line. Further details about working with R in debug mode are provided here. This approach will be the general way to internally develop and test your model: working with the scripts in your project repo and the software functions in the software repo, side by side.
Ok, your software repository is now all set up!
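As a rough illustration of the "soft load" step, the sketch below wraps pkgload::load_all in a small guard so that a wrong file path gives a readable message instead of an error. The helper name load_local_epimodelhiv and the path "~/git/EpiModelHIV-p" are assumptions for this example; use the path where you actually cloned the software repo.

```r
# Hypothetical helper: soft-load a local clone of the software repo if it exists.
load_local_epimodelhiv <- function(path) {
  if (!dir.exists(path)) {
    message("No directory found at: ", path)
    return(invisible(FALSE))
  }
  if (!requireNamespace("pkgload", quietly = TRUE)) {
    message("The pkgload package is not installed")
    return(invisible(FALSE))
  }
  # Load the package code from source without installing it
  pkgload::load_all(path)
  invisible(TRUE)
}

# The path below is an assumption; replace it with your own clone location.
load_local_epimodelhiv("~/git/EpiModelHIV-p")
```

The 04a script in the template already contains its own load_all line; this wrapper only shows the idea and is not a replacement for it.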
Creating a New Branch
Before you make any edits to EpiModelHIV, you will need to make a new branch. A branch is a version of the software that will be unique to your project. There are several ways to do this, but it is easiest with Github Desktop.
- In Github Desktop, in the middle dropdown menu you will see a listing of the current branches. Click "New Branch", and then create one with the same title as your repo (e.g., if you named your project repo SexualDistancing, then name your new branch SexualDistancing); this name matching is not required but it's a helpful naming convention to keep your project and EpiModelHIV branch straight.
- Push this new branch up to Github by clicking Publish branch. Note your branch will not include any additional changes from the main branch at this point.
- Go back over to your project repo in Rstudio and link your renv.lock file with your new branch of EpiModelHIV. You can do this with: remotes::install_github("EpiModel/EpiModelHIV-p@BranchName", force = TRUE), where BranchName is your actual branch name.
- Save this updated version of EpiModelHIV in renv with renv::snapshot(). This will ask you whether you want to update EpiModelHIV, with a message like this:
# GitHub =============================
- EpiModelHIV [ref: main -> BranchName]
Note, you can feel free to update the other packages renv may ask you to update as well. No harm in using all the latest package versions.
Now you have your project and software repos set up and linked together: locally with pkgload and remotely through renv.
Software Development Exercise
In this exercise, you will add a small update to the aging module in EpiModelHIV, test it locally, and push it to Github. Each module handles a different process in the broader HIV model, and the aging module unsurprisingly handles the aging of nodes. In this exercise, we will add a new summary statistic that calculates the mean age of all nodes and stores it in the correct place.
- Open the R/mod.aging.R file in your software repo. Note the existing structure of the function: getting current attributes and parameters from dat, processes related to aging, then setting the updated data back on dat.
- Add a new summary stat line that calculates the mean of the age vector with:
dat <- set_epi(dat, "meanAge", at, mean(age))
- Save the R file in your software repo.
- Go back to your project repo, and follow the general approach at the bottom of the 04a script:
  - Source in the updated version of EpiModelHIV locally with pkgload, which updates all of the functions in EpiModelHIV for your project.
  - Rerun the control settings (the control_msm function), which pulls in the updated version of your functions for use in netsim.
  - Run netsim (don't run the debug and undebug lines this time).
  - Examine your new addition to EpiModelHIV: graphically with plot(sim, y = "meanAge") and numerically with as.data.frame(sim)$meanAge. There will not be much change but some.
- Now, push your change to EpiModelHIV up to Github. Go to Github Desktop; under the EpiModelHIV repository, you will see the edited mod.aging.R file. Commit those changes by adding a commit message "Added meanAge summary stat", then click commit at bottom, and then push the changes up to Github (at top).
Congrats, you have successfully run and edited EpiModelHIV. Now you can make edits that are relevant for your project, test them out, and push them up to Github.
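To see why the meanAge series changes only slightly over time, here is a toy version of the bookkeeping in base R. Everything in it is a stand-in: toy_set_epi is a simplified, hypothetical imitation of a summary-statistic setter (not the EpiModel implementation), the three ages are made up, and aging by 1/52 of a year per time step is an assumption chosen for illustration.

```r
# Toy stand-in for a summary-statistic setter (NOT the EpiModel implementation):
# store `value` at position `at` of the named series in dat$epi.
toy_set_epi <- function(dat, item, at, value) {
  if (is.null(dat$epi[[item]])) {
    dat$epi[[item]] <- rep(NA_real_, at)
  }
  dat$epi[[item]][at] <- value
  dat
}

dat <- list(epi = list())
age <- c(20, 30, 40)  # made-up ages for three nodes

for (at in 1:3) {
  # Assumed aging step: everyone gets 1/52 of a year older
  age <- age + 1 / 52
  # Same pattern as the line added to the aging module
  dat <- toy_set_epi(dat, "meanAge", at, mean(age))
}

dat$epi$meanAge
```

With no one entering or leaving this toy population, the mean rises by exactly the aging increment each step; in a full simulation, arrivals and departures also shift the mean.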
Pull Request on Github
With your first commit to your branch, let's open a pull request to make sure that your branch stays generally updated with main during your development cycle and is passing the necessary error and stylistic checks over time.
- On Github, go to the software repository here. At the top, you should see a green button to open a pull request. Click it, and then create a descriptive title for your pull request. Good examples include a brief title and your name, like: Sexual Distancing Model (Sam's Aim 2).
- Assign yourself to this pull request (on the left) and add the label Applied Project.
- Note that we will never click the "Merge Pull Request" button on the PR (it should actually be disabled for you). Instead, we keep this PR open as a running list of differences between the main branch and your branch.
- You will notice that two automatic checks have started to run. These run every time a commit is made to your branch. One is R CMD Check, which is a general test of the overall EpiModelHIV package (these are the tests that must pass for an R package to be published on CRAN). The other is lintr, which is a stylistic code check; these are tests that check whether your code style is consistent with the team's. Consistent styling is easier to read and less prone to errors!
- You can check these tests once they have completed by clicking on the details link. Note that for lintr, the test will show up as "Successful" even if there are violations, so it's a good idea to check this manually from time to time (click details, then the "Lint" line in the middle of the darker window).
- You can run both of these tests locally, which usually makes it easier to see where the errors are coming from. R CMD Check is equivalent to devtools::check(), and lintr is equivalent to lintr::lint_package(). Note that you will want to run these in your software repo, not your project repo.
Ongoing Development Tasks
As you are developing your EpiModelHIV code, it is always a good idea to check the status of your model relative to the main branch. On a periodic basis (perhaps every one or two weeks), we recommend following these three steps.
- Updating from main: On your pull request, if there are any differences between your branch and upstream changes in the main branch (i.e., changes to the main branch that occurred after you created your branch), you should see an "Update from main" button at the bottom. You may click that to automatically pull in those upstream changes; note the directionality of the changes here: you are taking the changes from main and pulling them into your branch (not vice versa!). There are other ways to do this updating. For example, in the Branch menu of Github Desktop, you can click "Update from main". Either way, this will create a new "merge commit" where the two branches are brought into alignment. In some cases, you may see "Merge Conflicts", which are cases where edits to some lines in your branch conflict with edits to the same lines in the main branch. That's ok; it just means these conflicts need to be manually resolved. Feel free to resolve these yourself, or ask for help if you get stuck.
- Redocument the package. Any time you make changes to the Roxygen-based documentation in the package (e.g., adding new parameter documentation), make sure that you redocument the package by rebuilding the documentation. In Rstudio, in your software repo, run Build --> Document. If there are any changes to the documentation, you will see updated .Rd files to commit in Github Desktop.
- You should periodically check whether your branch of EpiModelHIV is passing the two standard tests (R CMD Check and lintr). You can see this easily on your PR, and then also re-run these tests locally if you see they are not passing. One common reason that R CMD Check may fail is if you have new parameters defined in one of your modules but not declared in param_msm (the parameterization function), or if you have these parameters declared in param_msm but no documentation added. With each new parameter you add, make sure you add it to param_msm with a default value, and then define it in the documentation.
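One quick way to catch a parameter that a module uses but the parameterization function does not declare is to compare names against the function's formal arguments. The sketch below uses a hypothetical stand-in, toy_param_msm, with made-up parameter names; it does not reproduce the real param_msm signature, but the same check should work on any R function.

```r
# Hypothetical stand-in for a parameterization function with default values.
toy_param_msm <- function(hiv.test.rate = 0.01, new.param = 0.5, ...) {
  list(hiv.test.rate = hiv.test.rate, new.param = new.param, ...)
}

# Return the parameter names that are used but not declared as arguments.
undeclared_params <- function(used, param_fun) {
  setdiff(used, names(formals(param_fun)))
}

# Made-up list of parameters that a module pulls from dat
used_in_module <- c("hiv.test.rate", "new.param", "forgotten.param")

undeclared_params(used_in_module, toy_param_msm)  # "forgotten.param"
```

This only checks declaration; documentation for each new parameter still has to be added by hand and rebuilt with Build --> Document.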