run_model - PNHP/Regional_SDM GitHub Wiki

The script that runs the random forest model would have these components:

  1. Bring in the presence points and the background points.
  2. Create an initial model to look at variable importance. Drop the least important variables (currently the lowest 25%).
  3. Check the number of groups (polygons) and make accuracy assessment decisions based on this (stratify by how many groups?).
  4. Tune mtry.
  5. Run a series of models (jackknife routine) and save the results. This is external validation.
  6. Fit the final model using all presence points.
  7. Calculate cutoff information, using one or more approaches, and save the results.
  8. Calculate partial plot information for metadata.
  9. Save the final model for possible retrieval later.
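
As a rough illustration of the bookkeeping behind steps 2, 5, and 7, here is a minimal Python sketch. It is not the repository's code and it does not fit a random forest; it only shows the surrounding logic. All function names (`drop_least_important`, `leave_one_group_out`, `minimum_training_presence`) are hypothetical, the variable names and importance values are made up, holding out one polygon group per jackknife run is an assumption about how the routine is organized, and the minimum-training-presence cutoff is just one assumed example of the "one or more approaches" mentioned in step 7.

```python
import math


def drop_least_important(importance, fraction=0.25):
    """Step 2 (sketch): keep every variable except the lowest `fraction` by importance.

    `importance` maps variable name -> importance score from an initial model.
    """
    ranked = sorted(importance, key=importance.get)  # least important first
    n_drop = math.floor(len(ranked) * fraction)
    dropped = set(ranked[:n_drop])
    # Preserve the original ordering of the variables that are kept.
    return [name for name in importance if name not in dropped]


def leave_one_group_out(group_ids):
    """Step 5 (sketch): yield (held_out_group, train_idx, test_idx) per group.

    Assumption: each jackknife run holds out all presence points
    belonging to one group (polygon) and trains on the rest.
    """
    for group in sorted(set(group_ids)):
        train_idx = [i for i, g in enumerate(group_ids) if g != group]
        test_idx = [i for i, g in enumerate(group_ids) if g == group]
        yield group, train_idx, test_idx


def minimum_training_presence(presence_scores):
    """Step 7 (sketch): lowest predicted score among the training presence points.

    Assumption: this is only one of several possible cutoff approaches.
    """
    return min(presence_scores)


# Made-up example values, for illustration only.
importance = {
    "elev": 0.40, "slope": 0.05, "precip": 0.30, "temp": 0.20,
    "aspect": 0.02, "soil": 0.10, "canopy": 0.15, "dist_water": 0.08,
}
kept = drop_least_important(importance)  # drops the 2 lowest of 8 variables

groups = ["a", "a", "b", "c", "c", "c"]  # polygon ID for each presence point
folds = list(leave_one_group_out(groups))  # one fold per polygon group

cutoff = minimum_training_presence([0.81, 0.64, 0.92, 0.58])
```

With groups handled this way, the number of jackknife runs equals the number of polygons, which is one reason step 3 checks the group count before deciding how to assess accuracy.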