Interfaces 3.0 Spec - PrinceWangR/rstan GitHub Wiki

DEPRECATED

New page: User Interface Guidelines for Developers


Original proposal from Andrew and Bob:

https://groups.google.com/d/msg/stan-dev/k6f1OyhcaOQ/5lax1C9YXkQJ

Some of the intermediate objects we will need

  • foo.stan : text file (or string in R?)
    Stan program
  • foo.so : a shared object file (on Windows, a dynamic-link library, foo.dll)
    compiled model
  • foo.instantiated : R object
    model instance (with data)
  • foo.rda : file in R data formats storing a single serialized R object
    serialized model
  • foo.adapt : custom R object
    adaptation information for 1 chain
  • foo.fit.single : custom R object
    fitted model for 1 chain
    Note: harmonize variable names with Stan, chains -> num_chains, warmup -> num_warmup, n_save -> num_save
  • foo.fit.multiple : custom R object
    fitted model for multiple chains
    Note: harmonize variable names with Stan
  • foo.fit : custom R object
    fitted model, with simulations from multiple chains and also diagnostic information, seeds, and everything else that is needed to replicate and evaluate the fit
    Note: harmonize variable names with Stan
  • foo.mle.fit : custom R object
    fitted model from Stan's point estimation
  • foo.coda : custom R object
    posterior samples, etc., independent of model

QUESTIONS

  1. Do we allow a model to be specified via a string rather than a file? Lots of R users seem to take this approach, and it allows a certain degree of automation of models, but it'll be a bit of a pain to support with make-like behavior.
  2. How do foo.fit.single, foo.fit.multiple and foo.fit relate to each other? (In particular, chains need to be stored contiguously in memory if they are to be converted without a copy into an Eigen Matrix for Stan's consumption, e.g., for split_rhat calculations.)
  3. Do the fit objects include the model?
  4. How does the foo.mle.fit relate to the MCMC fit objects, if at all?
  5. How should seeds be handled in the low-level functions so that chaining the low-level calls together gives the same result as the higher-level functions?

Main low-level functions

  • foo.so <- stan_compile(“foo.stan”, boost_lib, eigen_lib, verbosity): returns a compiled Stan program
  • foo.instantiated <- stan_instantiate(foo.so, data, verbosity): returns an instantiated Stan program (i.e., compiled program with data, so it's all good to go); also need to be able to initialize from random inits with specified interval or initialize with all 0 values
  • foo.adapt <- stan_adapt(foo.instantiated, algorithm(sub_arguments), init, seed, chain_id, n_warmup, verbosity): does the adaptation for one chain using the specified instantiated model, returns a list with the relevant state variables (values of all the parameters, step size, mass matrix, anything else that is needed for the specified algorithm to go) and also various diagnostics (number of leapfrog steps per iteration, Metropolis acceptance rates, anything else that might be useful)
  • foo.fit.single <- stan_simulate(foo.instantiated, foo.adapt, algorithm(sub_arguments), parameters_to_save, parameters_to_not_save, init, seed, chain_id, n_iter, verbosity): runs one chain using the specified instantiated model, returns a list including the simulated parameter values (in matrix or list form, I don't know which is easier) along with various diagnostics (number of leapfrog steps per iteration, Metropolis acceptance rates, anything else that might be useful)
  • foo.fit.multiple <- stan_combine(list(foo.fit.single.1,foo.fit.single.2,...), verbosity): combines several chains into a single object (a list or a 3-way array, I don't know which is easier)
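Chaining the proposed low-level calls for a four-chain run might look like the sketch below. All function names and arguments follow the proposed API above, not existing code; `my_data`, `nuts()`, and the seed/iteration values are placeholders:

```r
# Sketch of the proposed low-level workflow (hypothetical API).
foo.so <- stan_compile("foo.stan", boost_lib, eigen_lib, verbosity = 1)
foo.instantiated <- stan_instantiate(foo.so, data = my_data, verbosity = 1)

chains <- lapply(1:4, function(id) {
  # adapt and then simulate one chain; the same seed with a chain_id
  # offset would let a single chain be reproduced in isolation
  # (this is question 5 above)
  a <- stan_adapt(foo.instantiated, algorithm = nuts(), init = "random",
                  seed = 1234, chain_id = id, n_warmup = 500, verbosity = 1)
  stan_simulate(foo.instantiated, a, algorithm = nuts(),
                parameters_to_save = NULL, parameters_to_not_save = NULL,
                init = "random", seed = 1234, chain_id = id,
                n_iter = 500, verbosity = 1)
})
foo.fit.multiple <- stan_combine(chains, verbosity = 1)
```

Running the adapt/simulate pairs inside `lapply()` also shows where chain-level parallelism (e.g., `parallel::mclapply`) would slot in.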

Other low-level functions

  • foo.point.estimate <- stan_optimize(foo.instantiated, algorithm(sub_arguments), parameters_to_save, parameters_to_not_save, init, seed, verbosity): returns a list with parameter estimates, standard errors, covariance matrix, and diagnostic information from the fit.
  • parameters.new <- stan_transform(foo.instantiated, parameters) to transform forward and back
  • log.prob <- stan_logprob(foo.instantiated, parameters, format)
  • gradient <- stan_gradient(foo.instantiated, parameters, format)
  • hessian <- stan_hessian(foo.instantiated, parameters, format) In the above 3 functions, the format argument tells whether these are on the original or transformed scale, also they could be supplied as a list of named parameters or as a vector with all of them strung together.
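A hypothetical use of the transform and derivative functions, under the assumption that `format` distinguishes the constrained/unconstrained scale (the parameter values are placeholders):

```r
# Hypothetical API sketch: transform, log-prob, and derivatives.
theta <- list(alpha = 0.5, beta = matrix(0, 2, 4))

# map constrained parameters to the unconstrained scale (and back)
theta_unc <- stan_transform(foo.instantiated, theta)

# format = "unconstrained" would indicate theta_unc is on the
# transformed scale; a list-vs-vector flag could also live here
lp <- stan_logprob(foo.instantiated, theta_unc, format = "unconstrained")
g  <- stan_gradient(foo.instantiated, theta_unc, format = "unconstrained")
H  <- stan_hessian(foo.instantiated, theta_unc, format = "unconstrained")
```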

The make function:

stan_make("foo.stan", data, file_save): This creates (if necessary) the compiled model foo.so in the R workspace and, if file_save = TRUE, also saves it as foo.rda. If data are supplied, this call also creates (if necessary) the instantiated model and saves it as foo.instantiated in the R workspace. Optional arguments can specify which directory to use and what names to give the R objects and files.
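A sketch of the intended make-like behavior (hypothetical API; `my_data` is a placeholder):

```r
# Hypothetical: recompile only if foo.stan is newer than the saved
# foo.rda, and re-instantiate only if the data have changed.
stan_make("foo.stan", data = my_data, file_save = TRUE)
# afterwards foo.so and foo.instantiated exist in the workspace,
# and foo.rda holds the serialized compiled model for later sessions
```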

Workhorse user-facing functions:

  • foo.fit <- stan(“foo.stan”, algorithm(sub_arguments), parallelism, parameters_to_save, parameters_to_not_save, init, seed, n_iter, n_warmup, thin, n_chains, verbosity), returns a list with the multiple chains object and also the diagnostics returned from the adaptation and the run for each chain.
  • foo.mle.fit <- stan_mle(“foo.stan”, algorithm(sub_arguments), parameters_to_save, parameters_to_not_save, init, seed, verbosity), actually I'm not sure now how this differs from stan_optimize. We'll have various bells and whistles but they'll be going into the extractor functions anyway.
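The one-call interface might be invoked like this (hypothetical API; all argument values are placeholders). Per question 5 above, with the same seed it should reproduce the draws obtained by chaining the low-level compile/instantiate/adapt/simulate/combine calls by hand:

```r
# Hypothetical workhorse call: 4 chains, 500 warmup + 500 saved each.
foo.fit <- stan("foo.stan", algorithm = nuts(), parallelism = 4,
                init = "random", seed = 1234,
                n_iter = 1000, n_warmup = 500, thin = 1, n_chains = 4,
                verbosity = 1)
```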

Extractor functions for mcmc

  • stan_sims_list(foo.fit.multiple), returns a named list with one element for each parameter (or generated quantity) returned.
  • stan_sims_array(foo.fit.multiple), returns a 3-dimensional array of iterations x chains x scalar parameters
  • stan_sims_matrix(foo.fit.multiple), returns a matrix of simulations x scalar parameters.
  • stan_summary(foo.sims.array, quantiles), returns a matrix whose rows are scalar parameters and whose columns are "mean", "se_mean", "sd", "quantile_0.1", "quantile_0.5", "quantile_0.9", "n_eff", "R_hat" (here I'm assuming that quantiles=c(.1,.5,.9)).
  • stan_waic_xval(foo.sims.array), returns a list of waic, elpd_waic, p_waic, elpd_loo, p_loo, waic_summary_pointwise, waic_summary_total, and waic_summary_se. (For details of what these are, see the function on pages 14-15 of this paper: http://www.stat.columbia.edu/~gelman/research/unpublished/waic_stan.pdf). The function only works if one of the returned parameters in foo.fit.multiple is called log_lik; if there is no log_lik, it can return NULL.

To get a sense of what these functions return, consider an example in which the fitted model returns two parameters, alpha (a scalar) and beta (a 2x4 matrix), plus lp__ (a scalar). And suppose that we have 4 chains with 500 iterations saved per chain. Then the above functions will return the following things:

  • stan_sims_list() returns a list with three elements: "alpha" which is a vector of length 2000, "beta" which is a 2000x2x4 array, and "lp__" which is a vector of length 2000. The iterations will have been scrambled using the scramble ordering saved in the object foo.fit.multiple.
  • stan_sims_array() returns a 500x4x10 array corresponding to the 500 iterations (in order), the 4 chains, and the 10 parameters (with names "alpha", "beta[1,1]", ..., "beta[2,4]", "lp__").
  • stan_sims_matrix() returns a 2000x10 matrix corresponding to the 2000 iterations (scrambled in the same order as those in stan_sims_list) and the 10 parameters (with names "alpha", "beta[1,1]", ..., "beta[2,4]", "lp__").
  • stan_summary() returns a 10x8 matrix whose rows are "alpha", "beta[1,1]", ..., "beta[2,4]", "lp__" and whose columns are "mean", "se_mean", "sd", "quantile_0.1", "quantile_0.5", "quantile_0.9", "n_eff", "R_hat"
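The relationship between the array and matrix layouts can be checked in plain base R (the draws below are random placeholders, not Stan output; a real stan_sims_matrix would additionally apply the saved scramble ordering to the rows):

```r
n_iter <- 500; n_chains <- 4
par_names <- c("alpha",
               paste0("beta[", outer(1:2, 1:4, paste, sep = ","), "]"),
               "lp__")
sims_array <- array(rnorm(n_iter * n_chains * 10),
                    dim = c(n_iter, n_chains, 10),
                    dimnames = list(NULL, NULL, par_names))

# collapse iterations and chains into 2000 rows (chain-major, since R
# arrays are column-major), keeping one column per scalar parameter
sims_matrix <- matrix(sims_array, nrow = n_iter * n_chains,
                      dimnames = list(NULL, par_names))
stopifnot(all(dim(sims_matrix) == c(2000, 10)))
```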

Extractor functions for point estimation

  • stan_estimate(foo.mle.fit), returns a vector or list of point estimates.
  • stan_se(foo.mle.fit), returns a vector or list of standard errors.
  • stan_mle_sims(foo.mle.fit, algorithm(sub_arguments)), returns a list or matrix of simulations based on an approximate model fit. Here the algorithm could be a normal approximation, stabilized importance sampling, or wedge sampling.
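For the normal-approximation case, stan_mle_sims() might do something like the following base-R sketch, drawing from N(theta_hat, Sigma) using the mode and covariance from the point-estimation fit (the estimates and covariance below are made-up placeholders):

```r
# Placeholder MLE output: mode and covariance of two scalar parameters.
theta_hat <- c(alpha = 1.2, beta = -0.7)
Sigma <- matrix(c(0.04, 0.01,
                  0.01, 0.09), 2, 2)

# draw n_sims multivariate normal simulations via a Cholesky factor
n_sims <- 1000
z <- matrix(rnorm(n_sims * 2), n_sims, 2)
sims <- sweep(z %*% chol(Sigma), 2, theta_hat, "+")
colnames(sims) <- names(theta_hat)
```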

Ben's counter-proposal

The basic idea here is to use reference classes http://stat.ethz.ch/R-manual/R-devel/library/methods/html/refClass.html so that the calling is as similar as possible to PyStan https://pystan.readthedocs.org/en/latest/ with the inevitable exception that R uses a $ where Python and every other language uses a .

Objects

Just foo, which can be constructed via something like stan_make() or stan() in Andrew / Bob's proposal

Fields

  • dso [cxxdso]
  • options [list]
  • model_name [character]
  • model_code [character]
  • model_cpp [list]
  • model_pars [character]
  • par_dims [list]
  • sims [list]
  • inits [list]
  • date [character]
  • data [list]

Main low-level methods

  • foo$initialize(file, model_code)
  • foo$compile()
  • foo$set_data(list)
  • foo$adapt(warmup)
  • foo$sample(iterations)
  • foo$set_option(silent = TRUE, write_csv = FALSE) or whatever
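The reference-class idea can be sketched with setRefClass() from the methods package. This is real R syntax, but the method bodies below are placeholders, not an implementation, and only a subset of the fields above is shown:

```r
# Minimal sketch of Ben's proposal using R reference classes.
StanModel <- setRefClass("StanModel",
  fields = list(
    model_name = "character",
    model_code = "character",
    data       = "list",
    sims       = "list",
    options    = "list"
  ),
  methods = list(
    compile  = function() { message("compile model_code to a dso") },
    set_data = function(d) { data <<- d },
    adapt    = function(warmup) { message("run ", warmup, " warmup iterations") },
    sample   = function(iterations) { message("draw ", iterations, " iterations") }
  )
)

foo <- StanModel$new(model_name = "foo", model_code = "...",
                     data = list(), sims = list(), options = list())
foo$set_data(list(N = 10))  # PyStan-style foo$method() calling convention
```

Because reference classes have mutable fields, foo$set_data() and foo$sample() can update the object in place, which is what makes the PyStan-like calling style work in R.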

Other low-level methods

  • foo$maximize() more descriptive than optimize()
  • foo$logprob(theta) by default theta would be the mode if foo$maximize() has been called
  • foo$gradient(theta)
  • foo$hessian(theta)

Extractor methods

  • foo$get_sims(unbounded = FALSE)
  • foo$get_diagnostics()
  • foo$get_point_estimates(unbounded = FALSE)
  • foo$get_se(unbounded = FALSE)