Frequently Asked Questions - jsilve24/fido GitHub Wiki

std::bad_alloc error when inferring a model in fido

There are several possible causes, but the most common is a lack of RAM on your computer. The most memory-intensive component is often storing the Hessian used for the Laplace approximation. A crude estimate of the memory requirement, in gigabytes, is 11*(N*D)^2/(10^9), where N is the number of samples and D is the number of multinomial categories. If you don't have enough memory, fido has an alternative uncertainty quantification algorithm called the Multinomial-Dirichlet Bootstrap. Basically, it's multinomial-Dirichlet draws centered on the MAP estimate. Suppose the following line is your call to fit a pibble model:

fit <- pibble(Y, X)

You can change it to:

fit <- pibble(Y, X, multDirichletBoot=0.5, calcGradHess=FALSE)

The parameter multDirichletBoot can be any scalar > 0 and gives the prior parameter for the Dirichlet (think of it as a pseudo-count, so values > 1 are probably not what you want). Note: currently only available on the development branch.
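To make the pseudo-count interpretation concrete, here is a minimal base-R sketch of Dirichlet draws centered on a set of MAP proportions for a single sample. It is one reading of the description above, not fido's internal code: the helper name dirichlet_boot, the choice alpha = n_total * pi_map + pseudo, and the example numbers are all assumptions made for illustration.

```r
# Hypothetical helper, base R only; NOT part of fido.
# pi_map:    MAP proportions for one sample (length D, sums to 1)
# n_total:   total count observed for that sample
# pseudo:    pseudo-count added to every category (plays the role of multDirichletBoot)
# n_samples: number of draws to generate
dirichlet_boot <- function(pi_map, n_total, pseudo = 0.5, n_samples = 2000) {
  # Assumed Dirichlet parameters: expected counts plus the pseudo-count
  alpha <- n_total * pi_map + pseudo
  # Dirichlet draws are independent gamma draws normalized to sum to 1.
  # shape is recycled down each column, so row k always uses alpha[k].
  g <- matrix(rgamma(n_samples * length(alpha), shape = alpha),
              nrow = length(alpha))
  sweep(g, 2, colSums(g), "/")  # D x n_samples matrix of proportions
}

set.seed(1)
draws <- dirichlet_boot(c(0.7, 0.2, 0.1), n_total = 100, pseudo = 0.5)
rowMeans(draws)  # close to alpha / sum(alpha)
```

Under this reading, a larger pseudo-count pulls the draws toward equal proportions across categories, which is why values above 1 are rarely what you want.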

Another option is to just use the MAP estimate without uncertainty quantification. This is probably fine for some people who are using fido just for hypothesis generation:

fit <- pibble(Y, X, n_samples=0, calcGradHess=FALSE)

Rarely, you could have memory problems because you are trying to produce too many posterior samples (n_samples is too big for the memory on your computer). There are a number of ways around this, but I have yet to hear of anyone running into this problem (so I will write up those solutions later).
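To decide which of these options you need, the crude Hessian memory rule of thumb from the start of this answer can be written as a small helper. This is only a sketch of that formula; the name hessian_gb is hypothetical and not a fido function.

```r
# Rule of thumb from this FAQ: 11 * (N * D)^2 / 10^9 gigabytes,
# where N = number of samples and D = number of multinomial categories.
# hessian_gb() is a hypothetical helper name, NOT part of fido.
hessian_gb <- function(N, D) {
  # as.numeric() avoids integer overflow for large N * D
  11 * (as.numeric(N) * as.numeric(D))^2 / 1e9
}

hessian_gb(100, 50)    # 0.275 GB: fine on most laptops
hessian_gb(1000, 100)  # 110 GB: consider multDirichletBoot or n_samples=0
```

Because the estimate grows with the square of N*D, doubling either the number of samples or the number of categories roughly quadruples the memory needed.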

Warnings about Large Negative Eigenvalues or "Laplace Approximation Failed"

You may have seen the following warning message or some variation on it:

Warning messages:
1: In optimPibbleCollapsed(Y, upsilon, Theta %*% X, KInv, AInv, init,  :
  Cholesky of Hessian failed with status status Eigen::NumericalIssue
2: In optimPibbleCollapsed(Y, upsilon, Theta %*% X, KInv, AInv, init,  :
  Decomposition of Hessian Failed, returning MAP Estimate only
3: In (function (Y = NULL, X = NULL, upsilon = NULL, Theta = NULL,  :
  Laplace Approximation Failed, using MAP estimate of eta to obtain Posterior mean of Lambda and Sigma (i.e., not sampling from the posterior distribution of Lambda or Sigma)

Here is what's happening: the optimization did not end close enough to the true optimum for eta, and as a result the Laplace approximation ran into numerical issues. When that happens, fido by default returns just the MAP estimate (a single "sample"). This can happen "randomly", i.e., the initial values of the optimization were just sub-par and, as a result, the optimization path was less than ideal. More often, however, I find that these numerical issues come up when priors are very much at odds with the posterior (e.g., when your prior beliefs are way off compared to what the data is saying). In these cases, the Laplace approximation can have a very narrow margin for error: if the optimizer does not land exactly on the MAP estimate, you can end up with a poorly conditioned posterior Hessian matrix. Here are a few steps I find typically solve this problem:

  1. Try restarting the optimization using a different initial guess for eta (the init argument to any of the fitting functions).
  2. Try decreasing (or even increasing) step_size in increments of 0.001 or 0.002 (the step_size argument to any of the fitting functions; the default is 0.003) and increasing max_iter. You can also try increasing b1 to 0.99 and decreasing eps_f by a few orders of magnitude.
  3. Try relaxing your prior assumptions about the covariance matrix (e.g., consider decreasing the parameter upsilon closer to its minimum value of D).
  4. Try adding a small amount of jitter (e.g., set jitter=1e-5) to address potential floating point errors (the jitter argument to any of the fitting functions).
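If you find yourself working through these steps by hand, one way to automate them is a small wrapper that tries a list of argument settings in order and stops at the first fit that raises no warning. This is a sketch under assumptions: try_settings is a hypothetical helper (not part of fido), it treats any warning as a failure rather than only the Laplace-related ones, and the values in the commented usage are illustrative rather than recommended.

```r
# Hypothetical helper, base R only; NOT part of fido.
# fit_fun:  a function that fits the model, e.g. function(...) pibble(Y, X, ...)
# settings: a list of named argument lists, tried in order
try_settings <- function(fit_fun, settings) {
  for (args in settings) {
    warned <- FALSE
    fit <- withCallingHandlers(
      do.call(fit_fun, args),
      warning = function(w) {
        warned <<- TRUE                 # remember that this attempt warned
        invokeRestart("muffleWarning")  # and let the fit finish quietly
      }
    )
    if (!warned) return(list(fit = fit, args = args))
  }
  NULL  # every setting produced a warning
}

# Usage with fido (commented out because it needs your own Y and X):
# settings <- list(
#   list(),                                     # defaults
#   list(step_size = 0.002, max_iter = 20000),  # step 2
#   list(b1 = 0.99, eps_f = 1e-13),             # step 2, continued
#   list(jitter = 1e-5)                         # step 4
# )
# res <- try_settings(function(...) pibble(Y, X, ...), settings)
```

The returned args element tells you which setting finally worked, so you can hard-code it in your script instead of repeating the search.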