BayesianInformationCriterion - crowlogic/arb4j GitHub Wiki

The Bayesian Information Criterion (BIC) is a criterion for model selection in statistical inference that balances a model's goodness of fit against its complexity. The BIC is defined as:

$$\text{BIC} = p \log n -2 \log \mathcal{L}$$

where

  • $\mathcal{L}$ is the maximized value of the likelihood function of the model,
  • $p$ is the number of parameters in the model,
  • and $n$ is the sample size.
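
As a minimal sketch, the definition translates directly into code. The function name `bic` and the input numbers are illustrative assumptions, not part of any particular library; the logarithm is taken to be the natural logarithm.

```python
import math


def bic(log_likelihood: float, num_params: int, sample_size: int) -> float:
    """Return p * log(n) - 2 * log(L), using the natural logarithm.

    log_likelihood is the maximized log-likelihood, i.e. log(L).
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return num_params * math.log(sample_size) - 2.0 * log_likelihood


# Hypothetical inputs: maximized log-likelihood of -120.5,
# 3 parameters, 100 observations.
value = bic(-120.5, 3, 100)
print(value)
```

Passing the log-likelihood rather than the likelihood itself avoids numerical underflow, since $\mathcal{L}$ is typically a very small number for realistic sample sizes.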

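The BIC is based on Bayesian principles: it can be derived as a large-sample (Laplace) approximation to the marginal likelihood of the data given the model, in which the contribution of the prior over the parameter space does not grow with $n$ and is dropped. Up to that approximation, the BIC is $-2$ times the log marginal likelihood of the model. (The interpretation as an estimator of the Kullback-Leibler divergence between the true data-generating process and the model applies to the related Akaike information criterion, not to the BIC.) The penalty for model complexity, $p \log n$, increases with the number of parameters and with the sample size.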
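
The connection to the marginal likelihood can be sketched as follows, assuming the usual regularity conditions for a Laplace approximation. The symbols $D$ (the data), $M$ (the model), $\theta$ (the parameter vector), $\mathcal{L}(\theta)$ (the likelihood function, whose maximized value is $\mathcal{L}$) and $\pi(\theta)$ (the prior density) are notation introduced here for illustration:

$$\log p(D \mid M) = \log \int \mathcal{L}(\theta)\, \pi(\theta)\, d\theta \approx \log \mathcal{L} - \frac{p}{2} \log n + O(1)$$

Dropping the $O(1)$ term, which contains the prior, and multiplying by $-2$ gives

$$-2 \log p(D \mid M) \approx p \log n - 2 \log \mathcal{L} = \text{BIC}$$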

The BIC is commonly used when several candidate models have been fitted to the same data and the goal is to choose one that fits well without being overly complex. Because the maximized likelihood cannot decrease when parameters are added to a nested model, comparing likelihoods alone would always favor the largest model. The $p \log n$ penalty offsets this, so the BIC provides a way to compare models with different numbers of parameters.

A lower value of the BIC indicates a better trade-off between goodness of fit and model complexity, so models fitted to the same data can be ranked by BIC, lowest first. Only differences in BIC between models are meaningful, not the absolute values. However, the BIC should not be used in isolation, and other criteria, such as cross-validation, should also be considered to ensure the robustness of the model selection process.
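
The ranking step might look like the sketch below, which compares two Gaussian models on a made-up sample. Everything here is an illustrative assumption: the data, the two candidate models, and the helper names. The `bic_weights` helper uses the common rule of thumb that $\exp(-\Delta\text{BIC}/2)$, normalized across models, approximates posterior model probabilities when all models are given equal prior weight.

```python
import math


def gaussian_max_log_likelihood(residuals):
    """Maximized Gaussian log-likelihood with the variance set to its MLE."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def bic(log_likelihood, num_params, sample_size):
    """Return p * log(n) - 2 * log(L)."""
    return num_params * math.log(sample_size) - 2.0 * log_likelihood


def bic_weights(bic_values):
    """Normalize exp(-delta/2), where delta is the gap to the lowest BIC."""
    best = min(bic_values)
    raw = [math.exp(-0.5 * (b - best)) for b in bic_values]
    total = sum(raw)
    return [r / total for r in raw]


# Made-up sample, for illustration only.
data = [2.1, 1.9, 2.4, 1.7, 2.2, 2.0, 1.8, 2.3]
n = len(data)

# Model A: mean fixed at 0, variance estimated (p = 1).
ll_a = gaussian_max_log_likelihood(data)

# Model B: mean and variance both estimated (p = 2).
mean = sum(data) / n
ll_b = gaussian_max_log_likelihood([x - mean for x in data])

bics = [bic(ll_a, 1, n), bic(ll_b, 2, n)]
weights = bic_weights(bics)
best_index = bics.index(min(bics))  # index of the preferred model

for name, b, w in zip(["A", "B"], bics, weights):
    print(f"model {name}: BIC = {b:.3f}, weight = {w:.3g}")
```

Subtracting the lowest BIC before exponentiating keeps the weights numerically stable and reflects the fact that only BIC differences matter.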

In summary, the Bayesian Information Criterion is a criterion for model selection that balances the goodness of fit of a model against its complexity by penalizing the number of parameters in the model. The BIC is based on Bayesian principles and approximates $-2$ times the log marginal likelihood of the model. It can be used to compare models with different numbers of parameters and to rank them by their approximate plausibility given the data.