05 Generalized Additive Model (GAM) - chanchishing/Generalized-Linear-Models-and-Nonparametric-Regression GitHub Wiki

Generalized Additive Model (GAM)

We can think of our response $Y$ as the output of a function that takes all the predictors as input:

$$ \begin{alignat*}{4}
&& Y_i &= f(x_{i,1},\cdots,x_{i,p}) && \\
&& Y_i &= f(\vec{x}_{i}) + \epsilon_i && \text{where $\vec{x}_i$ is the $i$-th covariate vector}\\
\end{alignat*} $$

We can estimate $f(\vec{x})$ as below, which is very similar to the kernel estimator of a single variable in Kernel-Smooth:

$$ \begin{alignat*}{4}
&& \hat{f}(\vec{x}) &= \dfrac{ \sum\limits_{i=1}^{n} K_H(\vec{x}-\vec{x}_i)\,y_i } { \sum\limits_{i=1}^{n} K_H(\vec{x}-\vec{x}_i) } && \\
&& \text{where } & \text{$K_H(\vec{x}-\vec{x}_i)$ is a multivariate kernel that depends on the bandwidth matrix $H$,} && \\
&& & \text{$H$ is a positive definite matrix, which means $H^{-1}$ and $H^{\frac{1}{2}}$ exist, and $|H|$ is the determinant of $H$,} && \\
&& & \text{$K_H(\vec{u})=|H|^{-\frac{1}{2}} K(\vec{z})$ where $\vec{z}=H^{-\frac{1}{2}}\vec{u}$,} && \\
&& & K(\vec{z})>0 \text{ and } \int K(\vec{z})\,d\vec{z}=1 && \\
\end{alignat*} $$

$\hat{f}(\vec{x})$ can be estimated this way, but the full multivariate estimator is complicated to compute and to interpret.
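As an illustration, here is a minimal from-scratch sketch of this multivariate kernel (Nadaraya–Watson) estimator, assuming a Gaussian kernel for $K$; the function names, data, and bandwidth matrix are illustrative, not from the source:

```python
import numpy as np

def gaussian_kernel(z):
    """Standard multivariate Gaussian kernel K(z), with K > 0 and integral 1."""
    d = z.shape[-1]
    return np.exp(-0.5 * np.sum(z ** 2, axis=-1)) / (2 * np.pi) ** (d / 2)

def nadaraya_watson(x0, X, y, H):
    """Estimate f(x0) = sum_i K_H(x0 - x_i) y_i / sum_i K_H(x0 - x_i)."""
    # H^{-1/2} via the eigendecomposition of the positive definite H
    vals, vecs = np.linalg.eigh(H)
    H_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    u = x0 - X                   # shape (n, d): one row per u = x0 - x_i
    z = u @ H_inv_sqrt           # z = H^{-1/2} u (H^{-1/2} is symmetric)
    w = gaussian_kernel(z) / np.sqrt(np.linalg.det(H))  # K_H(u) = |H|^{-1/2} K(z)
    return np.sum(w * y) / np.sum(w)

# Toy example: two symmetric points; the estimate at the midpoint is their mean
X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([0.0, 2.0])
fhat = nadaraya_watson(np.array([1.0, 1.0]), X, y, H=np.eye(2))
```

With $H = I$ both points get equal weight at the midpoint, so the estimate is simply the average of the two responses.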

In a Generalized Additive Model (GAM), we simplify the above and assume that the expectation of the response depends on a sum of functions, one applied to each predictor; the function for each predictor can be different and non-linear.

$$ \begin{alignat*}{4} && E(Y_i) &= \beta_0 + f_1(x_{i,1}) + f_2(x_{i,2}) + \cdots + f_p(x_{i,p}) && \\ \end{alignat*} $$

Note: the above can easily be modified to suit different modeling situations:

$$ \begin{alignat*}{4}
&& E(Y_i) &= \beta_0 + \beta_1 x_{i,1} + f_2(x_{i,2}) + \cdots + f_p(x_{i,p}) && \text{$x_1$ is categorical or linearly related to the response}\\
&& E(Y_i) &= \beta_0 + \beta_1 x_{i,1} + f_2(x_{i,2}) + f_{1,2}(x_{i,1}x_{i,2}) && \text{$x_1$ and $x_2$ have an interaction}\\
&& \log(\lambda_i) &= \beta_0 + \beta_1 x_{i,1} + f_2(x_{i,2}) + \cdots + f_p(x_{i,p}) && \text{non-normal response via a link function, as in a GLM (e.g. Poisson)}\\
\end{alignat*} $$
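Additive models of this form are classically fitted by backfitting: cycle over the predictors, smoothing the partial residuals against each one in turn, and center each smooth for identifiability. A minimal sketch, assuming a univariate Gaussian-kernel smoother (in practice mgcv fits GAMs with penalized regression splines, not this smoother):

```python
import numpy as np

def nw_smooth(x, r, h):
    """Univariate Gaussian-kernel smoother of r against x, at the data points."""
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return (w @ r) / w.sum(axis=1)

def backfit(X, y, h=0.15, n_iter=20):
    """Backfitting for E(Y) = beta0 + f_1(x_1) + ... + f_p(x_p)."""
    n, p = X.shape
    beta0 = y.mean()
    f = np.zeros((n, p))
    for _ in range(n_iter):
        for j in range(p):
            # Partial residuals: remove the intercept and all other smooths
            partial = y - beta0 - f.sum(axis=1) + f[:, j]
            fj = nw_smooth(X[:, j], partial, h)
            f[:, j] = fj - fj.mean()   # center each smooth (identifiability)
    return beta0, f

# Toy additive data: one sinusoidal and one quadratic component
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 2))
y = 1 + np.sin(3 * X[:, 0]) + X[:, 1] ** 2
beta0, f = backfit(X, y)
fitted = beta0 + f.sum(axis=1)
```

Each pass only ever smooths against one predictor at a time, which is what makes the additive assumption so much cheaper than the full multivariate kernel estimator above.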

Effective degrees of freedom of GAM

In a GAM, the effective degrees of freedom (edf) is analogous to the degrees of freedom of standard linear regression; it is reported by the summary() method in R for a GAM model fitted with the mgcv package. If a smooth term has an edf close to 1, that term should enter the model linearly. Note that "close to 1" has no hard, definitive cut-off value in the GAM context, so it is always a good idea to accompany this numerical evaluation with a graphical one, such as a plot of the smooth.

So when we look at the marginal relationship between a predictor and its smooth: if the smooth looks roughly linear (i.e. we could draw a straight line within the confidence band of the smooth) and the edf is also close to 1, these two things together provide some evidence that the predictor should enter the model in a linear way.
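The idea behind edf can be illustrated with linear smoothers. For any fit of the form $\hat{y} = Sy$, one common definition of effective degrees of freedom is $\mathrm{tr}(S)$; for ordinary least squares, $S$ is the hat matrix and its trace equals the number of coefficients, which is why an edf near 1 for a (centered) smooth suggests an effectively linear term. A hypothetical sketch, not mgcv's actual computation:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 100))

# OLS with intercept + slope: trace of the hat matrix is exactly 2
X = np.column_stack([np.ones_like(x), x])
hat = X @ np.linalg.inv(X.T @ X) @ X.T
edf_linear = np.trace(hat)

def smoother_matrix(x, h):
    """Row-normalized Gaussian-kernel smoother matrix S, so that y_hat = S y."""
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    return W / W.sum(axis=1, keepdims=True)

# Small bandwidth -> wiggly fit -> large edf; huge bandwidth -> near-constant
# fit -> edf close to 1
edf_wiggly = np.trace(smoother_matrix(x, 0.05))
edf_smooth = np.trace(smoother_matrix(x, 10.0))
```

Shrinking the bandwidth lets the smoother track the data more closely, and the trace (edf) grows accordingly.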

Hypothesis tests (F-tests) and $R^2$ in GAM summary() Output

The summary() output from the mgcv package for a GAM model provides an approximate F-test p-value for each smooth term, testing whether the term should stay smooth or not:

$$ \begin{alignat*}{4}
&& H_0 &: \text{the given smooth term is zero (i.e. it is linear)} &&\quad \text{vs.} \\
&& H_1 &: \text{the given smooth term is not zero (i.e. it is non-linear)} && \\
\end{alignat*} $$

If the p-value associated with a smooth term is extremely small, that provides some evidence that the term should stay smooth rather than enter linearly (reject $H_0$).
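mgcv's actual test for penalized smooths is more involved; a rough from-scratch illustration of the underlying idea is the classical approximate F-test comparing a linear fit against a kernel smooth, with the smooth's degrees of freedom taken from the trace of its smoother matrix. All function names, data, and bandwidth choices below are illustrative assumptions:

```python
import numpy as np
from scipy.stats import f as f_dist

def approx_f_test(x, y, h=0.1):
    """Approximate F-test of H0: linear fit vs H1: Gaussian-kernel smooth."""
    n = len(x)
    # H0: ordinary least squares (intercept + slope), 2 degrees of freedom
    X = np.column_stack([np.ones(n), x])
    hat = X @ np.linalg.inv(X.T @ X) @ X.T
    rss0, df0 = np.sum((y - hat @ y) ** 2), 2.0
    # H1: kernel smoother; its edf is the trace of the smoother matrix
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    S = W / W.sum(axis=1, keepdims=True)
    rss1, df1 = np.sum((y - S @ y) ** 2), np.trace(S)
    F = ((rss0 - rss1) / (df1 - df0)) / (rss1 / (n - df1))
    return F, f_dist.sf(F, df1 - df0, n - df1)

# Clearly non-linear data: the test should strongly reject the linear fit
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(6 * x) + rng.normal(0.0, 0.1, 100)
F_stat, p_val = approx_f_test(x, y)
```

For this sinusoidal data the smooth fit drops the residual sum of squares dramatically relative to the extra degrees of freedom it spends, so the p-value is tiny, matching the intuition in the text above.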

The GAM summary() output also gives the $R^2$ (deviance explained) and $R^2_a$ (adjusted deviance explained), which can be interpreted in the same way as in a standard linear model.