influence function - chunhualiao/public-docs GitHub Wiki
Here’s what the influence-function formula says, piece by piece.
What are influence functions?
In robust statistics, an influence function measures the first-order effect on an estimator of an infinitesimal perturbation to the data distribution. In modern ML (Koh & Liang, 2017), it is used to approximate how much a single training example $z$ would change a test loss (or a test prediction) if that example were slightly up-weighted during training.
The formula
$$ \mathcal I_{\text{up},\text{loss}}(z, z_{\text{test}}) = -\,\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top}\, H_{\hat{\theta}}^{-1}\, \nabla_{\theta} L(z, \hat{\theta}) $$
What each term means
- $z=(x,y)$: a training point.
- $z_{\text{test}}$: a test point.
- $L(z,\theta)$: per-example loss (e.g., cross-entropy).
- $\hat{\theta}$: parameters after training (ERM minimizer).
- $\nabla_{\theta} L(z, \hat{\theta})$: gradient of the training example’s loss at $\hat{\theta}$ — the direction that example wants to push the parameters.
- $H_{\hat{\theta}}$: the Hessian of the average training loss at $\hat{\theta}$ (i.e., curvature of the objective around the solution).
- $H_{\hat{\theta}}^{-1}$: rescales those directions by the local curvature (flat directions are amplified; high-curvature directions are damped).
- $\nabla_{\theta} L(z_{\text{test}}, \hat{\theta})^{\top}$: says “how does a small parameter change affect the test loss?”
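With these pieces in place, the score can be computed directly on a small strictly convex model where the Hessian is cheap to form. A minimal sketch, using a ridge-regularized linear regression on made-up data (the model, data, and test point are all illustrative assumptions, not from the original):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
lam = 0.1  # ridge strength keeps the Hessian positive definite

# ERM minimizer of (1/n) sum_i 0.5*(x_i^T theta - y_i)^2 + 0.5*lam*||theta||^2.
# For this quadratic objective the Hessian H is constant in theta.
H = X.T @ X / n + lam * np.eye(d)
theta_hat = np.linalg.solve(H, X.T @ y / n)

def grad_loss(x, y_, theta):
    """Gradient of the per-example squared loss 0.5*(x^T theta - y)^2."""
    return x * (x @ theta - y_)

x_test, y_test = rng.normal(size=d), 0.3   # a made-up test point
g_test = grad_loss(x_test, y_test, theta_hat)
g_train = grad_loss(X[0], y[0], theta_hat)

# I_up,loss(z, z_test) = -g_test^T H^{-1} g_train
influence = -g_test @ np.linalg.solve(H, g_train)
print(influence)
```

Note that in practice one solves $H v = \nabla_\theta L(z,\hat\theta)$ rather than ever forming $H^{-1}$ explicitly.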
Interpretation
- $\mathcal I_{\text{up},\text{loss}}(z, z_{\text{test}})$ is the first-order change in the test loss at $z_{\text{test}}$ if you up-weight the training point $z$ by an infinitesimal amount during training.
- Sign:
  - Positive → up-weighting $z$ would increase the test loss at $z_{\text{test}}$ (harmful).
  - Negative → it would decrease the test loss (helpful).
- Magnitude: strength of that effect.
Why the minus sign? If you up-weight $z$ by an infinitesimal weight $\epsilon$, the ERM optimum $\hat{\theta}$ moves in direction

$$ \frac{d\hat{\theta}}{d\epsilon}\Big|_{\epsilon=0} = -\,H_{\hat{\theta}}^{-1}\nabla_{\theta} L(z,\hat{\theta}). $$
Propagating this parameter change to the test loss via the chain rule gives the formula above.
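This derivative can be checked numerically. For a ridge-regularized quadratic objective the $\epsilon$-re-weighted minimizer is available in closed form, so a finite difference in $\epsilon$ should match $-H_{\hat\theta}^{-1}\nabla_\theta L(z,\hat\theta)$. A sketch on synthetic data (all numbers here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 40, 2
X = rng.normal(size=(n, d))
y = X @ np.array([0.7, -1.2]) + 0.05 * rng.normal(size=n)
lam = 0.1

H = X.T @ X / n + lam * np.eye(d)
theta_hat = np.linalg.solve(H, X.T @ y / n)

def minimizer(eps, x0, y0):
    """Exact minimizer after adding eps * L(z0, theta) to the objective."""
    return np.linalg.solve(H + eps * np.outer(x0, x0),
                           X.T @ y / n + eps * x0 * y0)

x0, y0 = X[0], y[0]
g0 = x0 * (x0 @ theta_hat - y0)        # grad of per-example loss at theta_hat
predicted = -np.linalg.solve(H, g0)    # closed-form d theta_hat / d eps at 0

eps = 1e-5
finite_diff = (minimizer(eps, x0, y0) - theta_hat) / eps
print(np.max(np.abs(predicted - finite_diff)))   # should be ~0
```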
Common variants
- Influence on parameters: $\mathcal I_{\text{up},\theta}(z) = -H_{\hat{\theta}}^{-1}\nabla_{\theta} L(z,\hat{\theta})$.
- Approx. leave-one-out effect: removing $z$ changes test loss by about $\frac{1}{n}\nabla_{\theta} L(z_{\text{test}},\hat{\theta})^{\top} H_{\hat{\theta}}^{-1}\nabla_{\theta} L(z,\hat{\theta})$.
Assumptions & practice notes
- Loss is twice differentiable; $H_{\hat{\theta}}$ is (locally) invertible/PD.
- Deep nets violate strict convexity; people use damping $(H+\lambda I)^{-1}$ and compute Hessian–vector products (CG/LiSSA) to avoid forming $H$.
- Works best near the trained optimum and for small perturbations.
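The practical recipe above — damping plus Hessian-vector products — can be sketched with a hand-rolled conjugate-gradient solver that only ever touches $H$ through matrix-vector products. The stand-in Hessian here is a small PSD matrix chosen for illustration; for a real deep net, the HVP would come from autodiff and the damping $\lambda$ must dominate any negative curvature:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 5
A = rng.normal(size=(d, d))
H_sym = A.T @ A / d                # PSD stand-in for the (huge, implicit) Hessian
damping = 0.1                      # lambda in (H + lambda*I)^{-1}

def hvp(v):
    """Damped Hessian-vector product (H + lambda*I) v — the only access to H."""
    return H_sym @ v + damping * v

def conjugate_gradient(matvec, b, tol=1e-14, max_iter=100):
    """Solve matvec(x) = b with CG, using only matrix-vector products."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:           # squared residual norm small enough
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

g = rng.normal(size=d)             # stand-in for a loss gradient
v = conjugate_gradient(hvp, g)     # approximates (H + lambda*I)^{-1} g
print(np.max(np.abs(hvp(v) - g)))  # residual should be tiny
```

LiSSA replaces CG with a truncated Neumann-series iteration but uses the same HVP-only access pattern.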
Takeaway
The score is a fast, first-order causal attribution from training points to test performance: it tells you which training examples help or hurt a given test example (or an entire test set, by summing over $z_{\text{test}}$).