# Model Evaluation and Selection Steps (utkaln/machine-learning GitHub Wiki)
## Concept of Bias and Variance
| Observation | Training Data Error | Validation Data Error | Bias / Variance |
|---|---|---|---|
| Model fits the data very loosely (underfitting) | High | High | High Bias |
| Model fits the training data extremely tightly (overfitting) | Low | High | High Variance |
| Model fits the data closely without chasing noise | Low | Low | Right Fit |
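The pattern in the table can be reproduced with a small experiment. The following is a minimal sketch, assuming synthetic quadratic data and comparing an underfit (degree 1), well-fit (degree 2), and overfit (degree 15) polynomial:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic quadratic data with noise (an illustrative assumption)
x = rng.uniform(-3, 3, 60)
y = x**2 + rng.normal(0, 1.0, 60)

# First 40 points for training, last 20 for validation
x_train, y_train = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

def mse(degree):
    """Fit a polynomial of `degree` on the training split and
    return (training MSE, validation MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    err = lambda xs, ys: np.mean((np.polyval(coeffs, xs) - ys) ** 2)
    return err(x_train, y_train), err(x_val, y_val)

for d in (1, 2, 15):
    train_err, val_err = mse(d)
    print(f"degree={d:2d}  train MSE={train_err:7.3f}  val MSE={val_err:7.3f}")
```

Training error can only go down as the degree grows, while validation error eventually rises again; that divergence is the signature of high variance.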
## Impact of Degree of Polynomial on Bias and Variance

- There are several factors to take into account when choosing the right model and the right parameters. This is done by:
  - Splitting the data into `training`, `validation`, and `test` sets
  - Evaluating how the model performs as a `regression` model vs. a `classification` model
  - Adding polynomial features (higher-degree terms) to Linear Regression models to evaluate performance
  - Comparing different Neural Network architectures
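The first step, splitting the data, might look like the sketch below; the 60/20/20 split ratio is an illustrative choice, not a rule from these notes:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))   # 100 examples, 3 features (illustrative)
y = rng.normal(size=100)

# Shuffle once, then carve out 60% train / 20% validation / 20% test
idx = rng.permutation(len(X))
n_train = int(0.6 * len(X))
n_val = int(0.2 * len(X))
train_idx = idx[:n_train]
val_idx = idx[n_train:n_train + n_val]
test_idx = idx[n_train + n_val:]

X_train, X_val, X_test = X[train_idx], X[val_idx], X[test_idx]
print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

The validation set is used to compare models (degrees, architectures, lambda values); the test set is touched only once, at the end, for an unbiased estimate.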
- Also note that bias or variance should be contrasted only with a baseline level of performance. Baseline performance can be taken from the error rate humans achieve on the task, or from another model whose performance was previously accepted.
  - A significant gap between the `baseline error` and the `training error` indicates `High Bias`
  - A significant gap between the `training error` and the `validation error` indicates `High Variance`
  - Significant gaps both between `baseline error` and `training error` and between `training error` and `validation error` indicate both `High Bias` and `High Variance`
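These gap rules can be written as a small helper. The `gap` threshold and the example error values below are illustrative assumptions; what counts as "significant" depends on the problem:

```python
def diagnose(baseline_err, train_err, val_err, gap=0.05):
    """Classify bias/variance from error gaps.

    `gap` is an illustrative threshold for a 'significant' difference.
    """
    high_bias = (train_err - baseline_err) > gap      # baseline -> training gap
    high_variance = (val_err - train_err) > gap       # training -> validation gap
    if high_bias and high_variance:
        return "High Bias and High Variance"
    if high_bias:
        return "High Bias"
    if high_variance:
        return "High Variance"
    return "Looks OK"

print(diagnose(0.02, 0.15, 0.17))  # large baseline->training gap: High Bias
print(diagnose(0.02, 0.03, 0.20))  # large training->validation gap: High Variance
print(diagnose(0.02, 0.15, 0.40))  # both gaps: High Bias and High Variance
```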
## Detect Bias or Variance
| Model Trait | Training Data Error | Validation Data Error | Degree |
|---|---|---|---|
| Bias (Underfit) | High | High | Low |
| Variance (Overfit) | Very Low | High | High |
| Right Fit | Low | Low | Middle |
## Impact of Regularization on Bias and Variance
- Regularization adds a penalty term to the loss function that increases or reduces the influence of the parameters (`w`)
- If `lambda` is chosen to be a very high number (say 10,000), the effect of all `w` parameters shrinks toward zero and the model reduces to roughly the constant `b`. This causes `high bias`, i.e. underfitting
- Conversely, if `lambda` is chosen to be a very low number (say 0.0001), the `w` parameters are left essentially unconstrained and dominate the model. This causes `high variance`, i.e. overfitting
| Model Trait | Training Data Error | Validation Data Error | Lambda |
|---|---|---|---|
| Bias (Underfit) | High | High | Very High |
| Variance (Overfit) | Very Low | High | Very Low |
| Right Fit | Low | Low | Middle |
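The effect of `lambda` can be illustrated with a closed-form regularized least-squares (ridge) fit. The data, feature degree, and lambda values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(0, 0.2, 60)   # illustrative data

# Degree-9 polynomial features: flexible enough to overfit
def features(x, degree=9):
    return np.vander(x, degree + 1, increasing=True)

X_train, y_train = features(x[:40]), y[:40]
X_val, y_val = features(x[40:]), y[40:]

def ridge_fit(X, y, lam):
    # Closed-form regularized least squares:
    #   w = (X^T X + lam * I)^{-1} X^T y
    # (for simplicity the bias column is regularized too)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

for lam in (1e-4, 1.0, 1e4):
    w = ridge_fit(X_train, y_train, lam)
    train_mse = np.mean((X_train @ w - y_train) ** 2)
    val_mse = np.mean((X_val @ w - y_val) ** 2)
    print(f"lambda={lam:10.4f}  train MSE={train_mse:.3f}  val MSE={val_mse:.3f}")
```

A very large lambda shrinks `w` toward zero and drives training error up (high bias); a very small lambda leaves the degree-9 model free to chase the noise (high variance).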

## Reduce Bias and Variance
| Options | Concern Addressed |
|---|---|
| Get more training examples | High Variance |
| Try a smaller set of features | High Variance |
| Get additional features | High Bias |
| Add polynomial features (higher degrees) | High Bias |
| Decrease Lambda (Regularization factor) | High Bias |
| Increase Lambda (Regularization factor) | High Variance |
## Reduce Bias / Variance in Neural Networks

- Note: A larger network will generally perform as well as or better than a smaller one, as long as the regularization factor is chosen correctly; the main trade-off is additional computational cost
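As a rough illustration of that note, assuming scikit-learn is available, one can compare a small network, a large network, and a large network with stronger L2 regularization (`alpha` is scikit-learn's L2 strength); the network sizes and values here are arbitrary choices:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic two-class data (illustrative)
X, y = make_moons(n_samples=400, noise=0.25, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Small net vs. large net vs. large net with stronger L2 regularization
for hidden, alpha in [((4,), 1e-4), ((64, 64), 1e-4), ((64, 64), 1.0)]:
    clf = MLPClassifier(hidden_layer_sizes=hidden, alpha=alpha,
                        max_iter=500, random_state=0)
    clf.fit(X_train, y_train)
    print(hidden, alpha,
          f"train acc={clf.score(X_train, y_train):.2f}",
          f"val acc={clf.score(X_val, y_val):.2f}")
```

With appropriate `alpha`, the larger network's validation accuracy should match or beat the small one, consistent with the note above.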