Model Evaluation and Selection Steps - utkaln/machine-learning GitHub Wiki
Concept of Bias and Variance
Observation | Training Data Error | Validation Data Error | Bias / Variance |
---|---|---|---|
Model fits the data very loosely (underfitting) | High | Very High | High Bias |
Model fits the data extremely tightly (overfitting) | Low | High | High Variance |
Model fits the data closely | Low | Low | Right Fit |
Impact of Degree of Polynomial on Bias and Variance
There are several factors to take into account when choosing the right model and the right parameters. This is done by:
- Splitting data into `training`, `validation` and `test` sets
- Evaluating how the model performs as a `regression` model vs. a `classification` model
- Adding polynomial features (higher degree terms) to Linear Regression models to evaluate performance
- Comparing different Neural Network architectures
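The first step above, splitting the data, can be sketched with plain NumPy. The 60/20/20 split ratio and the synthetic dataset are illustrative assumptions, not prescribed by this page:

```python
import numpy as np

# Hypothetical dataset: 100 examples, 3 features (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.normal(size=100)

# Shuffle once, then carve out 60% training, 20% validation, 20% test.
idx = rng.permutation(len(X))
train_idx, val_idx, test_idx = idx[:60], idx[60:80], idx[80:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

The validation set is used to compare candidate models (degrees, architectures, Lambda values); the test set is touched only once, at the end, for an unbiased estimate.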
- Also note that bias or variance should be judged only against a baseline level of performance. The baseline can be the error rate humans achieve on the task, or the accepted performance of a previously established model.
  - If there is a significant gap between `baseline error` and `training error`, that indicates `High Bias`
  - If there is a significant gap between `training error` and `validation error`, that indicates `High Variance`
  - If there are significant gaps both between `baseline error` and `training error` and between `training error` and `validation error`, that indicates `High Bias` and `High Variance`
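The gap rules above can be sketched as a small helper. The function name `diagnose` and the gap threshold of 0.05 are illustrative assumptions:

```python
def diagnose(baseline_err, train_err, val_err, gap=0.05):
    """Classify bias/variance from error gaps (threshold is illustrative)."""
    issues = []
    if train_err - baseline_err > gap:   # big gap baseline -> training: high bias
        issues.append("High Bias")
    if val_err - train_err > gap:        # big gap training -> validation: high variance
        issues.append("High Variance")
    return issues or ["Right Fit"]

# Example: human baseline of 2% error.
print(diagnose(0.02, 0.15, 0.16))  # ['High Bias']
print(diagnose(0.02, 0.03, 0.20))  # ['High Variance']
print(diagnose(0.02, 0.15, 0.30))  # ['High Bias', 'High Variance']
```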
Detect - Bias or Variance
Model Trait | Training Data Error | Validation Data Error | Polynomial Degree |
---|---|---|---|
Bias (Underfit) | High | High | Low |
Variance (Overfit) | Very Low | High | High |
Right Fit | Low | Low | Middle |
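The table above can be reproduced by sweeping polynomial degree and comparing training vs. validation error. This is a minimal sketch: the synthetic cubic dataset and the particular degrees (1, 3, 10) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data drawn from a cubic with noise (illustrative assumption).
x = rng.uniform(-2, 2, size=80)
y = x**3 - x + rng.normal(scale=0.3, size=x.shape)
x_train, y_train = x[:60], y[:60]
x_val, y_val = x[60:], y[60:]

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on (xs, ys)."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

errors = {}
for degree in (1, 3, 10):   # low degree -> bias, high degree -> variance
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(degree, errors[degree])
```

Degree 1 underfits (high error on both sets), while the degree matching the true function fits well on both; very high degrees drive training error down faster than validation error.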
Impact of Regularization on Bias and Variance
- Regularization adds a numerical penalty term to the loss function that scales down the effect of the weight parameters (`w`)
- If `Lambda` is chosen to be a very high number (say 10,000), the effect of all the `w` parameters shrinks toward zero and the prediction reduces to just `b`. This causes `high bias`, or underfitting.
- On the other hand, if `Lambda` is chosen to be a very low number (say 0.0001), the regularization has almost no effect and the `w` parameters are left unconstrained. This causes `high variance`, or overfitting.
Model Trait | Training Data Error | Validation Data Error | Lambda |
---|---|---|---|
Bias (Underfit) | High | High | Very High |
Variance (Overfit) | Very Low | High | Very Low |
Right Fit | Low | Low | Middle |
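The shrinking effect of `Lambda` can be seen directly with closed-form ridge regression, `w = (XᵀX + λI)⁻¹ Xᵀy`. A minimal NumPy sketch on illustrative data (the dataset and Lambda values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
true_w = np.array([2.0, -1.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=50)

def ridge(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

for lam in (1e-4, 1.0, 1e4):
    w = ridge(X, y, lam)
    print(lam, np.linalg.norm(w))  # weight norm shrinks as lambda grows
```

With `lam = 1e-4` the weights are essentially unconstrained (close to the ordinary least-squares fit); with `lam = 1e4` they are crushed toward zero, leaving a model that can barely use its features, i.e. high bias.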
Reduce Bias and Variance
Options | Concern Addressed |
---|---|
Get more training examples | High Variance |
Try a smaller set of features | High Variance |
Get additional features | High Bias |
Add polynomial features (higher degree terms) | High Bias |
Decrease Lambda (Regularization factor) | High Bias |
Increase Lambda (Regularization factor) | High Variance |
Reduce Bias / Variance in Neural Network
- Note: A larger network generally performs at least as well as a smaller one, as long as the regularization factor is chosen correctly; the trade-off is higher computational cost