# Model Evaluation and Selection Steps

## Concept of Bias and Variance

| Observation | Training Data Error | Validation Data Error | Bias / Variance |
| --- | --- | --- | --- |
| Model fits the data very loosely (underfitting) | High | Very High | High Bias |
| Model fits the data extremely tightly (overfitting) | Low | High | High Variance |
| Model is a close fit to the data | Low | Low | Right Fit |

## Impact of Degree of Polynomial on Bias and Variance

- Several factors must be taken into account to choose the right model and the right parameters (a minimal sketch follows this list). This is done by:
  - Splitting the data into training, validation, and test sets
  - Evaluating how the model performs as a regression model vs. a classification model
  - Adding polynomial features (higher-degree terms) to Linear Regression models and evaluating performance
  - Comparing different Neural Network architectures
- Also note that bias and variance should be judged against a baseline level of performance. The baseline can be the error rate when humans perform the task, or the accepted performance of a previously established model.
  - A significant gap between baseline error and training error indicates High Bias
  - A significant gap between training error and validation error indicates High Variance
  - Significant gaps both between baseline and training errors and between training and validation errors indicate High Bias and High Variance
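
As a rough illustration of these steps, here is a minimal sketch that splits a synthetic dataset into training, validation, and test sets and compares the errors against a baseline. The dataset, the `baseline_error` value, and the use of scikit-learn are assumptions for illustration, not taken from this wiki.

```python
# Sketch: split data 60/20/20 into train / validation / test, fit a simple model,
# and compare its errors against a baseline (e.g. human-level performance).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))                       # synthetic feature
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)    # synthetic target

# 60% train, 20% validation, 20% test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = LinearRegression().fit(X_train, y_train)
train_err = mean_squared_error(y_train, model.predict(X_train))
val_err = mean_squared_error(y_val, model.predict(X_val))

baseline_error = 0.04  # assumed baseline (e.g. human-level error), illustrative only
print(f"baseline={baseline_error:.3f}  train={train_err:.3f}  validation={val_err:.3f}")
# Large gap between baseline and training error   -> High Bias
# Large gap between training and validation error -> High Variance
```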

## Detect Bias or Variance

| Model Trait | Training Data Error | Validation Data Error | Degree of Polynomial |
| --- | --- | --- | --- |
| Bias (Underfit) | High | High | Low |
| Variance (Overfit) | Very Low | High | High |
| Right Fit | Low | Low | Middle |
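
A minimal sketch of detecting this by sweeping the polynomial degree and comparing training vs. validation error; it reuses the assumed `X_train`/`X_val`/`y_train`/`y_val` split from the sketch above.

```python
# Sketch: fit polynomial models of increasing degree and pick the one with the
# lowest validation error (low degree -> high bias, very high degree -> high variance).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

errors = {}
for degree in range(1, 11):
    poly_model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    poly_model.fit(X_train, y_train)
    errors[degree] = (mean_squared_error(y_train, poly_model.predict(X_train)),
                      mean_squared_error(y_val, poly_model.predict(X_val)))

best_degree = min(errors, key=lambda d: errors[d][1])  # lowest validation error
print("chosen degree:", best_degree)
```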

## Impact of Regularization on Bias and Variance

- Regularization adds a numerical factor to the loss function that increases or reduces the effect of the parameters (w).
- If Lambda is chosen to be a very high number (say 10,000), the effect of all the w parameters shrinks toward zero and the model collapses to roughly the constant b. This causes high bias (underfitting).
- Conversely, if Lambda is chosen to be a very low number (say 0.0001), the w parameters are barely penalized and dominate the model. This causes high variance (overfitting).
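
For reference, the regularized cost function this describes (written here for linear regression, in its standard form) is:

```math
J(\vec{w}, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} w_j^2
```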

| Model Trait | Training Data Error | Validation Data Error | Lambda |
| --- | --- | --- | --- |
| Bias (Underfit) | High | High | Very High |
| Variance (Overfit) | Very Low | High | Very Low |
| Right Fit | Low | Low | Middle |

*Figure: bias-variance-reg v1*
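
A minimal sketch of sweeping Lambda and observing the errors, using scikit-learn's `Ridge`, whose `alpha` parameter plays the role of Lambda; the polynomial degree, the candidate Lambda values, and the reuse of the earlier assumed split are all illustrative.

```python
# Sketch: try several lambda values on a high-degree polynomial model and pick
# the one with the lowest validation error (very small lambda -> high variance,
# very large lambda -> high bias).
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

results = {}
for lam in [0.0001, 0.01, 1, 100, 10000]:
    reg_model = make_pipeline(PolynomialFeatures(degree=10), StandardScaler(), Ridge(alpha=lam))
    reg_model.fit(X_train, y_train)
    results[lam] = (mean_squared_error(y_train, reg_model.predict(X_train)),
                    mean_squared_error(y_val, reg_model.predict(X_val)))

best_lambda = min(results, key=lambda lam: results[lam][1])  # lowest validation error
print("chosen lambda:", best_lambda)
```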

## Reduce Bias and Variance

| Option | Concern Addressed |
| --- | --- |
| Get more training examples | High Variance |
| Try a smaller set of features | High Variance |
| Get additional features | High Bias |
| Add polynomial features (higher degree) | High Bias |
| Decrease Lambda (regularization factor) | High Bias |
| Increase Lambda (regularization factor) | High Variance |

## Reduce Bias / Variance in a Neural Network

*Figure: bias-variance-reg-Page-2*

- Note: A larger network generally performs as well as or better than a smaller one, provided the regularization factor is chosen correctly; the trade-off is higher computational cost. A minimal sketch of a regularized network follows below.
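
A minimal sketch of applying L2 regularization to each hidden layer of a larger network, assuming TensorFlow/Keras as the framework; the layer sizes and the Lambda value are illustrative only, not taken from this wiki.

```python
# Sketch: a larger network with L2 regularization applied to each hidden layer.
# Framework (TensorFlow/Keras), layer sizes, and lambda are assumptions for illustration.
import tensorflow as tf

lam = 0.01  # regularization factor (lambda)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(120, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.L2(lam)),
    tf.keras.layers.Dense(40, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.L2(lam)),
    tf.keras.layers.Dense(1, activation="linear"),
])
model.compile(loss=tf.keras.losses.MeanSquaredError(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
# model.fit(X_train, y_train, epochs=100)  # then compare training vs. validation error
```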