Concepts of Fitness and Regularization
| Underfit | Overfit |
|---|---|
| Predictions miss many of the training targets | Predictions catch almost all of the training targets |
| High Bias | High Variance |
Options to address Overfitting
- Collect more data; more examples help the model generalize
- Use fewer features, guided by intuition about which features are informative
- Regularization - shrink the values of selected parameters to reduce the effect of some features, instead of removing those features entirely as in the previous option
Regularization
- The general philosophy of regularization is to reduce the impact of all the parameters just enough to smooth out overfitting
- This is achieved by adding a penalty term to the cost function (shown below) that is just large enough to smooth out the overfitting
- It is the preferred option to address overfitting
- Reducing the values of some parameters (w[j]) reduces the effect of the corresponding features
- Usual practice is to not regularize b
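The penalty added to the cost is typically the standard L2 term (written here for reference; λ is the regularization strength):

$$\frac{\lambda}{2m}\sum_{j=1}^{n} w_j^2$$

Larger λ shrinks the parameters $w_j$ more aggressively; note that b is left out of the sum.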
Cost Function with Regularization - Linear Regression / Logistic Regression
- The regularized cost function has the same form for both; the only difference is that f is the sigmoid function in Logistic Regression
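Written out in the standard form, the regularized cost for Linear Regression is:

$$J(\vec{w},b) = \frac{1}{2m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m}\sum_{j=1}^{n} w_j^2$$

For Logistic Regression the squared-error term is replaced by the logistic (cross-entropy) loss and $f_{\vec{w},b}$ is the sigmoid of $\vec{w}\cdot\vec{x}+b$; the penalty term is unchanged.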
Gradient Descent with Regularization - Linear Regression / Logistic Regression
- Repeat until convergence, updating each parameter with its regularized gradient (update rules below)
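The per-iteration updates (standard form, with learning rate α) are:

$$w_j := w_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}w_j\right]$$

$$b := b - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)$$

Note that b carries no regularization term, matching the practice noted above.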
Partial derivatives required for Gradient Descent
- Gradient calculations (partial derivatives) are almost identical for Linear and Logistic Regression; the only difference is how f(x) is computed (see the sketch below)
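A minimal NumPy sketch of this shared gradient computation (function and variable names are illustrative, not from the course code):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def compute_regularized_gradients(X, y, w, b, lambda_, logistic=False):
    """Gradients of the regularized cost for linear or logistic regression."""
    m = X.shape[0]
    z = X @ w + b
    f = sigmoid(z) if logistic else z            # only difference: how f(x) is computed
    err = f - y                                  # prediction error per example
    dj_dw = (X.T @ err) / m + (lambda_ / m) * w  # L2 penalty adds (lambda/m) * w_j
    dj_db = np.sum(err) / m                      # b is not regularized
    return dj_dw, dj_db
```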
Regularization in Neural Network
Regularized Model:

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import L2

# L1 here names the first layer; L2(0.01) is the L2 weight penalty
L1 = Dense(units=20, activation='relu', kernel_regularizer=L2(0.01))
```
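A fuller sketch showing the regularizer applied across a small network (layer sizes and λ = 0.01 are illustrative assumptions, not values from the course):

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import L2

# Each hidden layer carries its own L2 weight penalty
model = Sequential([
    Dense(units=25, activation='relu', kernel_regularizer=L2(0.01)),
    Dense(units=15, activation='relu', kernel_regularizer=L2(0.01)),
    Dense(units=1, activation='sigmoid'),  # output layer left unregularized
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer='adam')
```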