Concepts of Fitness and Regularization
| Underfit | Overfit |
|---|---|
| Predictions miss many of the training targets | Predictions catch almost all of the training targets |
| High Bias | High Variance |
Options to address Overfitting
- Collect more data; more examples help the model generalize
- Use fewer features, guided by intuition about which features are informative
- Regularization - shrink the values of selected parameters to reduce the effect of some features, instead of removing those features entirely as in the previous option
Regularization
- The general philosophy of regularization is to reduce the impact of all the parameters just enough to smooth out overfitting
- This is achieved by adding a penalty term to the cost function (shown below) that is just large enough to smooth out the overfitting
- It is the preferred option to address overfitting
- Reducing the values of some parameters (w[j]) reduces the effect of the corresponding features
- Usual practice is to not regularize b
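The penalty added to the cost is typically the standard L2 term (written here for reference; λ is the regularization strength):

$$\frac{\lambda}{2m}\sum_{j=1}^{n} w_j^2$$

Larger λ shrinks the parameters $w_j$ more aggressively; note that b is left out of the sum.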
Cost Function with Regularization - Linear Regression / Logistic Regression
- The regularized cost function has the same form for both; the only difference is that f is the sigmoid function in Logistic Regression
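Written out in the standard form, the regularized cost for Linear Regression is:

$$J(\vec{w},b) = \frac{1}{2m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m}\sum_{j=1}^{n} w_j^2$$

For Logistic Regression the squared-error term is replaced by the logistic (cross-entropy) loss and $f_{\vec{w},b}$ is the sigmoid of $\vec{w}\cdot\vec{x}+b$; the penalty term is unchanged.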
Gradient Descent with Regularization - Linear Regression / Logistic Regression
- Repeat until convergence, updating each parameter with its regularized gradient (update rules below)
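The per-iteration updates (standard form, with learning rate α) are:

$$w_j := w_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}w_j\right]$$

$$b := b - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)}\right)$$

Note that b carries no regularization term, matching the practice noted above.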
Partial derivatives required for Gradient Descent
- Gradient calculations (partial derivatives) are almost identical for Linear and Logistic Regression; the only difference is how f(x) is computed (see the sketch below)
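A minimal NumPy sketch of this shared gradient computation (function and variable names are illustrative, not from the course code):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def compute_regularized_gradients(X, y, w, b, lambda_, logistic=False):
    """Gradients of the regularized cost for linear or logistic regression."""
    m = X.shape[0]
    z = X @ w + b
    f = sigmoid(z) if logistic else z            # only difference: how f(x) is computed
    err = f - y                                  # prediction error per example
    dj_dw = (X.T @ err) / m + (lambda_ / m) * w  # L2 penalty adds (lambda/m) * w_j
    dj_db = np.sum(err) / m                      # b is not regularized
    return dj_dw, dj_db
```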
Regularization in Neural Network
Regularized Model:

```python
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import L2

# L1 here names the first layer; L2(0.01) is the L2 weight penalty
L1 = Dense(units=20, activation='relu', kernel_regularizer=L2(0.01))
```
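A fuller sketch showing the regularizer applied across a small network (layer sizes and λ = 0.01 are illustrative assumptions, not values from the course):

```python
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import L2

# Each hidden layer carries its own L2 weight penalty
model = Sequential([
    Dense(units=25, activation='relu', kernel_regularizer=L2(0.01)),
    Dense(units=15, activation='relu', kernel_regularizer=L2(0.01)),
    Dense(units=1, activation='sigmoid'),  # output layer left unregularized
])
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), optimizer='adam')
```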