Overfit vs. Underfit - SoojungHong/MachineLearning GitHub Wiki
1
Question : How can you tell that your model is overfitting or underfitting the data?
Answer :
You can use cross-validation to estimate a model’s generalization performance. If the model performs well on the training data but generalizes poorly according to the cross-validation metrics, it is overfitting. If it performs poorly on both, it is underfitting. This is one way to tell whether a model is too complex or too simple.
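As a rough illustration of this check, the sketch below compares the training error with a 5-fold cross-validated error for three models of different complexity. Everything in it is an assumption made for the example and not part of the original answer: the toy data (`sin(x)` plus noise), the k-nearest-neighbour model, and the helper names `knn_predict`, `mse` and `cross_val_mse`. It uses only the Python standard library.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mse(train, test, k):
    """Mean squared error on `test` of a k-NN model that stores `train`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in test) / len(test)

def cross_val_mse(data, k, n_folds=5):
    """Average validation MSE over n_folds train/validation splits."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, val in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(mse(train, val, k))
    return sum(scores) / n_folds

# Toy data (an assumption for illustration): y = sin(x) + Gaussian noise.
random.seed(0)
xs = [random.uniform(0, 6) for _ in range(200)]
data = [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]

# k = 1 memorises the data; k = len(data) always predicts the mean target
# of whatever points it was given, so it is the simplest possible model.
results = {}
for k in (1, 10, len(data)):
    results[k] = (mse(data, data, k), cross_val_mse(data, k))
    print(f"k={k:3d}  train MSE={results[k][0]:.3f}  CV MSE={results[k][1]:.3f}")
```

With `k=1` the training error is zero because every point is its own nearest neighbour, while the cross-validated error is not: the overfitting pattern. With `k=len(data)` the error should be high on both, which is the underfitting pattern.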
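Another way is to look at the learning curves: plots of the model’s performance on the training set and on the validation set as a function of the training set size (or the training iteration). If both curves plateau at a high error and sit close together, the model is underfitting. If the training error is much lower than the validation error and a gap remains between the curves, the model is overfitting.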
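The numbers behind a learning curve can be computed by hand, as in the minimal sketch below, which prints them as a table instead of plotting. The toy data, the two models (a straight line and a 1-nearest-neighbour model) and the helper `learning_curve` are assumptions for illustration. The helper is hand-written here and is not a library function.

```python
import math
import random

def fit_line(points):
    """Least-squares straight line y = a*x + b; returns a predict function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_1nn(points):
    """1-nearest-neighbour: predicts the target of the closest stored point."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(predict, points):
    """Mean squared error of a predict function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)

def learning_curve(fit, train, val, sizes):
    """(size, train MSE, validation MSE) for models fitted on the first `size` points."""
    rows = []
    for size in sizes:
        subset = train[:size]
        predict = fit(subset)
        rows.append((size, mse(predict, subset), mse(predict, val)))
    return rows

def sample(n):
    """Toy data (an assumption): y = sin(x) + Gaussian noise with std 0.3."""
    xs = [random.uniform(0, 6) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]

random.seed(0)
train, val = sample(200), sample(200)
sizes = [10, 25, 50, 100, 200]
line_rows = learning_curve(fit_line, train, val, sizes)
nn_rows = learning_curve(fit_1nn, train, val, sizes)

for name, rows in (("line", line_rows), ("1-NN", nn_rows)):
    for size, train_mse, val_mse in rows:
        print(f"{name:5s} n={size:3d}  train={train_mse:.3f}  val={val_mse:.3f}")
```

The straight line cannot follow a sine wave, so its two errors should level off well above the noise level (underfitting). The 1-NN model keeps a training error of zero at every size while its validation error stays above it (overfitting).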
2
Question : How can you fix underfitting and overfitting?
Answer :
In the case of underfitting, adding more training examples will not help. You need to use a more complex model or come up with better features.
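A small sketch of why better features help where more data does not, under assumed toy data (`y = x^2` plus noise): a straight line fitted on the raw input `x` stays poor whether it sees 50 or 5000 examples, while the same line fitted on the engineered feature `x^2` fits well. The helper names are hypothetical.

```python
import random

def fit_line(points):
    """Least-squares straight line y = a*x + b; returns a predict function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def mse(predict, points):
    """Mean squared error of a predict function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)

def sample(n):
    """Toy data (an assumption): y = x^2 + Gaussian noise with std 0.5."""
    xs = [random.uniform(-3, 3) for _ in range(n)]
    return [(x, x * x + random.gauss(0, 0.5)) for x in xs]

def with_square_feature(points):
    """Replace the raw input x with the engineered feature x^2."""
    return [(x * x, y) for x, y in points]

random.seed(0)
val = sample(500)
errors = {}
for n in (50, 5000):
    train = sample(n)
    raw = mse(fit_line(train), val)
    engineered = mse(fit_line(with_square_feature(train)),
                     with_square_feature(val))
    errors[n] = (raw, engineered)
    print(f"n={n:4d}  raw x: val MSE={raw:.2f}   x^2 feature: val MSE={engineered:.2f}")
```

The model class is the same in both columns. Only the feature changes, which is the point of the advice above.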
In the case of overfitting, one way to improve the model is to feed it more training data until the validation error approaches the training error.
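The effect of more training data can be sketched with a deliberately flexible model. The example below assumes toy data (`sin(x)` plus noise) and a hypothetical piecewise-constant regressor with 30 bins, and prints the gap between validation and training error as the training set grows.

```python
import math
import random

N_BINS = 30          # hypothetical model size: one free parameter per bin
LO, HI = 0.0, 6.0    # input range of the toy data

def bin_index(x):
    """Index of the bin that x falls in (the right edge goes to the last bin)."""
    return min(int((x - LO) / (HI - LO) * N_BINS), N_BINS - 1)

def fit_bins(points):
    """Piecewise-constant regression: predict the mean target of x's bin."""
    sums, counts = [0.0] * N_BINS, [0] * N_BINS
    for x, y in points:
        i = bin_index(x)
        sums[i] += y
        counts[i] += 1
    overall = sum(y for _, y in points) / len(points)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[bin_index(x)]

def mse(predict, points):
    """Mean squared error of a predict function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)

def sample(n):
    """Toy data (an assumption): y = sin(x) + Gaussian noise with std 0.3."""
    xs = [random.uniform(LO, HI) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]

random.seed(0)
val = sample(1000)
pool = sample(4000)
rows = []
for size in (40, 200, 1000, 4000):
    predict = fit_bins(pool[:size])
    train_mse, val_mse = mse(predict, pool[:size]), mse(predict, val)
    rows.append((size, train_mse, val_mse))
    print(f"n={size:4d}  train={train_mse:.3f}  val={val_mse:.3f}  "
          f"gap={val_mse - train_mse:.3f}")
```

With only 40 examples most bins hold one point or none, so the model fits noise and the gap is wide. As the training set grows, each bin mean is estimated from many points and the two errors should move toward each other.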