# Test and validation datasets
- A validation dataset is a sample of data held back from model training and used to estimate model skill while tuning the model's hyperparameters. It is used during the training phase to detect overfitting: if accuracy on the training set keeps increasing while accuracy on the validation set stays the same or decreases, the model is overfitting and training should stop (see the early-stopping sketch after this list).
- A test dataset is also held back from training, but is instead used to give an unbiased estimate of the skill of the final, tuned model when comparing or selecting between final models. It is used only once, on the final solution, to confirm the actual predictive power of the model.
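The early-stopping rule mentioned above can be made concrete. The following is a minimal sketch in plain Python: the patience value and the per-epoch accuracy numbers are illustrative assumptions, not results from any real model.

```python
# Patience-based early stopping driven by validation accuracy (illustrative values only).

def should_stop(val_history, patience=3):
    """Stop when validation accuracy has not improved for `patience` epochs."""
    if len(val_history) <= patience:
        return False
    best_so_far = max(val_history[:-patience])
    recent = val_history[-patience:]
    return max(recent) <= best_so_far

val_accuracy = []
for epoch, acc in enumerate([0.71, 0.78, 0.82, 0.83, 0.83, 0.82, 0.81]):
    val_accuracy.append(acc)  # accuracy measured on the held-back validation set
    if should_stop(val_accuracy):
        print(f"stopping after epoch {epoch}: validation accuracy stopped improving")
        break
```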
The thing is, in practice the parameters of the training method aren't the only thing you need to specify for a learning problem. You also have hyperparameters. Some hyperparameters are an explicit part of model fitting (like the learning rate), but other choices can be viewed as hyperparameters too: do you choose an SVM or a neural network? If you implement early stopping, at what point do you stop? Just as the parameters can overfit the training set, the hyperparameters can overfit the validation set.
Hence the common practice of using an independent test set, separate from both the training (modeling) set and the validation (picking a model, features, hyperparameters, etc.) set.
```
# split data
data = ...
train, validation, test = split(data)

# tune model hyperparameters
parameters = ...
best_skill, best_params = None, None
for params in parameters:
    model = fit(train, params)
    skill = evaluate(model, validation)
    if best_skill is None or skill > best_skill:
        best_skill, best_params = skill, params

# fit the final model with the chosen hyperparameters and
# evaluate it on the test set for comparison with other models
model = fit(train, best_params)
skill = evaluate(model, test)
```
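As a more concrete illustration of the same workflow, here is a hedged scikit-learn sketch. The digits dataset, the SVC model, and the small parameter grid are assumptions chosen only to make the example self-contained and runnable, not part of the original recipe.

```python
# Train/validation/test workflow, assuming scikit-learn is available.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# split data: roughly 60% train, 20% validation, 20% test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# tune model hyperparameters using the validation set
best_score, best_params = -1.0, None
for params in [{"C": 0.1}, {"C": 1.0}, {"C": 10.0}]:
    model = SVC(**params).fit(X_train, y_train)
    score = model.score(X_val, y_val)  # validation skill for this hyperparameter setting
    if score > best_score:
        best_score, best_params = score, params

# fit the final model with the chosen hyperparameters and
# evaluate it once on the held-back test set
final_model = SVC(**best_params).fit(X_train, y_train)
print("test accuracy:", final_model.score(X_test, y_test))
```

Note that the test score is computed exactly once, after all model and hyperparameter choices have been made, so it remains an unbiased estimate of the final model's skill.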