Glossary

Romain F. Laine edited this page Mar 20, 2020 · 9 revisions
Term Our definition of it
Augmentation A common step in data preprocessing for DL applications. It describes the process of increasing the amount of available training data by transforming existing examples, which can improve the performance of the network and help prevent it from overfitting. Augmentation can take different forms, such as rotating, shifting or cropping images from the training dataset.
Batches Training datasets are often too large to be processed in one piece. Hence, datasets are usually split into batches of data and the network 'sees' only one batch per training step.
Epoch An epoch is one round of training during which the network should see the entire training dataset once. Training usually runs for several epochs in sequence, each made up of steps in which a batch of the training dataset is used to improve the network performance. Typically, at the end of each epoch the loss function is calculated on the validation dataset.
Inference This is the step in which we use a trained network to perform a prediction on an unseen test dataset.
Input An image given to the network for training or inference. Also referred to as source or signal.
Loss The value returned by the loss function for a given network output and its ground-truth target.
Loss function The loss function is a mathematical function which gives a quantitative estimate of the error between network output and its ground-truth target. This quantitative comparison is essential for the network to propagate the observed errors back through the network and improve its performance.
Model A specific neural network. The term is often used interchangeably with 'neural network'.
Neural network An algorithmic architecture whose structure was inspired by the network of connected nodes and communication channels in the brain.
Notebook A notebook refers to a Jupyter notebook. Notebooks allow Python code to be run in the online environment provided by Google Colab.
Output The image the network creates given an input image. For testing, we also call this a 'prediction'.
Overfitting This occurs when a network fits the training set so closely that it learns that specific dataset rather than the underlying transformation. It can happen when the network is trained for too many epochs on insufficient data. When this happens, inference performance becomes unreliable, as the network cannot generalize what it learned from the training dataset to new data. Avoiding overfitting is critical to achieving a high-fidelity DL network. Overfitting can be difficult to detect, but a good way to check for it is to compare a network's validation performance with its training performance: when the validation loss increases and diverges from the training loss, the network is likely overfitting to the given dataset.
Patches Some networks (such as CARE) split input images into individual patches, i.e. defined subregions of the image, which can be useful to increase the effective size of the training dataset when the number of raw data files is small.
Steps In one step a neural network will usually update its internal parameters once based on a specific input, usually a batch of input-target pairs. Several steps make up an epoch.
Test Dataset A dataset containing images which are not in the training dataset. This set is not to be confused with the validation dataset, which is automatically created during training. The test dataset can be supplemented with ground-truth images to inspect the network's performance after training.
Training The training of a network is the stage at which the network is presented with a so-called training dataset, from which it learns to perform the task at hand. During training, the network adjusts its own internal parameters to improve its performance on the given task.
Training dataset The training dataset is a dataset that allows the network to learn the task (the transformation) that is expected of it. In supervised learning, the training dataset is composed of paired sets of images from the exact same field of view but acquired in both modalities representing the source and target of the transformation. For instance, in a denoising task, the source is a noisy image and the target is the equivalent low-noise image. In unsupervised training, the training dataset can simply be made of examples of data that will be fed to the network when performing the task. Often, the training dataset determines the type of task that the network will perform.
Validation dataset This dataset is a small subset of the training dataset (typically 10-15% of the training dataset) which is held out and used to evaluate the performance of the network after each epoch. Because the network never "sees" the validation set when its parameters are updated, it is useful for assessing how well the network generalises to unseen data.
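
The relationship between several of the terms above (validation dataset, batches, steps and epochs) can be illustrated with a little arithmetic. The following is a minimal sketch in plain Python, not code taken from any of the notebooks: the helper names `split_train_validation` and `make_batches` are hypothetical, and the numbers (100 images, a 10% validation fraction, a batch size of 8) are assumptions chosen only for the example.

```python
import math
import random


def split_train_validation(items, validation_fraction=0.1, seed=0):
    """Hold out a fraction of the items as a validation set (hypothetical helper)."""
    shuffled = list(items)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * validation_fraction))
    # First n_val shuffled items become validation data, the rest training data.
    return shuffled[n_val:], shuffled[:n_val]


def make_batches(items, batch_size):
    """Split the items into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# Stand-ins for 100 training images (here just integer IDs, an assumption).
dataset = list(range(100))
train, validation = split_train_validation(dataset, validation_fraction=0.1)

batch_size = 8
batches = make_batches(train, batch_size)

# One step processes one batch, so one epoch needs this many steps
# for the network to see every remaining training image once.
steps_per_epoch = math.ceil(len(train) / batch_size)

print(len(train), len(validation))     # 90 10
print(steps_per_epoch, len(batches))   # 12 12
```

With these assumed numbers, 10 images are held out for validation, the remaining 90 are split into 12 batches (the last containing only 2 images), and one epoch therefore consists of 12 steps. Shuffling before splitting is a design choice made here so that the validation set is not biased by the order of the files.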