# Uncertainty Encoding
First, we take a look at how the labels in the dataset are designed:
- positive label (1): observations with at least one positively classified mention in the report
- uncertain label (u): no positively classified mention and at least one uncertain mention
- negative label (0): observations with at least one negatively classified mention in the report
- NaN: no mention of the observation in the report
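For illustration, a single study in a CheXpert-style label file might look like the sketch below. The column values are hypothetical; in the raw CSV, uncertain labels are stored as -1 and "no mention" as an empty cell, which pandas reads as NaN:

```python
import numpy as np
import pandas as pd

# One study's labels (values hypothetical).
row = pd.Series({
    "Cardiomegaly": 1.0,     # positive: at least one positive mention
    "Edema": 0.0,            # negative: at least one negative mention
    "Pneumonia": -1.0,       # uncertain: only uncertain mentions
    "Pneumothorax": np.nan,  # NaN: observation not mentioned at all
})
print(row)
```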
There are three common strategies for handling the uncertain labels (a sketch follows the list):

- U-Ignore: ignore all uncertain labels during training; this, however, reduces the effective size of the dataset
- U-Zeros: map all u-labels to 0
- U-Ones: map all u-labels to 1
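A minimal sketch of these three policies, assuming the labels are a pandas DataFrame in which uncertain labels are stored as -1 (the function name is ours, not from the repository):

```python
import numpy as np
import pandas as pd

def encode_uncertainty(labels: pd.DataFrame, policy: str) -> pd.DataFrame:
    """Apply one uncertainty policy to a label table.

    "ignore" turns uncertain labels into NaN so they can be masked out later,
    "zeros" maps them to 0, and "ones" maps them to 1.
    """
    fill = {"ignore": np.nan, "zeros": 0.0, "ones": 1.0}[policy]
    return labels.replace(-1.0, fill)

# e.g. train_labels = encode_uncertainty(train_labels, "ones")
```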
- instead of using one binary mapping for all classes, a class-based approach can be used
- it has been stated in the literature that different classes work better with different uncertainty encodings
- all papers we could find focus only on the five best-performing classes, so we have no predetermined information on which encodings work best for all 12 classes
- [the only paper](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8719904) that reports results for all 12 classes uses the U-Ignore approach, which we ruled out because it discards too much data
We run a comparison experiment and determine the best-performing uncertainty encodings ourselves:

Pathology | U-Zeros | U-Ones |
---|---|---|
Enlarged Cardiomediastinum | 0.529 | 0.520 |
Cardiomegaly | 0.782 | 0.762 |
Lung Opacity | 0.905 | 0.858 |
Lung Lesion | 0.761 | 0.824 |
Edema | 0.901 | 0.891 |
Consolidation | 0.935 | 0.901 |
Pneumonia | 0.742 | 0.765 |
Atelectasis | 0.807 | 0.768 |
Pneumothorax | 0.692 | 0.795 |
Pleural Effusion | 0.924 | 0.927 |
Pleural Other | 0.984 | 0.877 |
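Based on these results, a class-based encoding simply applies the winning policy per column. The sketch below hardcodes the winners from the table above (again assuming uncertain labels are stored as -1; the function name is illustrative):

```python
import pandas as pd

# Classes where U-Ones scored higher in the table above;
# every other class uses U-Zeros.
U_ONES_CLASSES = {"Lung Lesion", "Pneumonia", "Pneumothorax", "Pleural Effusion"}

def encode_per_class(labels: pd.DataFrame) -> pd.DataFrame:
    encoded = labels.copy()
    for col in encoded.columns:
        fill = 1.0 if col in U_ONES_CLASSES else 0.0
        encoded[col] = encoded[col].replace(-1.0, fill)
    return encoded
```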
A problem we encountered with the dataset was the NaN values in the training set. A NaN label means that the NLP labeler could not find any mention of the observation in the report. We decide to encode these as 0, as the disease would have been explicitly mentioned in the report if it were present. With this encoding we calculate the binary cross-entropy (BCE) for all classes and average it.
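In code this amounts to filling the NaNs with 0 before computing the usual averaged BCE. A sketch with toy data, assuming NumPy labels and a Keras loss (the actual training pipeline may look different):

```python
import numpy as np
import tensorflow as tf

# Toy batch: 2 studies, 3 classes; NaN means "no mention".
y_true = np.array([[1.0, np.nan, 0.0],
                   [np.nan, 1.0, np.nan]])
y_pred = np.array([[0.8, 0.1, 0.2],
                   [0.3, 0.9, 0.1]])

y_true = np.nan_to_num(y_true, nan=0.0)  # encode "no mention" as negative

# Standard BCE, averaged over all classes and samples.
loss = tf.keras.losses.BinaryCrossentropy()(y_true, y_pred)
print(float(loss))
```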
We have also implemented and tested a masked loss function, where the BCE is calculated only for the classes with non-NaN labels, but we did not find any improvement from using this custom loss function.
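For reference, a sketch of such a masked BCE, assuming TensorFlow; since the mask is derived from the NaN positions, it must be applied to the raw labels, before any NaN-filling step:

```python
import tensorflow as tf

def masked_bce(y_true, y_pred, eps=1e-7):
    """BCE averaged only over the entries whose label is not NaN."""
    mask = tf.cast(tf.logical_not(tf.math.is_nan(y_true)), y_pred.dtype)
    # Replace NaNs with 0 so the element-wise BCE is well defined;
    # these entries are zeroed out by the mask anyway.
    y_true = tf.where(tf.math.is_nan(y_true), tf.zeros_like(y_true), y_true)
    y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
    elementwise = -(y_true * tf.math.log(y_pred)
                    + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    # Average over the labelled entries only.
    return tf.reduce_sum(elementwise * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```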