Future Work - TobiasSchmidtDE/DeepL-MedicalImaging GitHub Wiki

While working on the project, we had some ideas on possible improvements and new approaches:

Leverage the full patient X-ray history: there are usually multiple X-rays for the same patient, so we could treat them as a sequence and gain information from their temporal order
Introduce meta information like sex or age as additional features
Handle noisy labels as proposed in "Epoch-Wise Label Attacks for Robustness Against Label Noise"
Image Size Ensembling: while experimenting with the image size, we saw that different image sizes work better or worse for different pathologies. Our intuition from the radiology lecture and from learning how to diagnose is that different pathologies are diagnosed at different scales. While tumors can easily be spotted even at low resolution, diagnosing anomalies in the pulmonary vessels requires a higher resolution. Additionally, for rather 'obvious' pathologies, a higher resolution could even introduce noise, so that model performance decreases when training on high-resolution images. We propose splitting the dataset into two subsets, depending on which resolution works better, and training two separate models for their predictions.
More Ensembling in general: simply putting more time and effort into ensembling could improve the classification scores
Use different/multiple datasets: probably the quickest win could be gained from training on a combination of the MIMIC and CheXpert datasets. Both were labeled using the same NLP auto-labeler and should therefore work together nicely.
Try out label smoothing for uncertainty labels
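
As a rough illustration of the label smoothing idea, the sketch below replaces every uncertain label with a random soft target instead of a hard 0 or 1. It is only a minimal sketch, not part of our pipeline: the function name `smooth_uncertain_labels`, the encoding of uncertain labels as `-1`, and the default range `[0.55, 0.85]` are assumptions chosen for illustration and would need to be adapted and tuned.

```python
import numpy as np


def smooth_uncertain_labels(labels, low=0.55, high=0.85,
                            uncertain_value=-1.0, rng=None):
    """Replace uncertain labels with soft targets drawn from U(low, high).

    Assumption: labels are encoded as 1 (positive), 0 (negative) and
    `uncertain_value` (uncertain). All other entries are left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels, dtype=np.float32)
    smoothed = labels.copy()

    # Boolean mask of all entries marked as uncertain
    mask = labels == uncertain_value

    # Draw one soft target per uncertain entry
    smoothed[mask] = rng.uniform(low, high, size=int(mask.sum()))
    return smoothed


# Hypothetical batch: 2 studies x 3 pathologies
example_labels = np.array([[1.0, 0.0, -1.0],
                           [-1.0, -1.0, 0.0]])
example_smoothed = smooth_uncertain_labels(
    example_labels, rng=np.random.default_rng(0)
)
```

The smoothed array can be used as the target for a binary cross-entropy loss. A range close to 1 treats uncertain findings as "probably positive", while a range close to 0 would treat them as "probably negative"; which choice works better per pathology would have to be determined experimentally.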