Preprocessing - TobiasSchmidtDE/DeepL-MedicalImaging GitHub Wiki
Resizing
Resizing is done with the default configuration of PIL images. The default downscale strategy is NEAREST:
"PIL.Image.NEAREST: Pick one nearest pixel from the input image. Ignore all other input pixels."
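A minimal sketch of such a resize with Pillow is shown here. The helper name `resize_nearest` and the toy image are hypothetical and not taken from the project code. Pillow releases from 7.0 onward default to bicubic resampling for most image modes, so the sketch passes the filter explicitly instead of relying on the library default.

```python
from PIL import Image


def resize_nearest(img, size):
    """Resize a PIL image to (width, height) with nearest-neighbour sampling."""
    # Passing the filter explicitly keeps the behaviour independent of the
    # default of the installed Pillow version.
    return img.resize(size, resample=Image.NEAREST)


# Hypothetical example: a 4x4 grayscale image downscaled to 2x2.
img = Image.new("L", (4, 4), color=0)
img.putpixel((1, 1), 255)
small = resize_nearest(img, (2, 2))
```

Nearest-neighbour sampling never creates new intensity values, so every pixel of the output already exists in the input.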
Cropping
For cropping images we use [OpenCV's template matching algorithm](https://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html).
Before cropping, we rescale the image to 110% of the dimensions the network expects. This way, the templates we had to specify for the algorithm to detect frontal and lateral lungs best match the images to be cropped.
Augmentations
We tried out two different kinds of augmentations: Affine and Color augmentations.
Color augmentations randomly changed the brightness, contrast, saturation and hue of an image.
Affine augmentations randomly rotated, scaled and translated the image.
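As an illustration, both kinds of augmentation can be sketched with Pillow as follows. The function names, parameter ranges and the use of Pillow are assumptions made for this example and do not reflect the project's actual augmentation pipeline. Saturation and hue are left out because they have no effect on single-channel grayscale images.

```python
import math
import random

from PIL import Image, ImageEnhance


def color_augment(img, rng, max_delta=0.2):
    """Randomly change brightness and contrast by up to +/- max_delta."""
    brightness = 1.0 + rng.uniform(-max_delta, max_delta)
    contrast = 1.0 + rng.uniform(-max_delta, max_delta)
    img = ImageEnhance.Brightness(img).enhance(brightness)
    return ImageEnhance.Contrast(img).enhance(contrast)


def affine_augment(img, rng, max_angle=10.0, max_scale=0.1, max_shift=0.05):
    """Randomly rotate, scale and translate the image around its centre."""
    w, h = img.size
    angle = math.radians(rng.uniform(-max_angle, max_angle))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    tx = rng.uniform(-max_shift, max_shift) * w
    ty = rng.uniform(-max_shift, max_shift) * h
    cx, cy = w / 2.0, h / 2.0
    # Image.transform expects the inverse map (output pixel -> input pixel).
    a, b = math.cos(angle) / scale, math.sin(angle) / scale
    d, e = -b, a
    c = cx - a * (cx + tx) - b * (cy + ty)
    f = cy - d * (cx + tx) - e * (cy + ty)
    return img.transform((w, h), Image.AFFINE, (a, b, c, d, e, f),
                         resample=Image.BILINEAR)


# Hypothetical usage on a flat grayscale image with a seeded generator.
rng = random.Random(0)
augmented = affine_augment(color_augment(Image.new("L", (64, 48), 128), rng), rng)
```

Passing the random generator in explicitly makes the augmentations reproducible for a given seed.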
Transformations
Image normalization can improve model generalization and training speed. The X-rays we have are all grayscale images, so we are mainly concerned with the brightness and contrast of the images. We wanted to:
- reduce noise (signal spikes due to black background or white text overlay)
- increase visibility of details
For image normalization and transformations we implemented:
- Windowing / Intensity Rescaling
  - Windowing is the process of selecting a segment of the total pixel value range (the wide dynamic range of the receptors) and then displaying the pixel values within that segment over the full brightness range (shades of gray) from white to black. This allows us to increase the contrast in a predetermined value range.
- Gaussian Smoothing (Blur)
- Median Filter (Blur)
- Sharpening / Unsharp Masking / Blurred Mask Subtraction
  - Unsharp masking is a linear image processing technique that sharpens the image. The sharp details are identified as the difference between the original image and its blurred version. These details are then scaled and added back to the original image.
  - The blurring step could use any image filter, e.g. a median filter, but traditionally a Gaussian filter is used. The radius parameter of the unsharp masking filter refers to the sigma parameter of the Gaussian filter.
- Histogram Equalization
  - Histogram equalization is a contrast adjustment method that uses the image's histogram. It spreads the most frequent intensity values over the available range, which typically increases global contrast when the pixel values are concentrated in a narrow band. The mapping can merge neighboring gray levels, so it is not strictly lossless.
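Three of the transformations listed above (windowing, unsharp masking and histogram equalization) can be sketched in plain NumPy as follows. This is a simplified illustration under stated assumptions, not the project's implementation: all function names and default parameters are hypothetical, the blur is a separable Gaussian with reflected borders, and the equalization expects a non-negative integer image.

```python
import numpy as np


def window(img, low, high):
    """Map intensities in [low, high] to [0, 1]; values outside are clipped."""
    img = np.clip(img.astype(np.float64), low, high)
    return (img - low) / (high - low)


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflected borders."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img.astype(np.float64), radius, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def unsharp_mask(img, sigma=2.0, amount=1.0):
    """Add the scaled difference between the image and its blur back to the image."""
    img = img.astype(np.float64)
    detail = img - gaussian_blur(img, sigma)
    return img + amount * detail


def equalize_histogram(img, levels=256):
    """Histogram equalization for an integer image with values in [0, levels)."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = np.cumsum(hist).astype(np.float64)
    cdf_min = cdf[cdf > 0][0]
    if cdf[-1] == cdf_min:  # constant image: nothing to equalize
        return img.copy()
    # Lookup table that maps each gray level to its rescaled cumulative count.
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * (levels - 1))
    return np.clip(lut, 0, levels - 1).astype(img.dtype)[img]
```

Note that `unsharp_mask` can produce values outside the original intensity range near strong edges, so the result may need to be clipped before it is converted back to an 8-bit image.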
Uncertainty Encoding & NaN Values
How we handle uncertainty encoding and NaN values is described on a separate wiki page [here](Uncertainty Encoding).
Upsampling
To overcome the class imbalance we also tried to upsample the dataset. For this we simply took the 4 most underrepresented classes and duplicated the samples in which these classes were labeled. Note: This should only be done together with data augmentations, to make sure we don't just overfit on the duplicated samples.
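A minimal sketch of this duplication step is given here. It assumes that the labels are stored as a multi-hot matrix of shape (samples, classes) in which 1 marks a positive label, and that each affected sample is duplicated exactly once. The function name `upsample_rare_classes` and the label layout are assumptions for illustration.

```python
import numpy as np


def upsample_rare_classes(labels, num_classes=4):
    """Return sample indices in which samples of the rarest classes appear twice."""
    positives = (labels == 1).sum(axis=0)             # positive count per class
    rarest = np.argsort(positives)[:num_classes]      # the rarest classes
    has_rare = (labels[:, rarest] == 1).any(axis=1)   # samples labeled with any of them
    return np.concatenate([np.arange(len(labels)), np.flatnonzero(has_rare)])


# Hypothetical labels: 5 samples, 3 classes, upsampling the 2 rarest classes.
labels = np.array([[1, 0, 0], [1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]])
indices = upsample_rare_classes(labels, num_classes=2)
```

Returning indices rather than copied images keeps the step cheap, and the duplicates only differ once random augmentations are applied when the images are loaded.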