Preprocessing camera photos - lmmx/devnotes GitHub Wiki

I sometimes use my camera to take photos as input for data processing, and imagemagick is a convenient way to preprocess them to improve the image quality. Some examples are collected below.

Levels

To auto-level an image, use this (note `$wxh` would expand as a single variable in the shell; the image's width and height need to be defined first, e.g. via `identify`):

w=$(identify -format "%w" test.jpg); h=$(identify -format "%h" test.jpg)
convert test.jpg \( +clone -resize 1x1! -scale ${w}x${h}! -negate \) \
-define compose:args=50,50 -compose blend -composite -auto-level result.jpg
  • E.g. if your image is slightly underexposed, -auto-level will brighten it
  • On a test image at 1024x960 px it takes 0.375 seconds
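For intuition, the same trick can be sketched in numpy (a rough approximation, not imagemagick's exact arithmetic): blend the image 50/50 with the negation of its per-channel average colour, then stretch the result back to the full range. The synthetic array stands in for an underexposed `test.jpg`:

```python
import numpy as np

# synthetic slightly underexposed "photo" in place of test.jpg
rng = np.random.default_rng(0)
img = rng.integers(20, 120, size=(64, 64, 3)).astype(float)

avg = img.mean(axis=(0, 1))              # per-channel average (the -resize 1x1! step)
blended = 0.5 * img + 0.5 * (255 - avg)  # -compose blend with the -negate'd clone
lo, hi = blended.min(), blended.max()
levelled = (blended - lo) / (hi - lo) * 255  # -auto-level: stretch to [0, 255]

print(levelled.min(), levelled.max())  # 0.0 255.0
```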

White balance

  • Auto white balance is more powerful than auto-levelling, and is recommended instead
  • GIMP has a batch mode which you can run on the command line for a directory of images, see code and comments
    • The docs say it removes "0.05%" of the pixels at either end of each R,G,B channel, stretching the remaining histogram (docs)
    • Based on this alternative I suspect the "0.05%" is actually a typo for 5% total, i.e. 2.5% at either end.

It would be preferable to do this in a scripting language like Python (or Julia/Go), where the parameters can be controlled.

conda create -y -n img_utils
conda activate img_utils
conda install -y python scikit-image

import numpy as np
import skimage.exposure
from skimage.io import imread, imsave
from pathlib import Path

img_file_path = Path("test.jpg")
img_out_path = img_file_path.parent / (img_file_path.stem + "_adjusted" + img_file_path.suffix)
img = imread(img_file_path)
R, G, B = (img[:, :, c] for c in range(3))

Essentially, stretching in the way described just involves clipping anything below the lower percentile (2.5%) or above the upper percentile (97.5%) to those values, and then normalising the image.

This can be achieved by:

  • Calculating the inferior and superior percentiles: p0 and p1
  • Clipping to the percentile range [p0,p1]
  • Scaling up so that these become the new min and max

This can be done with np.percentile (with values of 2.5 and 97.5) passed into skimage.exposure.rescale_intensity.

percentile = 5       # total percentage of pixels to clip
p0 = percentile / 2  # lower bound: 2.5
p1 = 100 - p0        # upper bound: 97.5

img_adj = np.dstack([
    skimage.exposure.rescale_intensity(
        img[:,:,c],
        in_range=tuple(np.percentile(img[:,:,c], (p0, p1)).tolist())
    )
    for c in range(3) # 3 channel RGB
])
imsave(img_out_path, img_adj)
  • This looks identical to the result calculated by GIMP, and is obviously faster if you first resize the image to a smaller size.
  • If you set percentile = 0 then the white balance will be 'non-destructive', i.e. will preserve information at the far ends of each channel (which may in fact be important for computer vision applications)
    • This is simply the min and max of the image, so you can replace the tuple from np.percentile with the string 'image' and scikit-image will handle it for you. In fact, this is the default, so you can remove the in_range argument from the call to rescale_intensity altogether.
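As a quick sanity check of that claim (a sketch on a random array, not part of the original notes), the three calls below should produce identical results:

```python
import numpy as np
from skimage.exposure import rescale_intensity

rng = np.random.default_rng(0)
img = rng.integers(30, 200, size=(32, 32), dtype=np.uint8)

# explicit 0th/100th percentiles == the image min/max == the default
via_percentile = rescale_intensity(img, in_range=tuple(np.percentile(img, (0, 100)).tolist()))
via_image = rescale_intensity(img, in_range="image")
via_default = rescale_intensity(img)

assert (via_percentile == via_image).all()
assert (via_image == via_default).all()
```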

Here's an example I made to preprocess book photos for document image dewarping, without the percentile clipping, only white balancing. After this, it also pads both sides with black pixels to make a 1024px tall by 960px wide vertical image.

import numpy as np
from skimage import img_as_ubyte
from skimage.io import imread, imsave
from skimage.transform import rotate, rescale
from skimage.exposure import rescale_intensity
from pathlib import Path

img_file_path = Path("test.jpg")
img_out_path = img_file_path.parent / (
    img_file_path.stem + "_adjusted" + img_file_path.suffix
)

img = imread(img_file_path)

is_vert = img.shape[0] > img.shape[1]
long_edge_len = img.shape[1 - int(is_vert)]

to_shape = (1024, 960, 3)
downscale = to_shape[0] / long_edge_len
img = img_as_ubyte(rescale(img, downscale, channel_axis=-1, anti_aliasing=True))
# (channel_axis=-1 replaces multichannel=True in scikit-image >= 0.19)

# Rotate after downscaling, it's faster
if img.shape[0] < img.shape[1]:
    # rotate returns a float image; convert back to uint8
    img = img_as_ubyte(rotate(img, 90))  # EXIF is unreliable, fix later if wrong

img_adj = np.dstack([rescale_intensity(img[:, :, c]) for c in range(3)])

pad_cols = to_shape[1] - img_adj.shape[1]
pad_arr_l = np.zeros((to_shape[0], pad_cols // 2, to_shape[2]), dtype=img_adj.dtype)
pad_arr_r = np.zeros((to_shape[0], pad_cols - pad_arr_l.shape[1], to_shape[2]), dtype=img_adj.dtype)
img_adj = np.hstack([pad_arr_l, img_adj, pad_arr_r]) # pad all black on either side

imsave(img_out_path, img_adj)
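To sanity-check the padding arithmetic at the end of the script, here's the same logic on a synthetic array (assuming a 4:3 input, so the width after downscaling, 768px, fits inside the 960px target):

```python
import numpy as np

to_shape = (1024, 960, 3)

# stand-in for a downscaled 4:3 photo: long edge 1024, short edge 768, already vertical
img_adj = np.zeros((1024, 768, 3), dtype=np.uint8)

pad_cols = to_shape[1] - img_adj.shape[1]  # 960 - 768 = 192 columns of black
pad_arr_l = np.zeros((to_shape[0], pad_cols // 2, to_shape[2]), dtype=img_adj.dtype)
pad_arr_r = np.zeros((to_shape[0], pad_cols - pad_arr_l.shape[1], to_shape[2]), dtype=img_adj.dtype)
padded = np.hstack([pad_arr_l, img_adj, pad_arr_r])

print(padded.shape)  # (1024, 960, 3)
```

Note that pad_cols goes negative for inputs wider than 4:3 (scaled width over 960px), in which case the image would need cropping rather than padding.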