# Preprocessing camera photos
I sometimes use my camera to take photos as input for data processing, and ImageMagick is a convenient way to preprocess them to improve image quality. Some examples are collected below.
## Levels
To auto-level an image, use this (defining `$wxh` as the image's width×height first, so the averaged clone is scaled back up to full size):

```sh
wxh=$(identify -format "%wx%h" test.jpg)
convert test.jpg \( +clone -resize 1x1! -scale $wxh! -negate \) \
  -define compose:args=50,50 -compose blend -composite -auto-level result.jpg
```
- E.g. if your image is slightly underexposed, `-auto-level` will brighten it
- On a test image at 1024x960 px it takes 0.375 seconds
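If you want to drive this from Python rather than the shell, here is a minimal sketch using `subprocess` (`auto_level` is a name invented here; it assumes ImageMagick's `convert` and `identify` are on the PATH):

```python
import subprocess


def auto_level(src, dst):
    # Reproduce the convert one-liner above: blend the image 50/50 with its
    # negated average colour, then apply -auto-level
    wxh = subprocess.check_output(
        ["identify", "-format", "%wx%h", src], text=True
    ).strip()
    subprocess.run(
        [
            "convert", src,
            "(", "+clone", "-resize", "1x1!", "-scale", f"{wxh}!", "-negate", ")",
            "-define", "compose:args=50,50",
            "-compose", "blend", "-composite", "-auto-level", dst,
        ],
        check=True,
    )


auto_level("test.jpg", "result.jpg")
```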
## White balance
- Auto white balance is more powerful than auto-levelling, and is recommended instead
- GIMP has a batched mode which you can run on the command line for a directory of images, see code and comments
- The docs say it removes "0.05%" of the pixels at either end of each R, G, B channel, stretching the remaining histogram (docs)
- Based on this alternative I suspect that the "0.05%" is actually a typo and it's 5% total, i.e. 2.5% at either end
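To pin down the two readings as (lower, upper) percentile pairs, a trivial sketch:

```python
# Two readings of the GIMP docs as (lower, upper) percentile pairs:
per_docs = (0.05, 100 - 0.05)     # "0.05%" clipped at either end, taken literally
per_suspected = (2.5, 100 - 2.5)  # 5% total, i.e. 2.5% at either end
```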
It would be preferable to do this in a scripting language like Python (or Julia/Go), where the parameters can be controlled:
```sh
conda create -y -n img_utils
conda activate img_utils
conda install -y python scikit-image
```
```python
import numpy as np
import skimage.exposure  # for rescale_intensity, used below
from skimage.io import imread, imsave
from pathlib import Path

img_file_path = Path("test.jpg")
img_out_path = img_file_path.parent / (
    img_file_path.stem + "_adjusted" + img_file_path.suffix
)
img = imread(img_file_path)
R, G, B = (img[:, :, c] for c in range(3))  # per-channel views of the RGB image
```
Essentially, the stretch described just involves clipping anything below the lower percentile or above the upper percentile to those values, and then normalising the image to the full range.
This can be achieved by:
- Calculating the lower and upper percentiles, p0 and p1
- Clipping to the percentile range [p0, p1]
- Scaling up so that these become the new min and max
This can be done with `np.percentile` (here with values of 2.5 and 97.5) passed into `skimage.exposure.rescale_intensity`.
```python
percentile = 5
p0 = percentile / 2  # clip 2.5% at the bottom...
p1 = 100 - p0        # ...and 2.5% at the top
img_adj = np.dstack([
    skimage.exposure.rescale_intensity(
        img[:, :, c],
        in_range=tuple(np.percentile(img[:, :, c], (p0, p1)).tolist()),
    )
    for c in range(3)  # 3 channel RGB
])
imsave(img_out_path, img_adj)
```
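For intuition, `rescale_intensity` with a percentile `in_range` is just a clip followed by a linear stretch; here is a minimal single-channel sketch of the equivalent arithmetic (assuming a uint8 channel; results match up to rounding):

```python
import numpy as np


def stretch_channel(channel, p0=2.5, p1=97.5):
    # Equivalent of rescale_intensity(channel, in_range=(lo, hi)) for uint8:
    # clip to the percentile range, then map [lo, hi] linearly onto [0, 255]
    lo, hi = np.percentile(channel, (p0, p1))
    clipped = np.clip(channel, lo, hi).astype(float)
    return ((clipped - lo) / (hi - lo) * 255).astype(np.uint8)
```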
- The result looks identical to the one calculated by GIMP, and is obviously faster if you first resize the image to a smaller size.
- If you set `percentile = 0` then the white balance will be 'non-destructive', i.e. it will preserve information at the far ends of each channel (which may in fact be important for computer vision applications)
  - This is simply the min and max of the image, so you can replace the tuple from `np.percentile` with the string `'image'` and scikit-image will handle it for you. In fact, this is the default, so you can remove the `in_range` argument from the call to `rescale_intensity` altogether.
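To make that equivalence concrete, a quick check (assuming `img` and the imports from the session above; these three calls should give identical results):

```python
ch = img[:, :, 0]
a = skimage.exposure.rescale_intensity(
    ch, in_range=tuple(np.percentile(ch, (0, 100)).tolist())
)
b = skimage.exposure.rescale_intensity(ch, in_range="image")  # min/max of the input
c = skimage.exposure.rescale_intensity(ch)  # in_range defaults to "image"
assert (a == b).all() and (b == c).all()
```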
Here's an example I made to preprocess book photos for document image dewarping, with no percentile clipping, only white balancing. After this, it also pads both sides with black pixels to make a 1024px tall by 960px wide vertical image.
```python
import numpy as np
from skimage import img_as_ubyte
from skimage.io import imread, imsave
from skimage.transform import rotate, rescale
from skimage.exposure import rescale_intensity
from pathlib import Path

img_file_path = Path("test.jpg")
img_out_path = img_file_path.parent / (
    img_file_path.stem + "_adjusted" + img_file_path.suffix
)
img = imread(img_file_path)
is_vert = img.shape[0] > img.shape[1]
long_edge_len = img.shape[1 - int(is_vert)]  # height if portrait, else width
to_shape = (1024, 960, 3)
downscale = to_shape[0] / long_edge_len  # scale so the long edge becomes 1024px
img = img_as_ubyte(rescale(img, downscale, multichannel=True, anti_aliasing=True))
# Rotate after downscaling, it's faster
if img.shape[0] < img.shape[1]:
    # rotate returns a float image, so cast back to uint8 for a consistent dtype
    img = img_as_ubyte(rotate(img, 90))  # EXIF is unreliable, fix later if wrong
# Per-channel min-max stretch (the non-destructive white balance from above)
img_adj = np.dstack([rescale_intensity(img[:, :, c]) for c in range(3)])
pad_cols = to_shape[1] - img_adj.shape[1]
pad_arr_l = np.zeros((to_shape[0], pad_cols // 2, to_shape[2]), dtype=img_adj.dtype)
pad_arr_r = np.zeros(
    (to_shape[0], pad_cols - pad_arr_l.shape[1], to_shape[2]), dtype=img_adj.dtype
)
img_adj = np.hstack([pad_arr_l, img_adj, pad_arr_r])  # pad all black on either side
imsave(img_out_path, img_adj)
```
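And to run the white-balance step over a whole directory (the batching GIMP was used for above), a minimal sketch assuming a flat folder of JPEGs; `white_balance_dir` is a name made up here:

```python
from pathlib import Path

import numpy as np
from skimage.exposure import rescale_intensity
from skimage.io import imread, imsave


def white_balance_dir(src_dir, percentile=5.0):
    # Apply the per-channel percentile stretch to every JPEG in src_dir,
    # writing each result next to its input as "<name>_adjusted.jpg"
    p0 = percentile / 2
    p1 = 100 - p0
    for path in sorted(Path(src_dir).glob("*.jpg")):
        img = imread(path)
        img_adj = np.dstack([
            rescale_intensity(
                img[:, :, c],
                in_range=tuple(np.percentile(img[:, :, c], (p0, p1)).tolist()),
            )
            for c in range(3)
        ])
        imsave(path.with_name(path.stem + "_adjusted" + path.suffix), img_adj)


white_balance_dir("photos/")
```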