Image Processing Tutorial Using OpenCV
Jaden Booher & Zeid Solh
Introduction
Over the last decade, machine learning, deep learning, and artificial intelligence have significantly shaped technology development. With the steady growth of computer vision, the capability for computers to identify and process images, the number of use cases keeps expanding. Image processing is at the core of these rapidly growing technologies, from IoT to self-driving cars to facial recognition and biometrics.
What is image processing?
Image processing is “the process of transforming an image into a digital form and performing certain operations to get some useful information from it” (Simplilearn). The purpose is to transform the original image into a new form that software can digest and use. Examples of image processing include, but are not limited to, visualization (finding an object not visible in the image), sharpening and restoration (enhancing the picture), recognition (detecting an object in a photo), and retrieval (reverse-searching a database based on the image). In this tutorial, we will walk through some basic image processing using Python’s OpenCV library.
Tutorials
Tutorial 1: Changing Color Spaces
One of the most basic forms of image processing is changing color spaces. The OpenCV library includes more than 150 color-space conversions. This tutorial will focus on converting from BGR, which is identical to RGB except that the channel order is reversed, to grayscale, also known as black and white. Many image processing algorithms require grayscale input, as it removes unnecessary complexity and allows for more accurate and efficient processing.
Let’s try the following code:
import cv2

image = cv2.imread('cookies.jpg')                    # load the image from disk
cv2.imshow('Original', image)                        # display the original image
grayscale = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # convert BGR to grayscale
cv2.imshow('Grayscale', grayscale)                   # display the grayscale image
cv2.waitKey(0)                                       # keep the windows open until a key is pressed
cv2.destroyAllWindows()
As expected, we were able to convert the color image to grayscale:
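Grayscale is only one of many conversions available. As a quick sketch, you can list every color-conversion flag your OpenCV build supports (the exact list depends on your OpenCV version):

import cv2

# enumerate all color-conversion flags built into this OpenCV installation
flags = [f for f in dir(cv2) if f.startswith('COLOR_')]
print(len(flags), 'conversion flags available, for example:', flags[:5])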
Tutorial 2: Histogram of Color Channels
Another valuable feature of OpenCV is the color channel histogram function. A histogram of color channels is a way to visualize the color distributions across different intensity levels. It works by creating a graph that shows the frequency of occurrence of different color values in an image. Color channel histograms can be used in image processing to analyze pixel value distribution, adjust brightness and contrast, segment regions or objects, and recognize objects based on color or texture. They provide valuable insights into the composition of images and can help develop more effective image processing algorithms.
Here, we will show how to create a color histogram using OpenCV and the matplotlib library:
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('house.jpg')   # import image
color = ('b', 'g', 'r')         # one plot per channel: blue, green, red
# plot the histogram of each color channel
for i, col in enumerate(color):
    histr = cv2.calcHist([img], [i], None, [256], [0, 256])
    plt.plot(histr, color=col)
    plt.xlim([0, 256])
plt.show()                      # display histogram
cv2.imshow('Original', img)     # display original image
cv2.waitKey(0)
As you can see below, the color channel histogram uses one colored line per channel to graph how many pixels take on each intensity value:
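The same function also works on a single-channel image. As a minimal sketch, reusing 'house.jpg' from above, passing the grayscale version with channel index 0 yields an intensity histogram:

import cv2
from matplotlib import pyplot as plt

# convert to a single-channel image, then histogram channel index 0
gray = cv2.cvtColor(cv2.imread('house.jpg'), cv2.COLOR_BGR2GRAY)
hist = cv2.calcHist([gray], [0], None, [256], [0, 256])
plt.plot(hist, color='k')   # plot the intensity histogram in black
plt.xlim([0, 256])
plt.show()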
Tutorial 3: Image Smoothing
A third way to use OpenCV for image processing is image smoothing, also known as image blurring, a technique used to reduce noise and sharpness in an image. We remove the noise because it often provides no valuable information and can sometimes lead to image processing and storage issues. Overall, image smoothing aims to make pictures more visually appealing and easier to analyze by removing unwanted details and emphasizing essential features.
For example, the picture of the ducklings below has excessive noise due to the fine detail of their ruffled feathers. Here, we will apply two different filters to the image. These filters work by averaging the pixel values in the neighborhood of each pixel, resulting in a smoother appearance with fewer high-frequency components. Larger kernels produce stronger blurring.
Here, we will use OpenCV and matplotlib to apply a 5x5 filter, which replaces each pixel with the average of the 25 pixels in the 5x5 neighborhood centered on it:
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('ducklings.png')           # import image
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # matplotlib expects RGB, not OpenCV's BGR
kernel = np.ones((5,5), np.float32)/25      # 5x5 averaging kernel (every weight = 1/25)
dst = cv2.filter2D(img, -1, kernel)         # convolve the kernel with the image
# display the images side by side
plt.subplot(121), plt.imshow(img), plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122), plt.imshow(dst), plt.title('Smoothed with a 5x5 averaging filter')
plt.xticks([]), plt.yticks([])
plt.show()
Here, you can see how the image on the right is blurred but is still recognizable:
Here, we apply the smoothing again, but with a 7x7 filter; notice how the result is more blurred than with the 5x5 kernel (a minimal sketch of the change follows below):
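Only the kernel changes between the two runs. Below is a sketch of the 7x7 version; for comparison, it also shows cv2.GaussianBlur, which performs true Gaussian smoothing by weighting nearby pixels more heavily than distant ones:

import cv2
import numpy as np

img = cv2.imread('ducklings.png')
kernel7 = np.ones((7,7), np.float32)/49   # 7x7 averaging kernel: wider neighborhood, stronger blur
dst7 = cv2.filter2D(img, -1, kernel7)
gauss = cv2.GaussianBlur(img, (7,7), 0)   # sigma=0 lets OpenCV derive it from the kernel size
cv2.imshow('7x7 average', dst7)
cv2.imshow('7x7 Gaussian', gauss)
cv2.waitKey(0)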
Tutorial 4: Interactive Foreground Extraction
Interactive foreground extraction lets you cut the subject of an image out of its background using the GrabCut algorithm, even if you don’t know how to use Photoshop. The user draws a rectangle around the foreground region; the algorithm then iteratively segments the image until it produces a final, focused result. One of the fascinating aspects of this algorithm is that we can fine-tune it using a “mask,” which allows the user to specify foreground and background pixels directly. The algorithm also supports “strokes,” which let the user tell it, after each iteration, where it misclassified foreground and background to further refine the results.
import numpy as np
import cv2

img = cv2.imread('person.jpg')   # import image
# start with an empty mask; this can be refined by hand for better results
mask = np.zeros(img.shape[:2], np.uint8)
# temporary arrays used internally by the algorithm
bgdModel = np.zeros((1,65), np.float64)
fgdModel = np.zeros((1,65), np.float64)
rect = (150,50,500,470)          # rectangle (x, y, width, height) enclosing the foreground
# apply the grabCut method for 20 iterations, initialized from the rectangle
cv2.grabCut(img, mask, rect, bgdModel, fgdModel, 20, cv2.GC_INIT_WITH_RECT)
# pixels labeled definite/probable background become 0, everything else 1
mask2 = np.where((mask==2)|(mask==0), 0, 1).astype('uint8')
img = img*mask2[:,:,np.newaxis]      # zero out the background
cv2.imshow('Foreground Image', img)  # display the resulting image
cv2.waitKey(0)
Here, you can see how the algorithm separates the girl from the background:
Though there are a couple of issues with her hair, refining the mask makes the foreground extraction more accurate; below, you can see that the stray wisps of her hair are now gone:
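A minimal sketch of that refinement, following the mask-initialization approach from the OpenCV GrabCut tutorial. It assumes a hypothetical hand-painted touch-up image 'newmask.png', with white strokes over pixels that must be foreground and black strokes over pixels that must be background:

# continuing from the script above; reload the original since img now holds the masked result
img = cv2.imread('person.jpg')
# 'newmask.png' is a hypothetical hand-painted touch-up image
newmask = cv2.imread('newmask.png', 0)   # load as grayscale
mask[newmask == 0] = 0                   # force these pixels to background
mask[newmask == 255] = 1                 # force these pixels to foreground
# re-run grabCut, this time initialized from the mask instead of a rectangle
mask, bgdModel, fgdModel = cv2.grabCut(img, mask, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)
mask2 = np.where((mask==2)|(mask==0), 0, 1).astype('uint8')
cv2.imshow('Refined Foreground', img*mask2[:,:,np.newaxis])
cv2.waitKey(0)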
Tutorial 5: Tracking a Colored Object
Tracking a blue ball or any other colored object might initially seem complex, but it becomes straightforward with OpenCV. We can push our capabilities even further by transitioning from analyzing a single photo to processing a live video feed. OpenCV treats a video feed as a continuous sequence of images and processes each frame just as the previous tutorials processed single photos.
The first step is to convert the color space from BGR to HSV (Hue, Saturation, Value), since color masking is far more reliable in HSV, where a color’s hue is separated from its brightness. The next step is to define the range of your chosen color in HSV; we chose blue in this example. We then create a binary mask, where any pixel in the input image whose values fall within the specified color range is set to white, and to black otherwise. We can now pass our mask to OpenCV’s findContours function, which returns the outlines of the white regions, i.e., the blue objects present. The final step uses the largest contour’s bounding rectangle to draw a box around our object with OpenCV’s rectangle function.
import numpy as np
import cv2

cap = cv2.VideoCapture(0)   # open the default camera
while True:
    _, frame = cap.read()
    # Convert BGR to HSV
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # define range of blue color in HSV
    lower_blue = np.array([100,50,50])
    upper_blue = np.array([130,255,255])
    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower_blue, upper_blue)
    # [-2] selects the contour list under both OpenCV 3.x and 4.x return signatures
    bluecnts = cv2.findContours(mask.copy(),
                                cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    if len(bluecnts) > 0:
        blue_area = max(bluecnts, key=cv2.contourArea)   # pick the largest blue region
        (xg, yg, wg, hg) = cv2.boundingRect(blue_area)
        cv2.rectangle(frame, (xg, yg), (xg+wg, yg+hg), (0,255,0), 2)
    cv2.imshow('frame', frame)
    if cv2.waitKey(10) & 0xFF == ord('q'):   # quit on 'q'
        break
cap.release()
cv2.destroyAllWindows()
Here, you can see how the code correctly identified and tracked the blue foam roller.
Tracking objects by color is an essential task in computer vision and image processing, as it allows us to track objects automatically based on their color characteristics in a computationally efficient way. For instance, this technique is excellent for the localization aspect of the class project, as players can wear different colored shirts and be easily distinguished. There is no need for complex feature-based algorithms to track a person’s face, and because hue is separated from brightness in HSV, the method is reasonably robust against changes in lighting.
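To track a different color, a handy trick from the OpenCV documentation is to convert a single pixel of that color from BGR to HSV and build the mask range around the resulting hue (the documentation suggests roughly [H-10, 100, 100] to [H+10, 255, 255] as a starting point):

import numpy as np
import cv2

# find the HSV value of pure green (note the BGR channel order)
green = np.uint8([[[0, 255, 0]]])
hsv_green = cv2.cvtColor(green, cv2.COLOR_BGR2HSV)
print(hsv_green)   # [[[60 255 255]]] -> build a mask range around hue 60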
Tutorial 6: Tracking a Face
We now move to the more computationally demanding task of tracking a person’s face, which is highly relevant to various technologies today, including facial recognition systems, augmented reality applications, and video surveillance systems.
The code below first loads the Haar cascade classifier file for frontal face detection. The classifier is trained on positive images (images with faces) and negative images (images without faces), and the algorithm learns the features that best distinguish faces from non-faces. We then convert each frame to grayscale, as in Tutorial 1, apply the face detection algorithm, and draw a green rectangle around each detected face. Converting the image to grayscale first is essential because grayscale images are computationally cheaper to process, and color information is not important for detecting faces.
There is also some additional commented-out code that similarly tracks the eyes.
import cv2

# Enable camera and set the frame size
cap = cv2.VideoCapture(0)
cap.set(3, 640)   # property 3 = frame width
cap.set(4, 420)   # property 4 = frame height
# import cascade file for facial recognition
faceCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
'''
# if you want to detect any other object, for example eyes, use one more layer of classifier as below:
eyeCascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye_tree_eyeglasses.xml")
'''
while True:
    success, img = cap.read()
    imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Getting corners around the face
    faces = faceCascade.detectMultiScale(imgGray, 1.3, 5)   # 1.3 = scale factor, 5 = minimum neighbors
    # drawing bounding box around face
    for (x, y, w, h) in faces:
        img = cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 3)
    '''
    # detecting eyes
    eyes = eyeCascade.detectMultiScale(imgGray)
    # drawing bounding box for eyes
    for (ex, ey, ew, eh) in eyes:
        img = cv2.rectangle(img, (ex, ey), (ex+ew, ey+eh), (255, 0, 0), 3)
    '''
    cv2.imshow('face_detect', img)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyWindow('face_detect')
In the picture below, we see the software correctly tracking my face.
Detecting faces has become crucial for several reasons. First, it plays a significant role in security: CCTVs can now accurately detect human presence instead of relying solely on motion detection, which could mistake a dog’s movement for unusual activity and trigger a false security alert. Second, in biometric authentication systems, face ID verification provides convenience for unlocking phones and authenticating purchases. Third, face detection improves the user experience in apps that combine it with artificial intelligence and machine learning. For instance, apps such as Photos and Facebook use it to automatically detect which people are present in a picture, making it easier for users to find photos or videos of certain people.
Conclusion
Image processing is a powerful tool for manipulating and analyzing digital images. In this tutorial, we covered various techniques, from changing color spaces and smoothing images to tracking colored objects and faces.
Changing color spaces strips unnecessary complexity from an image, allowing more accurate and efficient processing. Analyzing the histograms of color channels is essential for understanding pixel value distribution, adjusting brightness and contrast, and recognizing objects based on color and texture. Image smoothing, or blurring, is critical for removing noise, improving image quality, and lowering an image’s storage size. Interactive foreground extraction isolates specific objects or regions of an image for further analysis. Object tracking analyzes successive video frames to detect and follow colored objects, providing bounding boxes around them for precise localization. Face detection algorithms use learned facial features to accurately identify human faces in images or video frames. Together, these tutorials provide valuable insights into image-processing principles and techniques.
These techniques are just a tiny sample of the many tools available in image processing. With the OpenCV library and further exploration, one can create stunning and informative images for a wide range of applications, including IoT.