Contour box for video WIKI - ECE-180D-WS-2023/Knowledge-Base-Wiki GitHub Wiki


Have you ever seen security footage like the one below?

Notice how the surveillance camera can recognize the human faces in the picture and draw a contour box around each detected face. In some cases, the camera is even able to label a face as a VIP or as blacklisted. Here, the surveillance system is performing object detection and image segmentation: it detects the faces within the camera footage, boxes them, and isolates them from the rest of the frame. Such tasks are typical objectives in modern machine vision research, where researchers race to build the most efficient and capable algorithms. Deep learning models from companies such as Microsoft and Google have reported surpassing human-level accuracy on image classification benchmarks built from large datasets such as ImageNet.

Image detection is an indispensable function of many modern technological innovations. It can teach self-driving cars to distinguish between road signs or help healthcare workers detect tumors in CT scans. Furthermore, machine vision technology can be integrated into modern IoT systems, creating new possibilities in data collection, process automation, robotics, and many other fields.

In this tutorial, you will learn how to detect a mono-colored object and draw a contour box around it in a live video stream captured by your laptop camera.

Software and materials required

  • Working laptop with webcam.
  • Anaconda installed (if you don't have Anaconda installed, please visit the official website for download instructions).

Let's get right into it!

Step 1: Setting up the environment & Installing OpenCV

Before we start the project, it is good practice to create a virtual environment for it. If you have never heard of a virtual environment before, it is a directory that contains a specific collection of packages that you have installed. By creating an environment for each of your projects, you avoid the troubles that may arise from different installed packages having conflicting dependencies.

To create a new environment in Anaconda, open "Anaconda Powershell Prompt" (the fastest way to find it is to search for it in your Windows search bar).


In the PowerShell prompt, type in conda create --name YOURNAME, replacing YOURNAME with a name of your choice. I called my environment "testenv".

You have now created an environment. Activate it by typing conda activate YOURNAME.


Notice that the (base) in front of the directory now reads (testenv). You are operating in the environment you have just created.

Let's install the necessary packages for the project!

For this project, you will need OpenCV. What is OpenCV?

From their official website:

"OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products."

OpenCV is an open-source library with a repertory of useful functions and filters for image processing and machine vision, a few of which will be used in this project. Let's first install it.

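In the environment you have just created, type in conda install -c conda-forge opencv. (Older guides use the menpo channel, but it tends to offer outdated OpenCV builds that may not match the API used in this tutorial.)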


You will see the installation information, including progress bars as OpenCV is being downloaded. At one point you will be asked to confirm whether to proceed. Type 'y' to finish the installation.

Step 2: Changing the color space of a frame & Selecting a color to capture

Before we jump into a video stream, let's begin by thinking about one frame in the video, or a picture.

The very first step in recognizing the mono-colored object is to transform the color space of the picture. You can do this by using the cv.cvtColor function in OpenCV.

hsv_picture = cv.cvtColor(picture, cv.COLOR_BGR2HSV)

cv.cvtColor() transforms the picture from the BGR color space to the HSV color space. We do this because OpenCV reads pictures in BGR format, which describes the color of every pixel as a combination of blue, green and red intensities. However, this format does not make it easy to isolate a particular color. Color separation is performed more effectively in the HSV color space. HSV stands for "Hue", "Saturation" and "Value". Here is a quote from Quora:

"'Hue' represents the color, 'Saturation' represents the amount to which that respective color is mixed with white and 'Value' represents the amount to which that respective color is mixed with black (Gray level)."

In the HSV space, the color itself is represented by the "Hue" value, which OpenCV stores on a scale of 0 to 179 for 8-bit images. The "Saturation" and "Value" of a color are influenced by the lighting conditions and the reflective properties of the objects in the picture. Below is a list of useful "Hue" ranges for typical colors (source):

  • Orange 0-22
  • Yellow 22-38
  • Green 38-75
  • Blue 75-130
  • Violet 130-160
  • Red 160-179

Let's define a range of HSV values that will include the color of the object we desire to detect:

lower_blue = np.array([80,50,50]) 
upper_blue = np.array([130,255,255])

lower_blue is the lower limit of the blue color I want to capture, and upper_blue is the upper limit. The values are listed in the order of Hue, Saturation and Value. Using the list above, the Hue value for blue lies somewhere between 75 and 130. Here I used a lower hue limit of 80 and an upper hue limit of 130. This range should include most blue colors. As for Saturation and Value, I derived the above values by intuition, as well as trial and error. Trial and error is probably the best way to find a desirable range for the color you want to capture.
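One way to speed up the trial and error is to build both bounds from a hue interval, so that only a few numbers need tuning. This is a hypothetical helper (hsv_bounds and its default minimums of 50 are illustrative, taken from the values used above), not an OpenCV function.

```python
import numpy as np

def hsv_bounds(hue_low, hue_high, sat_min=50, val_min=50):
    """Return (lower, upper) HSV bounds for a hue interval on OpenCV's 0-179 scale."""
    if not 0 <= hue_low <= hue_high <= 179:
        raise ValueError("hue limits must satisfy 0 <= low <= high <= 179")
    lower = np.array([hue_low, sat_min, val_min], dtype=np.uint8)
    upper = np.array([hue_high, 255, 255], dtype=np.uint8)
    return lower, upper

# Reproduces the blue range defined above
lower_blue, upper_blue = hsv_bounds(80, 130)
print(lower_blue, upper_blue)
```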

Step 3: Masking the image & Finding contours

Let's experiment with a single picture:

I want to draw a contour box around the blue flower in the picture. We should begin with the steps introduced in Step 2.

import cv2 as cv
import numpy as np

# read in the image as a 3-channel BGR array
blu = cv.imread('blu.jpeg', cv.IMREAD_COLOR)
# change the color space from BGR to HSV
hsv = cv.cvtColor(blu, cv.COLOR_BGR2HSV)

I have named the picture of the blue flower 'blu.jpeg'. This picture should be placed in the same directory (folder) as your Python (.py or .ipynb) file. cv.imread reads an image file into a BGR array, so the next call converts it into the HSV space.

I define my color range as:

# define blue color
lower_blue = np.array([110,50,50])
upper_blue = np.array([130,255,255])

Let's filter the image to include only our desired object:

mask = cv.inRange(hsv, lower_blue, upper_blue)

cv.inRange(src, lowerb, upperb) accepts 3 arguments. It checks every pixel in src. If all of a pixel's channel values lie between the lower and upper boundaries (lowerb and upperb, inclusive), the corresponding entry in the output mask is set to 255. All other entries in the mask are set to 0. The src image itself is left unchanged.

We should inspect our filtered image. We can display images in OpenCV with the following syntax:

cv.imshow('mask', mask)
cv.waitKey(0)
cv.destroyAllWindows()

cv.imshow(winname, mat) displays the image stored in mat in a new window named winname. For 3-channel images it assumes BGR channel order; a single-channel image such as our mask is shown in grayscale.

cv.waitKey(0) keeps the window open indefinitely until a key is pressed.

cv.destroyAllWindows() then closes the window.

cv.waitKey() and cv.destroyAllWindows() should always accompany cv.imshow(). Otherwise the display window may misbehave and Python might stop responding.

Here is the output image:

As you can see, the inRange() function has filtered the original image. The pixels with HSV values between the defined lower and upper bounds have been given a value of 255 in the mask. In a single-channel 8-bit image, 255 indicates maximum brightness, which is white. The rest of the pixels have been given a value of 0, which indicates minimum brightness, or black. We call this image a binary mask since every entry in it can only take one of 2 values (0 or 255).

The binary mask we have created makes it much easier for algorithms to detect contours within an image. Find contours with the function below:

contours, hierarchy = cv.findContours(image, mode, method)

cv.findContours searches for all the contours within an image. The first output, contours, is a Python sequence of all contours found in the image. Each individual contour is a NumPy array of the (x, y) coordinates of the contour's boundary points. The second output, hierarchy, contains information about the next, previous, parent and child contours of every contour, thereby enabling an examination of image topology. For our purposes hierarchy can be ignored.

The function takes 3 inputs. image is the binary mask you have just created. mode and method are OpenCV constants that specify how the function should search for and store contours. We will use mode = cv.RETR_TREE and method = cv.CHAIN_APPROX_SIMPLE. Visit the OpenCV documentation for this function to learn more about these inputs.

I will now use cv.findContours on the binary mask we have created:

contours, hierarchy = cv.findContours(mask, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)

We can use the cv.drawContours() function to draw the calculated contours on a picture:

img_with_contour = cv.drawContours(img, contours, contourchoice, color, thickness)

Let me explain the inputs. img is the image you want to draw the contours onto, and contours is the contour list you have generated. contourchoice lets you choose which contour from the list you want to draw: refer to a desired contour by its index in the list, or pass -1 to draw all contours. color is the color of the contour lines in BGR order, [B, G, R], and thickness is an integer giving the thickness of the contour lines.

# draw all contours in red on a copy, since drawContours modifies its input image
blu_with_contour = cv.drawContours(blu.copy(), contours, -1, [0,0,255], 2)
cv.imshow('contour', blu_with_contour)
cv.waitKey(0)
cv.destroyAllWindows()

Here are the results:

I drew the contours onto the original image even though they were found on the binary mask, since the point of generating a binary mask is to make it easier to find the contours of objects in the original image.

Step 4: Choosing the right contour to draw a box around & Drawing the box

We have found all the contours within the image. Now we have to draw a box around the flower in the image.

Recall that our contours object is a Python sequence of all the contour lines. Checking its length with len(contours) shows that the program has found 16 different contours. But which one is the right one?

We can find out by first drawing all the contour boxes:

blu_with_box = blu.copy()  # draw on a copy to keep the original picture clean
for contour in contours:
    x, y, w, h = cv.boundingRect(contour)
    cv.rectangle(blu_with_box, (x, y), (x + w, y + h), (255, 255, 0), 5)

cv.imshow('box', blu_with_box)
cv.waitKey(0)
cv.destroyAllWindows()

cv.boundingRect(arr) computes the smallest upright rectangle that encloses the set of points stored in the array arr. Since each element in the contour list is just an array of the points that make up a contour line, this function gives us the rectangle that encloses a contoured object.

out = cv.rectangle(img, pt1, pt2, color, thickness) draws a rectangle on img between the two opposite corners pt1 and pt2, each given as an (x, y) coordinate pair. cv.boundingRect returns x, y as the top-left corner of the rectangle, and w, h as its width and height. Therefore, the two opposite corners of the box are located at (x, y) and (x+w, y+h). color specifies the color of the lines of the box in BGR order, [B, G, R]. thickness is a single integer value indicating the thickness of the lines of the rectangle.

For this image, we want to select the contour with the largest area inside. We can find the area inside each contour with a simple function: cv.contourArea(contours[i]). This function returns the area enclosed by the specific contour contours[i]. Let's find the largest contour area:

contour_size = np.ones(len(contours))
for i in range(len(contours)):
    contour_size[i] = cv.contourArea(contours[i])

max(contour_size)

I initialized a NumPy array of ones with the same length as the contour list. Then I used a for loop to compute the area of every contour and fill the newly initialized array with these values. After that I used the max() function to find the largest of these areas.

Let's try drawing a box only around the selected contour with the largest area:

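# pick the contour whose area is the largest
largest = contours[int(np.argmax(contour_size))]
x, y, w, h = cv.boundingRect(largest)
# draw on a copy so that earlier drawings do not show up in the result
blu_with_box = cv.rectangle(blu.copy(), (x, y), (x + w, y + h), (255, 255, 0), 5)

cv.imshow('box', blu_with_box)
cv.waitKey(0)
cv.destroyAllWindows()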

Here is the output:

Great! We have successfully found out how to draw a box around a desired mono-colored object in an image! The only thing left to do is to extend this technique to a streaming video.

Final Step (Step 5): Applying everything to a video

Let's put everything we've learnt together:

We start by turning on the camera:

cap = cv.VideoCapture(0)

cv.VideoCapture(0) opens the camera with device index 0, which is usually your built-in webcam.

Then, create a while loop. We want to loop indefinitely until we shut down the program.

while True:
    # enter your code for contouring and boxing

The first line within the loop captures a frame from the camera:

_, frame = cap.read()

frame is the picture that is taken by your webcam. The first returned value, which we discard with _, is a flag reporting whether the read succeeded. Now we can apply every step we did above to the captured frame:

Step 2:

    # Convert BGR to HSV
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    # define range of blue color in HSV
    lower_blue = np.array([80,50,50])
    upper_blue = np.array([130,255,255])

Step 3:

    # Threshold the HSV image to get only blue colors
    mask = cv.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND mask and original image
    res = cv.bitwise_and(frame,frame, mask= mask)
    
    # use median filter to filter out the camera noise 
    mask_sm = cv.medianBlur(mask, 13)

    # find contours
    contours, hierarchy = cv.findContours(mask_sm,cv.RETR_TREE,cv.CHAIN_APPROX_SIMPLE)

    # draw all contours in blue (for analysis)
    cv.drawContours(frame, contours, -1, (255,0,0), 3)
    

Step 4:

    # find the area enclosed by each contour
    contour_size = [cv.contourArea(c) for c in contours]

    # box the largest contour, if any contour was found
    if contour_size:
        largest = contours[int(np.argmax(contour_size))]
        x,y,w,h = cv.boundingRect(largest)
        cv.rectangle(frame,(x,y),(x+w,y+h),(0,0,255),2)
        cv.rectangle(mask_sm,(x,y),(x+w,y+h),(0,0,255),2)
            
    # put the frame in the window 
    cv.imshow('frame',frame)
    cv.imshow('mask',mask_sm)
    cv.imshow('res',res)

Notice that I added a median filter in Step 3. The median filter is commonly used in image processing to smooth out salt-and-pepper noise. Depending on the quality of your webcam, this step might be unnecessary. However, my webcam generated a lot of noise when I was working on this project, and filtering it out helped prevent the noise from interfering with the object recognition task.

Here is a clip of the result:

https://user-images.githubusercontent.com/107218842/217465337-28eff54a-dd5d-46f4-934b-62a8b8111def.mp4

As you can see in the video, the findContours() function still picked up noise from the lighting conditions of the room as well as from the camera. However, choosing the contour with the largest area still allowed us to track the blue card in my hand.

Conclusions

Congratulations! You have just built a basic machine vision program with capabilities of object detection and image segmentation. Hopefully it wasn't too hard for you to follow along.

Think briefly about the program. What are its drawbacks? Did it successfully detect the color you wanted to trace? Were there any difficulties you encountered that weren't covered in this tutorial?

An apparent flaw of the program is that it can only detect a mono-colored object whose color is different from its background. The program may fail if you were to use it to trace an Area 51 escapee disguised in front of a green screen. How can we improve this? How can we design the program such that an object can be detected under any conditions?

Well, that is a question many brilliant researchers are working tirelessly to solve. The modern solution to machine vision often incorporates deep learning neural networks trained on millions of images to be able to detect and categorize objects within an image. But even machine learning is no more than mathematics. Can our camera truly "see"? Does it really know the difference between a cat and a dog?

Perhaps one day it will. Perhaps you will create the first sentient camera. Perhaps this tutorial will inspire you to embark on such a daunting journey.

Thanks for reading. I hope you have learned something useful.

Here is the entire code:

#code sources:
#https://code.likeagirl.io/finding-dominant-colour-on-an-image-b4e075f98097
#https://docs.opencv.org/4.x/df/d9d/tutorial_py_colorspaces.html

#improvements:
# invert mask for contouring 
# add in contour tracking
# blur mask to eliminate noise from camera
# add in threshold for contour area to select correct contour of the desired object 
import cv2 as cv
import numpy as np


cap = cv.VideoCapture(0)

# for recording purposes
'''
width= int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
height= int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))

writer= cv.VideoWriter('detection.mp4', cv.VideoWriter_fourcc(*'DIVX'), 20, (width,height))
'''

while True:
    # Take each frame; stop if the camera did not return one
    ok, frame = cap.read()
    if not ok:
        break
    # Convert BGR to HSV
    hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
    # define range of blue color in HSV
    lower_blue = np.array([80,50,50])
    upper_blue = np.array([130,255,255])
    # Threshold the HSV image to get only blue colors
    mask = cv.inRange(hsv, lower_blue, upper_blue)
    # Bitwise-AND mask and original image
    res = cv.bitwise_and(frame,frame, mask= mask)
    
    # use median filter to filter out the camera noise 
    mask_sm = cv.medianBlur(mask, 13)

    # find contours
    contours, hierarchy = cv.findContours(mask_sm,cv.RETR_TREE,cv.CHAIN_APPROX_SIMPLE)

    # draw all contours in blue (for analysis)
    cv.drawContours(frame, contours, -1, (255,0,0), 3)
    
    # find the area enclosed by each contour
    contour_size = [cv.contourArea(c) for c in contours]

    # box the largest contour, if any contour was found
    if contour_size:
        largest = contours[int(np.argmax(contour_size))]
        x,y,w,h = cv.boundingRect(largest)
        cv.rectangle(frame,(x,y),(x+w,y+h),(0,0,255),2)
        cv.rectangle(mask_sm,(x,y),(x+w,y+h),(0,0,255),2)
            
    # put the frame in the window 
    cv.imshow('frame',frame)
    cv.imshow('mask',mask_sm)
    cv.imshow('res',res)
    
    #writer.write(frame)
    

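    # exit the loop when the Esc key (code 27) is pressed
    k = cv.waitKey(5) & 0xFF
    if k == 27:
        break
cap.release()
# writer.release()  # uncomment together with the recording block above
cv.destroyAllWindows()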
