# Data Labeling Specification
Recycling is vital for a sustainable and clean environment. Humans use about 481.6 billion plastic bottles every year, and only about 9% of them are recycled. In this project, we look at how computer vision can be used to identify different types of plastic, glass, and metal in household waste. The first step is to examine the dataset and label it.
- The dataset spans three main classes: glass, plastic, and metal
- Subclasses should be specified according to the type and color of each main class
Note: plastics are divided into 7 resin types: PET, HDPE, PVC, LDPE, PP, PS, and Other.
## How to label data?
Before diving into the labeling system, we should take a look at the dataset and its requirements:
- The dataset needs to include deformed and broken objects, as well as dirty ones.
- Images should be taken from different angles, especially when objects are deformed or broken.
- Brightness and contrast should be regulated.
- To enhance the dataset, augmentation is a must (a minimal sketch follows this list).
- All the data should have the same size and resolution.
- Pictures should be taken with the object placed on a dark background.
- To avoid confusing objects with a bright background, we can use one or more of the following methods:
  - Create a tight bounding box for every object
  - Omit the background
  - Mask the object
  - Edge detection
- The initial form of the data is an image, but it should be converted to other required formats depending on the detection method.
- We should collect a balanced and diverse dataset for each category.
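Since augmentation and brightness/contrast regulation are requirements, here is a minimal sketch of such a pipeline using torchvision; the transform choices, parameters, and file names are illustrative assumptions, not the project's actual pipeline.

```python
# Minimal augmentation sketch (assumed transforms, not the project's exact pipeline).
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.Resize((640, 640)),                          # uniform size and resolution
    transforms.RandomRotation(degrees=15),                  # simulate different shooting angles
    transforms.ColorJitter(brightness=0.2, contrast=0.2),   # regulate brightness and contrast
])

image = Image.open("bottle.jpg")        # hypothetical input file
augmented = augment(image)
augmented.save("bottle_augmented.jpg")
```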
There are two common methods for labeling:
- Manual
- Automatic
## Manual labeling system
Here we utilize Label Studio (https://labelstud.io/), one of the most flexible data annotation tools for labeling and exploring multiple types of data. It lets us perform different types of labeling with many data formats.
Let's see how we can label our dataset:
- Install Label Studio (https://labelstud.io/guide/install.html).
- Start Label Studio with the `label-studio` command.
- Sign up with an email address and a password that you create.
- Click to create a project and start labeling data.
- Click Data Import and upload the data files that you want to use.
- Click Labeling Setup, choose a template, and customize the label names for your use case.
There are different annotation options recommended on the site, including semantic segmentation with polygons, masking, bounding boxes, and so on.
- Click Save to save your project (a programmatic alternative to these steps is sketched below).
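The same setup can also be done programmatically. Below is a minimal sketch using the label-studio-sdk package; the server URL, API key, and task URL are assumptions for illustration, not the project's actual values. The label names come from the three main classes above.

```python
# Programmatic project setup sketch using label-studio-sdk
# (URL, API key, and task URL are illustrative assumptions).
from label_studio_sdk import Client

ls = Client(url="http://localhost:8080", api_key="YOUR_API_KEY")

# Bounding-box labeling config for the three main classes
label_config = """
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="glass"/>
    <Label value="plastic"/>
    <Label value="metal"/>
  </RectangleLabels>
</View>
"""

project = ls.start_project(title="RecycleIT labeling", label_config=label_config)
project.import_tasks([{"image": "https://example.com/bottle.jpg"}])  # hypothetical task
```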
As the picture above shows, some of the template icons are not available. This means we should consider our project and its requirements and then choose an appropriate labeling method.
Here we annotate a bottle with Polygon annotation:
To know how to use this labeling system, take a look at this guide: https://www.youtube.com/watch?v=UUP_omOSKuc
Finally, we take a quick look at some export formats in Label Studio:
- COCO (image segmentation, object detection) [x_min, y_min, width, height]
Popular machine learning format used by the COCO dataset for object detection and image segmentation tasks with polygons and rectangles.
- Pascal VOC XML (image segmentation, object detection) [x_min, y_min, x_max, y_max]
Popular XML format used for object detection and polygon image segmentation tasks.
- YOLO format (image segmentation, object detection) [x_center, y_center, width, height]
A popular format in which one TXT file is created per image. Each TXT file contains the annotations for the corresponding image: the object class and the object coordinates (center, width, and height); see the conversion sketch after this list.
- Brush labels to NumPy (image segmentation)
Exports brush labels as NumPy 2D arrays, with each label output as one image.
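To make the coordinate conventions concrete, here is a small sketch converting a normalized YOLO box to Pascal VOC pixel coordinates; the image size and box values are arbitrary examples.

```python
# YOLO [x_center, y_center, width, height] (normalized to [0, 1]) ->
# Pascal VOC [x_min, y_min, x_max, y_max] (pixels).
def yolo_to_voc(box, img_w, img_h):
    x_c, y_c, w, h = box
    x_min = (x_c - w / 2) * img_w
    y_min = (y_c - h / 2) * img_h
    x_max = (x_c + w / 2) * img_w
    y_max = (y_c + h / 2) * img_h
    return [x_min, y_min, x_max, y_max]

# Example: a box centered in a 640x480 image covering half of each dimension
print(yolo_to_voc([0.5, 0.5, 0.5, 0.5], 640, 480))  # [160.0, 120.0, 480.0, 360.0]
```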
Apart from the manual method, we prioritize automatic labeling: for large datasets, manual labeling is clearly time-consuming and expensive, so it is advisable to establish and develop methods for labeling data automatically.
## Automatic labeling
The most difficult challenge in waste management is the complex, non-unified guidelines regarding segregation rules. However, in recent years, machine learning (ML) based systems that can support or fully cover sorting processes have been implemented, accelerating this procedure. In this project, a state-of-the-art deep learning (DL) model is implemented to assign the proper class based on a photo, and the garbage is moved to the appropriate container underneath.
There are different detection models, but our aim is to find an extremely precise model, with 98% accuracy for PET and 95% for the other groups. In this regard, the YOLOv5 family, without a doubt, stands out from other detection models.
Here is a plot comparing YOLOv5 and EfficientDet in terms of speed and accuracy:
## How does automatic labeling work?
Anyone who has worked with object detection knows that the labeling/annotation process is the hardest part. It is not difficult because it is complex, like training a model, but because the process is very tedious and time-consuming. Considering this annoying bottleneck, we have created a simple (yet effective) YOLOv5 auto-annotation tool based on both PyTorch Hub and detect.py to make this process easier. Although it doesn't completely replace the manual annotation process, it helps us save a lot of time.
The auto-annotation tool is based on the idea of a semi-supervised architecture, where a model trained with a small amount of labeled data is used to produce labels for the rest of the dataset. As simple as that, the tool uses an initial, simplified object detection model to generate the TXT files with the image annotations in the YOLO format (class_id, x_center, y_center, width, height). This process can be illustrated by the following image:
The label and bounding box of each image are then checked on the Roboflow site (https://roboflow.com/).
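As a sketch of this idea (not the tool's exact implementation), the following loads a YOLOv5 model through PyTorch Hub and writes YOLO-format TXT files for a folder of unlabeled images; the weight file, folder names, and confidence threshold are illustrative assumptions.

```python
# Semi-supervised auto-annotation sketch via PyTorch Hub
# (weights, paths, and threshold are illustrative assumptions).
from pathlib import Path
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")  # model trained on the small labeled set
model.conf = 0.5  # assumed confidence threshold for keeping predictions

image_dir = Path("unlabeled_images")
label_dir = Path("auto_labels")
label_dir.mkdir(exist_ok=True)

for image_path in image_dir.glob("*.jpg"):
    results = model(str(image_path))
    # xywhn[0]: rows of [x_center, y_center, width, height, conf, class],
    # with coordinates normalized to [0, 1] as the YOLO txt format expects
    lines = [
        f"{int(cls)} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"
        for x, y, w, h, conf, cls in results.xywhn[0].tolist()
    ]
    (label_dir / f"{image_path.stem}.txt").write_text("\n".join(lines))
```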
## Train on the initial dataset
The first dataset contains 4817 images from the Drinking Waste Classification dataset on Kaggle, with four classes (Alu Can, Glass, HDPE, and PET). During training, YOLOv5 augmentations are applied to the data to increase its amount and improve detection. After 52 epochs on 3854 training images (80% of the data) with 1% background images, we achieve the following results on the 963 validation images:
The above results are obtained from YOLOv5 models pretrained on the COCO dataset with 28 classes.
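For reference, such a run can be launched through the YOLOv5 repository's Python entry point; this is a minimal sketch assuming the ultralytics/yolov5 repo is cloned and on the path, and the dataset config name is hypothetical.

```python
# Training sketch using the YOLOv5 repository's train.run() entry point
# (dataset config name and image size are illustrative assumptions).
import train  # yolov5/train.py, available when run from the cloned repo

train.run(
    data="drinking_waste.yaml",  # hypothetical config for the 4-class Kaggle dataset
    weights="yolov5s.pt",        # COCO-pretrained starting weights
    epochs=52,                   # as in the experiment above
    imgsz=640,
)
```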
Another experiment was also conducted on this dataset under the same conditions, except that the weights of the one-class DeepPlastic model were used for transfer learning.
Following that, the weights from the above training are used to detect unseen data collected from the Telegram bot. All the images go through different augmentations, but we do not change the color, because it is an important cue for identifying objects.
Some of the results are as follows:
## Model comparison
- Number of detected images
This shows considerable growth in the last two models.
- Additionally, the table below compares the models in terms of the percentage of IoU and correct detections for 7 randomly selected images (a minimal IoU sketch follows).
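For reference, the IoU (intersection over union) of a predicted box and a ground-truth box, both in Pascal VOC [x_min, y_min, x_max, y_max] form, can be computed as in this minimal sketch:

```python
# IoU between two boxes in [x_min, y_min, x_max, y_max] form.
def iou(box_a, box_b):
    # Intersection rectangle (empty if the boxes do not overlap)
    x_min = max(box_a[0], box_b[0])
    y_min = max(box_a[1], box_b[1])
    x_max = min(box_a[2], box_b[2])
    y_max = min(box_a[3], box_b[3])
    inter = max(0.0, x_max - x_min) * max(0.0, y_max - y_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Example: two 100x100 boxes overlapping in a 50x50 region
print(iou([0, 0, 100, 100], [50, 50, 150, 150]))  # 2500 / 17500 = 0.142857...
```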