Detection
1. Intersection Over Union
Intersection over Union (IoU) is a metric that quantifies the degree of overlap between two regions, with values ranging from 0 to 1. The IoU metric evaluates the correctness of a prediction: by thresholding it, that is, picking a threshold (α, say), we can decide whether a prediction is a True Positive, False Positive, or False Negative.
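For example, a minimal Python sketch of IoU for two axis-aligned boxes in (x1, y1, x2, y2) corner format (the box format here is an assumption; center-based YOLO boxes would need converting first):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap at all.
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```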
2. Confusion Matrix
In object detection, the correctness of a prediction (TP, FP, or FN) is decided with the help of the IoU threshold, whereas in image segmentation it is decided by comparing predicted pixels against the ground-truth pixels, i.e. the known, annotated objects.
Figure: object detection predictions based on IoU.
Figure: image segmentation predictions based on the ground-truth mask.
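A hedged sketch of how single-class detections could be split into TP/FP/FN at an IoU threshold α, using the iou() helper above (greedy matching is a simplification; real evaluators such as the COCO API also handle confidence sorting and per-class bookkeeping):

```python
def match_detections(preds, gts, alpha=0.5):
    """Greedy TP/FP/FN counting for one image and one class.

    preds: predicted boxes, sorted by descending confidence.
    gts:   ground-truth boxes.
    """
    matched = set()
    tp = fp = 0
    for p in preds:
        # Find the best still-unmatched ground-truth box for this prediction.
        best_iou, best_j = 0.0, None
        for j, g in enumerate(gts):
            if j in matched:
                continue
            overlap = iou(p, g)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= alpha:
            matched.add(best_j)   # True Positive: enough overlap, GT consumed
            tp += 1
        else:
            fp += 1               # False Positive: no sufficiently overlapping GT left
    fn = len(gts) - len(matched)  # False Negatives: GT boxes never matched
    return tp, fp, fn
```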
3. Precision
Precision measures the fraction of predictions that are correct:
P = TP / (TP + FP)
4. Recall
Recall measures the fraction of ground-truth objects that are detected:
R = TP / (TP + FN)
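A worked example with hypothetical counts (say a matcher like the sketch above returned TP = 8, FP = 2, FN = 4):

```python
tp, fp, fn = 8, 2, 4  # hypothetical counts for one class

precision = tp / (tp + fp)  # 8 / 10 = 0.80: share of predictions that are correct
recall = tp / (tp + fn)     # 8 / 12 ≈ 0.67: share of ground-truth objects found
```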
5. Average Precision (AP)
AP is the area under the precision-recall curve. AP@α means that AP is evaluated at an IoU threshold of α. AP is calculated individually for each class.
6. Mean Average Precision (mAP)
Mean Average Precision (mAP) is the average of AP over all detected classes:
mAP = (1/n) * sum(AP), where n is the number of classes.
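A sketch of AP as the area under the precision-recall curve and mAP as the mean over classes. Trapezoidal integration and the per-class AP values are illustrative assumptions; COCO-style evaluation instead interpolates precision at fixed recall points:

```python
def trapezoid_area(y, x):
    """Trapezoidal-rule area under the curve y(x); x must be increasing."""
    return sum((y[i] + y[i + 1]) * (x[i + 1] - x[i]) / 2.0
               for i in range(len(x) - 1))

def average_precision(precisions, recalls):
    """AP = area under the precision-recall curve for one class."""
    return trapezoid_area(precisions, recalls)

# mAP = (1/n) * sum(AP) over all n classes (hypothetical per-class APs).
ap_per_class = {"glass": 0.62, "plastic": 0.55, "metal": 0.71}
m_ap = sum(ap_per_class.values()) / len(ap_per_class)
print(f"mAP = {m_ap:.3f}")  # 0.627
```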
7. Average Recall (AR)
Average recall (AR) is also a numerical metric that can be used to compare detector performance. In essence, AR is the recall averaged over all IoU ∈ [0.5, 1.0] and can be computed as two times the area under the recall-IoU curve:
AR = 2 ∫_{0.5}^{1.0} recall(o) do, where o is the IoU.
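A numerical version of the AR integral, sampling recall at IoU thresholds in [0.5, 1.0]; recall_at is a hypothetical callback that returns recall at a given IoU threshold, and the factor 2 normalizes by the 0.5-wide interval:

```python
def average_recall(recall_at, num_points=11):
    """AR = 2 * area under the recall-IoU curve, IoU in [0.5, 1.0]."""
    step = 0.5 / (num_points - 1)
    thresholds = [0.5 + i * step for i in range(num_points)]
    recalls = [recall_at(t) for t in thresholds]
    # Trapezoidal rule, then the factor 2 rescales the 0.5-wide interval.
    area = sum((recalls[i] + recalls[i + 1]) * step / 2.0
               for i in range(num_points - 1))
    return 2.0 * area
```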
8. Mean Average Recall (mAR)
Mean average recall is defined as the mean of AR across all K classes.
9. Inference Time
The inference time is how long one forward propagation takes. To get the number of frames per second: FPS = 1 / inference time.
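A hedged PyTorch timing sketch; YOLOv5s via torch.hub stands in for any detector here, and the warm-up passes plus torch.cuda.synchronize() keep the measurement honest on GPU:

```python
import time
import torch

# Any detector works here; YOLOv5s via torch.hub is a concrete example
# (downloads the pretrained weights on first use).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
model.eval()
x = torch.zeros(1, 3, 640, 640)  # dummy 640x640 input image

with torch.no_grad():
    for _ in range(10):               # warm-up passes (lazy init, caching)
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()      # make sure warm-up work has finished
    start = time.perf_counter()
    n = 100
    for _ in range(n):                # timed forward passes
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()      # wait for queued GPU work to finish
    inference_time = (time.perf_counter() - start) / n

print(f"{inference_time * 1000:.1f} ms per forward pass, "
      f"{1.0 / inference_time:.1f} FPS")
```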
State-of-the-art object detection methods can be categorized into two main types: one-stage and two-stage object detectors. In general, deep-learning-based object detectors extract features from the input image or video frame.
An object detector solves two subsequent tasks:
- Find an arbitrary number of objects (possibly even zero).
- Classify every single object and estimate its size with a bounding box.
1. Two-stage detectors
- The two-stage architecture involves (1) object region proposal with conventional Computer Vision methods or deep networks, followed by (2) object classification based on features extracted from the proposed region with bounding-box regression.
- Two-stage methods achieve the highest detection accuracy but are typically slower. Because of the many inference steps per image, the performance (frames per second) is not as good as one-stage detectors.
- Various two-stage detectors include the region-based convolutional neural network (R-CNN), with evolutions such as Faster R-CNN and Mask R-CNN. The latest evolution is the granulated R-CNN (G-RCNN).
2. One-stage detectors
- One-stage object detectors prioritize inference speed and are very fast, but not as good at recognizing irregularly shaped objects or groups of small objects.
- The most popular one-stage detectors include YOLO, SSD, and RetinaNet. The latest real-time detectors are Scaled-YOLOv4 (2020) and YOLOR (2021). View the benchmark comparisons below.
- The main advantage of single-stage detectors is that these algorithms are generally faster than multi-stage detectors and structurally simpler; a usage example follows below.
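As a concrete example, a one-stage YOLOv5 detector can be run in a few lines via torch.hub (the sample image URL comes from the YOLOv5 docs):

```python
import torch

# Load a pretrained one-stage detector (downloads weights on first use).
model = torch.hub.load("ultralytics/yolov5", "yolov5s")

# A single forward pass returns boxes, confidences, and class labels together.
results = model("https://ultralytics.com/images/zidane.jpg")
results.print()                    # human-readable summary of detections
boxes = results.pandas().xyxy[0]   # detections as a pandas DataFrame
```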
3. Comparison of different types of detection algorithms
Benchmark figures (by https://github.com/WongKinYiu) compared the models on: frames per second vs. accuracy, models with the lowest inference time, models with the highest mAP, two-stage vs. one-stage detection models, memory usage, and GPU time. The tables below summarize the numbers.
Model | Test Size | AP (test) | AP50 (test) | AP75 (test) | Batch 1 FPS | Batch 32 average time |
---|---|---|---|---|---|---|
YOLOv7 | 640 | 51.4% | 69.7% | 55.9% | 161 fps | 2.8 ms |
YOLOv7-X | 640 | 53.1% | 71.2% | 57.8% | 114 fps | 4.3 ms |
YOLOv7-W6 | 1280 | 54.9% | 72.6% | 60.1% | 84 fps | 7.6 ms |
YOLOv7-E6 | 1280 | 56.0% | 73.5% | 61.2% | 56 fps | 12.3 ms |
YOLOv7-D6 | 1280 | 56.6% | 74.0% | 61.8% | 44 fps | 15.0 ms |
YOLOv7-E6E | 1280 | 56.8% | 74.4% | 62.1% | 36 fps | 18.7 ms |
Model | Size (pixels) | mAP@0.5:0.95 (val) | mAP@0.5 (val) | Speed CPU b1 (ms) | Speed V100 b1 (ms) | Speed V100 b32 (ms) | Params (M) | FLOPs @640 (B) |
---|---|---|---|---|---|---|---|---|
YOLOv5n | 640 | 28.0 | 45.7 | 45 | 6.3 | 0.6 | 1.9 | 4.5 |
YOLOv5s | 640 | 37.4 | 56.8 | 98 | 6.4 | 0.9 | 7.2 | 16.5 |
YOLOv5m | 640 | 45.4 | 64.1 | 224 | 8.2 | 1.7 | 21.2 | 49.0 |
YOLOv5l | 640 | 49.0 | 67.3 | 430 | 10.1 | 2.7 | 46.5 | 109.1 |
YOLOv5x | 640 | 50.7 | 68.9 | 766 | 12.1 | 4.8 | 86.7 | 205.7 |
YOLOv5n6 | 1280 | 36.0 | 54.4 | 153 | 8.1 | 2.1 | 3.2 | 4.6 |
YOLOv5s6 | 1280 | 44.8 | 63.7 | 385 | 8.2 | 3.6 | 12.6 | 16.8 |
YOLOv5m6 | 1280 | 51.3 | 69.3 | 887 | 11.1 | 6.8 | 35.7 | 50.0 |
YOLOv5l6 | 1280 | 53.7 | 71.3 | 1784 | 15.8 | 10.5 | 76.8 | 111.4 |
YOLOv5x6 + TTA | 1280 / 1536 | 55.0 / 55.8 | 72.7 / 72.7 | 3136 / - | 26.2 / - | 19.4 / - | 140.7 / - | 209.8 / - |
Model | Test Size | AP (val) | AP50 (val) | AP75 (val) | APS (val) | APM (val) | APL (val) | FLOPs | Weights | Batch 1 throughput |
---|---|---|---|---|---|---|---|---|---|---|
YOLOR-P6 | 1280 | 52.5% | 70.6% | 57.4% | 37.4% | 57.3% | 65.2% | 326G | yolor-p6.pt | 76 fps |
YOLOR-W6 | 1280 | 54.0% | 72.1% | 59.1% | 38.1% | 58.8% | 67.0% | 454G | yolor-w6.pt | 66 fps |
YOLOR-E6 | 1280 | 54.6% | 72.5% | 59.8% | 39.9% | 59.0% | 67.9% | 684G | yolor-e6.pt | 45 fps |
YOLOR-D6 | 1280 | 55.4% | 73.5% | 60.6% | 40.4% | 60.1% | 68.7% | 937G | yolor-d6.pt | 34 fps |
YOLOR-S | 640 | 40.7% | 59.8% | 44.2% | 24.3% | 45.7% | 53.6% | 21G | ||
YOLOR-SDWT | 640 | 40.6% | 59.4% | 43.8% | 23.4% | 45.8% | 53.4% | 21G | ||
YOLOR-S2DWT | 640 | 39.9% | 58.7% | 43.3% | 21.7% | 44.9% | 53.4% | 20G | ||
YOLOR-S3S2D | 640 | 39.3% | 58.2% | 42.4% | 21.3% | 44.6% | 52.6% | 18G | ||
YOLOR-S3DWT | 640 | 39.4% | 58.3% | 42.5% | 21.7% | 44.3% | 53.0% | 18G | ||
YOLOR-S4S2D | 640 | 36.9% | 55.3% | 39.7% | 18.1% | 41.9% | 50.4% | 16G | weights | |
YOLOR-S4DWT | 640 | 37.0% | 55.3% | 39.9% | 18.4% | 41.9% | 51.0% | 16G | weights | |
YOLOv7 and YOLOv5 labels should be in YOLO text format (one .txt file per image, one line per object: class x_center y_center width height, normalized to [0, 1]), and images should be in JPG format.
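Each image gets a .txt label file with the same basename; a minimal sketch of converting a pixel-space corner box to a YOLO label line (the class id and box values are hypothetical):

```python
def to_yolo_line(class_id, box, img_w, img_h):
    """Convert a pixel box (x1, y1, x2, y2) to a YOLO label line.

    YOLO labels are: class x_center y_center width height,
    all normalized to [0, 1] by the image size.
    """
    x1, y1, x2, y2 = box
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# One line per object, e.g. for a 640x480 image:
print(to_yolo_line(0, (100, 120, 300, 360), 640, 480))
# -> "0 0.312500 0.500000 0.312500 0.500000"
```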